Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 14, 2014 at 9:17 PM, Josh Berkus j...@agliodbs.com wrote: On 03/14/2014 06:44 PM, Tomas Vondra wrote: Stupid question - so if I have a json like this: Not a stupid question, actually. In fact, I expect to answer it 400 or 500 times over the lifespan of 9.4. { a : { b : c}} the GIN code indexes {b : c} as a single value? And then takes c and indexes it as a single value too? I don't know that c is indexed separately. Because otherwise I don't understand how the index could be used for queries with @ '{a : {b : c}}' conditions (i.e. path [a,b] with value c). H, if that's how it works, removing the size limit would be certainly more difficult than I thought. Precisely. Hence, the Russian plans for VODKA. Have these plans been shared publicly somewhere? Got a link? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore VODKA
On 03/31/2014 09:34 AM, Robert Haas wrote: On Fri, Mar 14, 2014 at 9:17 PM, Josh Berkus j...@agliodbs.com wrote: Precisely. Hence, the Russian plans for VODKA. Have these plans been shared publicly somewhere? Got a link? Nothing other than the pgCon proposal. Presumably we'll know at pgCon, unless one of them replies to this. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 21.3.2014 08:23, Peter Geoghegan wrote: On Thu, Mar 13, 2014 at 3:39 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I must admit that I'm coming around to the view that jsonb_hash_ops would make a better default. Its performance is superb, and I think there's a strong case to be made for that more than making up for it not supporting all indexable operators - the existence operators just aren't that useful in comparison. I don't think that's how we should choose the default operator class. Wouldn't an operator class supporting wider range of functionality be a better fit, as we don't really know what are the users are going to do? You might be right that existence operators are used less frequently than conditions on values, how big the difference is? And do we gain something by using jsonb_hash_ops by default in the end? Say an application does one '?' query per 100 '@' queries. If the default opclass does not support '?' queries (forcing a seqscan), the total duration may easily be much higher than with the default opclass. I like that jsonb_hash_ops produces smaller indexes (~50% compared to jsonb_ops on the delicious dataset), and that it's faster (2-5x on the simple queries I've tried). But is that worth the risk? Keeping jsonb_ops as the default seems better / safer to me. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Sun, Mar 23, 2014 at 11:10 AM, Tomas Vondra t...@fuzzy.cz wrote: Keeping jsonb_ops as the default seems better / safer to me. That's what I did. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 3:39 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I must admit that I'm coming around to the view that jsonb_hash_ops would make a better default. Its performance is superb, and I think there's a strong case to be made for that more than making up for it not supporting all indexable operators - the existence operators just aren't that useful in comparison. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 21, 2014 at 7:23 AM, Peter Geoghegan p...@heroku.com wrote: I must admit that I'm coming around to the view that jsonb_hash_ops would make a better default. Its performance is superb, and I think there's a strong case to be made for that more than making up for it not supporting all indexable operators - the existence operators just aren't that useful in comparison Is there any \d command that would display a nice list of which operators a given operator class actually supports? It's kind of hard to determine whether a proposed index would actually be useful for your queries without it. -- greg
Re: [HACKERS] jsonb and nested hstore
I've noticed two commits on github. commit b8199ee3c2506ab81b47a0b440363fc90c0d6956 Author: Peter Geoghegan p...@heroku.com Date: Wed Mar 19 02:02:16 2014 -0700 For jsonb_hash_ops, hash less By limiting the GIN entries to the least-nested level, the delicious.com sample JSON dataset index shrinks in size from 382MB to 255MB without any apparent downside. commit 2cea5213dba011625fc0d5c6b447e838080087b1 Author: Peter Geoghegan p...@heroku.com Date: Wed Mar 19 02:13:42 2014 -0700 Revert For jsonb_hash_ops, hash less This might be workable with another approach, but leave it for now. This reverts commit b8199ee3c2506ab81b47a0b440363fc90c0d6956. Besides implementation, what the idea was here? For me, it's impossible to skip any single element, because it's possible for query to include only this element. If we skip that element, we can't answer corresponding query no more. -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 20, 2014 at 5:32 AM, Alexander Korotkov aekorot...@gmail.com wrote: Besides implementation, what the idea was here? For me, it's impossible to skip any single element, because it's possible for query to include only this element. If we skip that element, we can't answer corresponding query no more. This had something to do with an alternative notion of containment. I wouldn't have stuck with such a radical change without consulting you. I reverted it, and am not going to argue for the idea right now. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Sat, Mar 15, 2014 at 1:44 AM, Tomas Vondra t...@fuzzy.cz wrote: Because otherwise I don't understand how the index could be used for queries with @ '{a : {b : c}}' conditions (i.e. path [a,b] with value c). Hm, some experimentation here shows it does indeed work for queries like this and works quite nicely. I agree, this contradicts my explanation so I'll need to poke in this some more to understand how it is that this works so well: explain select j-'tags'-'name' from osm where j @ '{tags:{waterway:dam}}' ; QUERY PLAN Bitmap Heap Scan on osm (cost=139.47..19565.07 rows=6125 width=95) Recheck Cond: (j @ '{tags: {waterway: dam}}'::jsonb) - Bitmap Index Scan on osmj (cost=0.00..137.94 rows=6125 width=0) Index Cond: (j @ '{tags: {waterway: dam}}'::jsonb) Planning time: 0.147 ms (5 rows) stark=# select j-'tags'-'name' from osm where j @ '{tags:{waterway:dam}}' ; ?column? - Alpine Dam Bell Canyon Dam Big Rock Dam Briones Dam Cascade Dam Gordon Valley Dam Kimball Canyon Dam Moore Dam Nicasio Dam Novato Creek Dam Ryland Dam Vasona Dam Warm Springs Dam Crystal Dam (248 rows) Time: 6.126 ms -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 15.3.2014 06:40, Peter Geoghegan wrote: On Fri, Mar 14, 2014 at 6:44 PM, Tomas Vondra t...@fuzzy.cz wrote: Well, depends on how you define useful. With the sample dataset 'delicious' (see Peter's post) I can do this: SELECT doc FROM delicious WHERE doc @ '{title_detail : {value : TheaterMania}}'; with arbitrary paths, and I may create a GIN index to support such queries. And yes, it's much faster than GiST for example (by a factor of 1000). If you know ahead of time the entire nested value you can. So, if you attach some other data to the TheaterMania document, you had better know that too if you hope to write a query like this. You also have to index the entire table, where presumably with a little thought you could get away with a much smaller index. That strikes me as not very useful. Sure, I need to know some basic rules / do assumptions about the structure of the json document. In other words, schemaless databases are difficult to query. For example when storing mail message headers (i.e. the example I've used before), I do know that the json document is rather well structured - it's not nested at all, and all the values are either scalar values (mostly strings), or arrays of scalars. So it looks like this { from : john@example.com, to : [jane@example.com, jack@example.com], ... } So the schema is rather well defined (not the exact keys, but the structure certainly is). Let's say I want to allow arbitrary searches on headers - I can't support that with expression indexes, because there's like a zillion of possible headers and I'd have to create an expression index on each of them separately. But I can support that with a single GIN index ... Yes, the GIN index is quite large (~560MB for a ~1.2GB table). With the default opclass, without an expressional index, 100% of the data from the table appears in the index. Why do you think that's quite large? That wasn't meant as a complaint. I have no problem with the index size (If we can make it smaller in the future, great! But I can live with the current index sizes too.) regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 15.3.2014 02:15, Peter Geoghegan wrote: On Fri, Mar 14, 2014 at 5:10 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm on commit a3115f0d, which is just 2 days old, so I suppose this was not fixed yet. Try merging the feature branch now, which will get you commit 16923d, which you're missing. That was an open item for a while, which I only got around to fixing a few days ago. Ok, that seems to be working fine. T. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 13 Březen 2014, 23:39, Peter Geoghegan wrote: On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I have yet to be convinced that if you hit the GIN limitation it's an indication of anything other than that you need to reconsider your indexing choices (how often have we heard that complaint of GIN before in practice?). Even if you don't hit the limitation directly, with I've never used GIN with anything else than values that built-in full-text (tsvector), pg_trgm or points, and I suspect that's the case with most other users. All those types have naturally limited size (e.g. words tend to have very limited length, unless you're Maori, but even there the longest name is just 85 characters [1]). The only place in (core|contrib) where I'd expect this kind of issues is probably intarray, but it's arguably less frequently used than tsvector/pg_trgm for example. So ISTM this is the main reason why we don't see more complaints about the GIN size limit. I expect that to change with json + index all approach. something like jsonb_hash_ops you're still hashing a large nested structure, very probably uselessly. Are you really going to look for an exact match to an elaborate nested structure? I would think, probably not. What I find (very) useful is queries that look like this: SELECT if FROM json_table WHERE json_value @ '{a : {b : {c : 3}}}'; or (without the @ operator) like this: SELECT if FROM json_table WHERE json_value # ARRAY['a', 'b', 'c'] = '3'; or something like that ... Now, as Alexander says, there might be a role for another (jsonb_hash_ops) opclass that separately indexes values only. I still think that by far the simplest solution is to use expressional indexes, because we index key values and array element values indifferently. Of course, nothing we have here precludes the development of such an opclass. Maybe. I don't have much insight into ho GIN works / what is possible. But I think we should avoid having large number of opclasses, each supporting a small fraction of use cases. If we could keep the two we have right now, that'd be nice. regards Tomas [1] http://en.wikipedia.org/wiki/List_of_long_place_names -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
VODKA index will have no lenght limitation. On Fri, Mar 14, 2014 at 3:07 PM, Tomas Vondra t...@fuzzy.cz wrote: On 13 Březen 2014, 23:39, Peter Geoghegan wrote: On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I have yet to be convinced that if you hit the GIN limitation it's an indication of anything other than that you need to reconsider your indexing choices (how often have we heard that complaint of GIN before in practice?). Even if you don't hit the limitation directly, with I've never used GIN with anything else than values that built-in full-text (tsvector), pg_trgm or points, and I suspect that's the case with most other users. All those types have naturally limited size (e.g. words tend to have very limited length, unless you're Maori, but even there the longest name is just 85 characters [1]). The only place in (core|contrib) where I'd expect this kind of issues is probably intarray, but it's arguably less frequently used than tsvector/pg_trgm for example. So ISTM this is the main reason why we don't see more complaints about the GIN size limit. I expect that to change with json + index all approach. something like jsonb_hash_ops you're still hashing a large nested structure, very probably uselessly. Are you really going to look for an exact match to an elaborate nested structure? I would think, probably not. What I find (very) useful is queries that look like this: SELECT if FROM json_table WHERE json_value @ '{a : {b : {c : 3}}}'; or (without the @ operator) like this: SELECT if FROM json_table WHERE json_value # ARRAY['a', 'b', 'c'] = '3'; or something like that ... Now, as Alexander says, there might be a role for another (jsonb_hash_ops) opclass that separately indexes values only. I still think that by far the simplest solution is to use expressional indexes, because we index key values and array element values indifferently. Of course, nothing we have here precludes the development of such an opclass. Maybe. I don't have much insight into ho GIN works / what is possible. But I think we should avoid having large number of opclasses, each supporting a small fraction of use cases. If we could keep the two we have right now, that'd be nice. regards Tomas [1] http://en.wikipedia.org/wiki/List_of_long_place_names -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/14/2014 04:52 AM, Oleg Bartunov wrote: VODKA index will have no lenght limitation. Yeah, so I think we go with what we have, and tell people if you're hitting these length issues, wait for 9.5, where they will be fixed. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
9.5 may too optimistic :) On Fri, Mar 14, 2014 at 11:18 PM, Josh Berkus j...@agliodbs.com wrote: On 03/14/2014 04:52 AM, Oleg Bartunov wrote: VODKA index will have no lenght limitation. Yeah, so I think we go with what we have, and tell people if you're hitting these length issues, wait for 9.5, where they will be fixed. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 15/03/14 08:45, Oleg Bartunov wrote: 9.5 may too optimistic :) On Fri, Mar 14, 2014 at 11:18 PM, Josh Berkus j...@agliodbs.com wrote: On 03/14/2014 04:52 AM, Oleg Bartunov wrote: VODKA index will have no lenght limitation. Yeah, so I think we go with what we have, and tell people if you're hitting these length issues, wait for 9.5, where they will be fixed. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com No tell them to wait for Postgres 12.3.42 - the version that is totally bug free implements parallel processing of individual queries! :-) (With apologies to Douglas Adams) Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/14/2014 12:45 PM, Oleg Bartunov wrote: 9.5 may too optimistic :) Nonsense, you, Teodor and Alexander are geniuses. It can't possibly take you more than a year. ;-) -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 14.3.2014 20:18, Josh Berkus wrote: On 03/14/2014 04:52 AM, Oleg Bartunov wrote: VODKA index will have no lenght limitation. Yeah, so I think we go with what we have, and tell people if you're hitting these length issues, wait for 9.5, where they will be fixed. VODKA may be great, but I haven't seen a single line of code for that yet. And given the response from Oleg, 9.5 seems ambitious. I'm not awfully familiar with the GIN code, but based on Alexander's feedback I presume fixing the GIN length limit (or rather removing it, as it's a feature, not a bug) is quite straightforward. Why not to at least consider that for 9.4, unless it turns more complex than expected? Don't get me wrong - I'm aware it's quite late in the last commitfest, and if it's deemed unacceptable / endandering 9.4 release, I'm not going to say a word. But if it's a simple patch ... regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 14, 2014 at 2:21 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm not awfully familiar with the GIN code, but based on Alexander's feedback I presume fixing the GIN length limit (or rather removing it, as it's a feature, not a bug) is quite straightforward. Why not to at least consider that for 9.4, unless it turns more complex than expected? Alexander said nothing about removing that limitation, or if he did I missed it. Which, as I said, I don't consider to be much of a limitation, because indexing the whole nested value doesn't mean it can satisfy a query on some more nested subset of an indexed value datum (i.e. a value in the sense of a value in a key/value pair). Alexander mentioned just indexing keys (object keys, or equivalently array elements at the jsonb level), which is a reasonable thing, but can be worked on later. I don't have much interest in working on making it possible to index elaborate nested values in key/value pairs, which is what you're suggesting if I've understood correctly. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
For the benefit of anyone that would like to try the patch out, I make available a custom format dump of some delicious sample data. I can query the sample data as follows on my local installation: [local]/jsondata=# select count(*) from delicious ; count - 1079399 (1 row) [local]/jsondata=# \dt+ delicious List of relations Schema | Name| Type | Owner | Size | Description +---+---+---+-+- public | delicious | table | pg| 1174 MB | (1 row) It's available from: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/jsondata.dump -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 14.3.2014 22:54, Peter Geoghegan wrote: On Fri, Mar 14, 2014 at 2:21 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm not awfully familiar with the GIN code, but based on Alexander's feedback I presume fixing the GIN length limit (or rather removing it, as it's a feature, not a bug) is quite straightforward. Why not to at least consider that for 9.4, unless it turns more complex than expected? Alexander said nothing about removing that limitation, or if he did I missed it. Which, as I said, I don't consider to be much of a Sure he did, see this: http://www.postgresql.org/message-id/capphfds4xmg5zop+1ctrrqnm6wxhh2a7j11nnjeosa76uow...@mail.gmail.com Although it doesn't mention how complex change it would be. limitation, because indexing the whole nested value doesn't mean it can satisfy a query on some more nested subset of an indexed value datum (i.e. a value in the sense of a value in a key/value pair). OK, I'm getting lost in the nested stuff. The trouble I'm running into are rather unlerated to nesting. For example indexing this fails if the string is sufficiently long (~1350B if random, more if compressible). {key : ... string ...} How's that related to nesting? Anyway, I'm not talking about exact matches on subtrees. I'm talking about queries like this: SELECT doc FROM delicious WHERE doc @ '{title_detail : {value : TheaterMania}}'; which does exactly the same thing like this query: SELECT doc FROM delicious WHERE doc-'title_detail'-'value' = 'TheaterMania'; Except that the first query can use a GIN index created like this: CREATE INDEX delicious_idx ON delicious USING GIN (doc); while the latter does sequential scan. It can use a GiST index too, but it takes 140ms with GiST and only ~0.3ms with GIN. Big difference. Alexander mentioned just indexing keys (object keys, or equivalently array elements at the jsonb level), which is a reasonable thing, but can be worked on later. I don't have much interest in working on making it possible to index elaborate nested values in key/value pairs, which is what you're suggesting if I've understood correctly. I never asked for indexing elaborate nested values in key/value pairs. All I'm asking for is indexing of json values containing long strings. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 14.3.2014 23:06, Peter Geoghegan wrote: For the benefit of anyone that would like to try the patch out, I make available a custom format dump of some delicious sample data. I can query the sample data as follows on my local installation: [local]/jsondata=# select count(*) from delicious ; count - 1079399 (1 row) [local]/jsondata=# \dt+ delicious List of relations Schema | Name| Type | Owner | Size | Description +---+---+---+-+- public | delicious | table | pg| 1174 MB | (1 row) It's available from: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/jsondata.dump Thanks. I've been doing some simple queries on this dataset and ISTM there's a memory leak somewhere in the json code (i.e. something is probably using a wrong memory context), because this query: SELECT doc-'title_detail'-'value', COUNT(*) FROM delicious GROUP BY 1; results in this: PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND 8231 tomas 20 0 5987520 4,645g 6136 R 95,4 60,4 0:37.54 postgres: tomas delicious [local] I have shared_buffers=1GB and work_mem=64MB, so 4.6GB seems a bit too much 4.6GB. Actually it grows even further, and then OOM jumps in and kills the backend like this: [ 9227.318998] Out of memory: Kill process 8159 (postgres) score 595 or sacrifice child [ 9227.319000] Killed process 8159 (postgres) total-vm:5920272kB, anon-rss:4791568kB, file-rss:6192kB I'm on commit a3115f0d, which is just 2 days old, so I suppose this was not fixed yet. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 2014-03-14 22:21:18 +0100, Tomas Vondra wrote: Don't get me wrong - I'm aware it's quite late in the last commitfest, and if it's deemed unacceptable / endandering 9.4 release, I'm not going to say a word. But if it's a simple patch ... IMNSHO there's no bloody chance for such an addition at this point of the cycle. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 14, 2014 at 9:21 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm not awfully familiar with the GIN code, but based on Alexander's feedback I presume fixing the GIN length limit (or rather removing it, as it's a feature, not a bug) is quite straightforward. Why not to at least consider that for 9.4, unless it turns more complex than expected? Don't get me wrong - I'm aware it's quite late in the last commitfest, and if it's deemed unacceptable / endandering 9.4 release, I'm not going to say a word. But if it's a simple patch ... Well I think the bigger picture is that the cases were we're getting this error it's because we're expecting too much from the GIN opclass. It's trying to index entire json objects as individual values which isn't really very useful. We're unlikely to go querying for rows where the value of a given key is a specific json object. As I understand it Peter's right that in its current form the GIN opclass is only useful if you use it on an expression index on specific pieces of your json which are traditional non-nested hash tables. Or I suppose if you're really only concerned with the ? operator which looks for keys, which is pretty common too. I had in mind that the GIN opclass would do something clever like decompose the json into all the path-value tuples so I could do arbitrary path lookups for values. That might be possible in the future but it's not what we have today and what we have today is already better than hstore. I think we're better off committing this and moving forward with the contrib hstore2 wrapper which uses this infrastructure so people have a migration path. I don't think Josh is right to say it'll be fixed in 9.5. It'll be better in 9.5 because we have ambitious plans to continue improving in this direction. But it'll be even better in 9.6 and better again in 9.7. It'll never be fixed. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 14, 2014 at 5:10 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm on commit a3115f0d, which is just 2 days old, so I suppose this was not fixed yet. Try merging the feature branch now, which will get you commit 16923d, which you're missing. That was an open item for a while, which I only got around to fixing a few days ago. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 15.3.2014 02:03, Greg Stark wrote: On Fri, Mar 14, 2014 at 9:21 PM, Tomas Vondra t...@fuzzy.cz wrote: I'm not awfully familiar with the GIN code, but based on Alexander's feedback I presume fixing the GIN length limit (or rather removing it, as it's a feature, not a bug) is quite straightforward. Why not to at least consider that for 9.4, unless it turns more complex than expected? Don't get me wrong - I'm aware it's quite late in the last commitfest, and if it's deemed unacceptable / endandering 9.4 release, I'm not going to say a word. But if it's a simple patch ... Well I think the bigger picture is that the cases were we're getting this error it's because we're expecting too much from the GIN opclass. It's trying to index entire json objects as individual values which isn't really very useful. We're unlikely to go querying for rows where the value of a given key is a specific json object. Stupid question - so if I have a json like this: { a : { b : c}} the GIN code indexes {b : c} as a single value? And then takes c and indexes it as a single value too? Because otherwise I don't understand how the index could be used for queries with @ '{a : {b : c}}' conditions (i.e. path [a,b] with value c). H, if that's how it works, removing the size limit would be certainly more difficult than I thought. As I understand it Peter's right that in its current form the GIN opclass is only useful if you use it on an expression index on specific pieces of your json which are traditional non-nested hash tables. Or I suppose if you're really only concerned with the ? operator which looks for keys, which is pretty common too. Well, depends on how you define useful. With the sample dataset 'delicious' (see Peter's post) I can do this: SELECT doc FROM delicious WHERE doc @ '{title_detail : {value : TheaterMania}}'; with arbitrary paths, and I may create a GIN index to support such queries. And yes, it's much faster than GiST for example (by a factor of 1000). Yes, the GIN index is quite large (~560MB for a ~1.2GB table). I had in mind that the GIN opclass would do something clever like decompose the json into all the path-value tuples so I could do arbitrary path lookups for values. That might be possible in the future but it's not what we have today and what we have today is already better than hstore. I think we're better off committing this and moving forward with the contrib hstore2 wrapper which uses this infrastructure so people have a migration path. Yes, it's better than hstore - no doubt about that. The hierarchy and data types are great, and hstore has the same size limitation. I don't think Josh is right to say it'll be fixed in 9.5. It'll be better in 9.5 because we have ambitious plans to continue improving in this direction. But it'll be even better in 9.6 and better again in 9.7. It'll never be fixed. I don't dare to say what will be in 9.5 (not even thinking about the following versions). Assuming the GIN will remain for 9.4 as it is now (both opclasses), it would be nice if we could improve this in 9.5. I can live with custom opclasses in an extension, if there are some ... regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/14/2014 06:44 PM, Tomas Vondra wrote: Stupid question - so if I have a json like this: Not a stupid question, actually. In fact, I expect to answer it 400 or 500 times over the lifespan of 9.4. { a : { b : c}} the GIN code indexes {b : c} as a single value? And then takes c and indexes it as a single value too? I don't know that c is indexed separately. Because otherwise I don't understand how the index could be used for queries with @ '{a : {b : c}}' conditions (i.e. path [a,b] with value c). H, if that's how it works, removing the size limit would be certainly more difficult than I thought. Precisely. Hence, the Russian plans for VODKA. Well, depends on how you define useful. With the sample dataset 'delicious' (see Peter's post) I can do this: SELECT doc FROM delicious WHERE doc @ '{title_detail : {value : TheaterMania}}'; with arbitrary paths, and I may create a GIN index to support such queries. And yes, it's much faster than GiST for example (by a factor of 1000). Yes, the GIN index is quite large (~560MB for a ~1.2GB table). State of the art, actually. In MongoDB, the indexes are frequently several times larger than the raw data. So if ours are 50% the size, we're doing pretty good. On 15.3.2014 02:03, Greg Stark wrote: I don't think Josh is right to say it'll be fixed in 9.5. It'll be better in 9.5 because we have ambitious plans to continue improving in this direction. But it'll be even better in 9.6 and better again in 9.7. It'll never be fixed. Oh, no doubt. The important thing is that 9.4 will significantly broaden the class of applications for which our JSON support is useful, and allow us to remain relevant to an increasingly NoSQLish developer base. We're both showing progress and delivering features which are actually useful, even if they still have major limitations. Plus, you know, those features are useful to *me*, so I'm keen on them personally. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 14, 2014 at 6:44 PM, Tomas Vondra t...@fuzzy.cz wrote: Well, depends on how you define useful. With the sample dataset 'delicious' (see Peter's post) I can do this: SELECT doc FROM delicious WHERE doc @ '{title_detail : {value : TheaterMania}}'; with arbitrary paths, and I may create a GIN index to support such queries. And yes, it's much faster than GiST for example (by a factor of 1000). If you know ahead of time the entire nested value you can. So, if you attach some other data to the TheaterMania document, you had better know that too if you hope to write a query like this. You also have to index the entire table, where presumably with a little thought you could get away with a much smaller index. That strikes me as not very useful. Yes, the GIN index is quite large (~560MB for a ~1.2GB table). With the default opclass, without an expressional index, 100% of the data from the table appears in the index. Why do you think that's quite large? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 01:58:14PM -0700, Peter Geoghegan wrote: The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. Keep in mind that if we ship an index format, we are going to have trouble changing the layout because of pg_upgrade. pg_upgrade can mark the indexes as invalid and force users to reindex, but that is less than idea. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 6:15 AM, Bruce Momjian br...@momjian.us wrote: On Wed, Mar 12, 2014 at 01:58:14PM -0700, Peter Geoghegan wrote: The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. Keep in mind that if we ship an index format, we are going to have trouble changing the layout because of pg_upgrade. pg_upgrade can mark the indexes as invalid and force users to reindex, but that is less than idea. Well these are just normal gin and gist indexes. If we want to come up with new index operator classess we can still do that and keep the old ones if necessary. Even that seems pretty unlikely from past experience. I'm actually pretty sanguine even about keeping the GIST opclass. If it has bugs then the bugs only affect people who use this non-default opclass and we can fix them. It doesn't risk questioning any basic design choices in the patch. It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Fwiw I have a few questions -- but beware, I'm a complete neophyte when it comes to jsonb style document databases so these are more likely to represent misconceptions on my part than problems with jsonb. I naively though a gin index on a jsonb would help with queries like WHERE col-'prop' = 'val'. In fact it only seems to help with WHERE col ? 'prop'. To help with the former it looks like I need an expression index on col-'prop' is that right? There doesn't seem to be an operator that combines both a dereference and value test into a single operator so I don't think our index machinery can deal with this. Or am I supposed to use contains and construct a json object for the test? I also find it awkward that col-'prop' returns the json representation of the property. If it's text that means it's double-quoted. I would think that a user storing text in a json property would want a way to pull out the text that json property represents so he doesn't have to write col-'prop' = 'foo' and doesn't need to strip the quotes (and de-escape the string?) before displaying the value or passing it through other apis. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote: Well these are just normal gin and gist indexes. If we want to come up with new index operator classess we can still do that and keep the old ones if necessary. Even that seems pretty unlikely from past experience. I'm actually pretty sanguine even about keeping the GIST opclass. If it has bugs then the bugs only affect people who use this non-default opclass and we can fix them. It doesn't risk questioning any basic design choices in the patch. It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? A couple of thoughts from me: 1) We can evade length limitation if GIN index by truncating long values and setting recheck flag. We can introduce some indicator of truncated value like zero byte at the end. 2) jsonb_hash_ops can be extended to handle keys queries too. We can preserve one bit in hash as flag indicating whether it's a hash of key or hash of path to value. For sure, such index would be a bit larger. Also, jsonb_hash_ops can be split into two: with and without keys. -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 4:21 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote: Well these are just normal gin and gist indexes. If we want to come up with new index operator classess we can still do that and keep the old ones if necessary. Even that seems pretty unlikely from past experience. I'm actually pretty sanguine even about keeping the GIST opclass. If it has bugs then the bugs only affect people who use this non-default opclass and we can fix them. It doesn't risk questioning any basic design choices in the patch. It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? A couple of thoughts from me: 1) We can evade length limitation if GIN index by truncating long values and setting recheck flag. We can introduce some indicator of truncated value like zero byte at the end. 2) jsonb_hash_ops can be extended to handle keys queries too. We can preserve one bit in hash as flag indicating whether it's a hash of key or hash of path to value. For sure, such index would be a bit larger. Also, jsonb_hash_ops can be split into two: with and without keys. That's right ! Should we do these now, that's the question. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Fwiw the jsonb data doesn't actually seem to be any smaller than text json on this data set (this is avg(pg_column_size(col)) and I checked, they're both using the same amount of toast space) jsonb | json ---+--- 813.5 | 716.3 (1 row) It's still more than 7x faster in cpu costs though: stark=# select count(attrs-'properties'-'STREET') from citylots; count 196507 (1 row) Time: 1026.678 ms stark=# select count(attrs-'properties'-'STREET') from citylots_json; count 196507 (1 row) Time: 7418.010 ms -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 06:53 AM, Greg Stark wrote: I also find it awkward that col-'prop' returns the json representation of the property. If it's text that means it's double-quoted. I would think that a user storing text in a json property would want a way to pull out the text that json property represents so he doesn't have to write col-'prop' = 'foo' and doesn't need to strip the quotes (and de-escape the string?) before displaying the value or passing it through other apis. - returns dequoted text if the value it points to is a plain string. If it's not doing that then that's a bug. andrew=# select jsonb '{a:the string}' - 'a'; ?column? -- the string (1 row) andrew=# select jsonb '{a:the string}' - 'a' ; ?column? the string (1 row) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 08:42 AM, Greg Stark wrote: Fwiw the jsonb data doesn't actually seem to be any smaller than text json on this data set (this is avg(pg_column_size(col)) and I checked, they're both using the same amount of toast space) jsonb | json ---+--- 813.5 | 716.3 (1 row) That's expected, you save on whitespace, quotes and punctuation and spend on structural overhead (e.g. string lengths). The actual strings stored are the virtally the same. Numbers are stored as numerics, which might or might not be longer. Nulls and booleans are about a wash. It's still more than 7x faster in cpu costs though: stark=# select count(attrs-'properties'-'STREET') from citylots; count 196507 (1 row) Time: 1026.678 ms stark=# select count(attrs-'properties'-'STREET') from citylots_json; count 196507 (1 row) Time: 7418.010 ms That's also expected, it's one of the major benefits. With jsonb you're avoiding reparsing the json. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 1:08 PM, Andrew Dunstan and...@dunslane.net wrote: - returns dequoted text if the value it points to is a plain string. If it's not doing that then that's a bug. Sorry, I must have gotten confused between various tests. It does seem to be doing that. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Another question. Is Peter's branch up to date with jsonb_populate_record() ? From discussions on list it sounds like the plan was to get rid of the use_json_as_text argument but his patch still has it. (Tangentially, I wonder if it wouldn't be possible to make this a plain cast. I'm not sure but I think it's possible to have a cast to a polymorphic type and peek at runtime at the record definition to determine what to do). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/13/2014 10:49 AM, Greg Stark wrote: Another question. Is Peter's branch up to date with jsonb_populate_record() ? From discussions on list it sounds like the plan was to get rid of the use_json_as_text argument but his patch still has it. Yes, we're not changing that, and some people like it anyway. The API is intentionally the same as the legacy json_populate_record API. (Tangentially, I wonder if it wouldn't be possible to make this a plain cast. I'm not sure but I think it's possible to have a cast to a polymorphic type and peek at runtime at the record definition to determine what to do). If you can simplify it be my guest. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 13.3.2014 13:28, Oleg Bartunov wrote: On Thu, Mar 13, 2014 at 4:21 PM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Mar 13, 2014 at 1:21 PM, Greg Stark st...@mit.edu wrote: Well these are just normal gin and gist indexes. If we want to come up with new index operator classess we can still do that and keep the old ones if necessary. Even that seems pretty unlikely from past experience. I'm actually pretty sanguine even about keeping the GIST opclass. If it has bugs then the bugs only affect people who use this non-default opclass and we can fix them. It doesn't risk questioning any basic design choices in the patch. It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? A couple of thoughts from me: 1) We can evade length limitation if GIN index by truncating long values and setting recheck flag. We can introduce some indicator of truncated value like zero byte at the end. 2) jsonb_hash_ops can be extended to handle keys queries too. We can preserve one bit in hash as flag indicating whether it's a hash of key or hash of path to value. For sure, such index would be a bit larger. Also, jsonb_hash_ops can be split into two: with and without keys. That's right ! Should we do these now, that's the question. Yeah, those are basically the two solutions I proposed a few messages back in this thread. I'm pleased I haven't proposed a complete nonsense. The question whether do that now or wait for 9.5 is a tough one. Doing both for 9.4 is certainly stretching the commitfest to it's limits :-( My impression is that while (2) means rather significant implementation changes in jsonb_hash_ops, (1) is rather straightforward. Is that correct (e.g. how's the truncation going to work with arrays?). If that's true, I'd like propose doing (1) for 9.4 and leaving (2) to 9.5. I'm ready to spend non-trivial amount of time testing the changes required in (1). regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 4:18 AM, Peter Geoghegan p...@heroku.com wrote: * Extensive additional documentation. References to the very new JSON RFC. I think that this revision is in general a lot more coherent, and I found that reflecting on what idiomatic usage should look like while writing the documentation brought clarity to my thoughts on how the code should be structured. The documentation is worth a read if you want to get a better sense of what the patch is about relatively quickly. The attached documentation is excellent -- wow. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote: It does sound like the main question here is which opclass should be the default. From the discussion there's a jsonb_hash_ops which works on all input values but supports fewer operators and a jsonb_ops which supports more operators but can't handle json with larger individual elements. Perhaps it's better to make jsonb_hash_ops the default so at least it's always safe to create a default gin index? Personally, I don't think it's a good idea to change the default. I have yet to be convinced that if you hit the GIN limitation it's an indication of anything other than that you need to reconsider your indexing choices (how often have we heard that complaint of GIN before in practice?). Even if you don't hit the limitation directly, with something like jsonb_hash_ops you're still hashing a large nested structure, very probably uselessly. Are you really going to look for an exact match to an elaborate nested structure? I would think, probably not. Now, as Alexander says, there might be a role for another (jsonb_hash_ops) opclass that separately indexes values only. I still think that by far the simplest solution is to use expressional indexes, because we index key values and array element values indifferently. Of course, nothing we have here precludes the development of such an opclass. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12 Březen 2014, 0:41, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx All index AMs have similar restrictions. Yes, I know and I have no problem with restrictions in general. You may run into similar issues with btree indexes on text columns with long text, for example. The thing is that people don't generally index text directly, because it usually does not make much sense, but using tsvector etc. But with jsonb it's more likely because indexing is one of the goodies (at least for me). And the discussions with several people interested in storing json data I had recently went often like this: me: It seems we'll have a better json datatype in 9.4. them: Nice! me: And it will be possible to do searches on arbitrary keys. them: Yay! me: And we actually got pretty significant improvements in GIN indexes. them: Awesome! me: But the values you may index need to be less than ~1500B. them: Bummer :-( me: Well, you can use GIST then. A good example of such header is dkim-signature which basically contains the whole message digitally signed with DKIM. The signature tends to be long and non-compressible, thanks to the signature. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and GIN improvements) will run into this. Maybe not immediately, but eventully they'll try to insert a jsonb with long value, and it will fail ... The jsonb_hash_ops operator class just stores a 32-bit integer hash value (it always sets the recheck flag, which only some of the other default GIN opclass' strategies do). It only supports containment, and not the full variety of operators that the default opclass supports, which is why it isn't the default. I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. OK, I'll look into the jsonb_hash_ops - that sounds more or less like what I was thinking about (and sure, storing hashes makes some operations impossible to support). The other thing I was thinking about is introducing some kind of upper limit for the value length - e.g. index just the first 1kB, or something like that. My experience is most values are way shorter, or actually differ in the first 1kB, so this should allow most decisions to be made. But I'm not really that familiar with how GIN works, so maybe this is nonsense. The more I think about it, the more inclined I am to lose GiST support entirely for the time being. It lets us throw out about 700 lines of C code, which is a very significant fraction of the total, removes the one open bug, and removes the least understood part of the code. The GiST opclass is not particularly compelling for this. I disagree with that. I see GiST as a simple fallback option for the cases I described. I wasn't able to create a GIN index because of exceeding the max item length, but GiST created just fine. It was considerably slower, but it worked. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12 Březen 2014, 0:51, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote: I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. By which I mean: index nested items using an expressional GIN index. I'm still not sure how would that look. Does that mean I'd have to create multiple GIN indexes - one for each possible key or something like that? Can you give an example? regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. On Wed, Mar 12, 2014 at 5:15 PM, Tomas Vondra t...@fuzzy.cz wrote: On 12 Březen 2014, 0:41, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx All index AMs have similar restrictions. Yes, I know and I have no problem with restrictions in general. You may run into similar issues with btree indexes on text columns with long text, for example. The thing is that people don't generally index text directly, because it usually does not make much sense, but using tsvector etc. But with jsonb it's more likely because indexing is one of the goodies (at least for me). And the discussions with several people interested in storing json data I had recently went often like this: me: It seems we'll have a better json datatype in 9.4. them: Nice! me: And it will be possible to do searches on arbitrary keys. them: Yay! me: And we actually got pretty significant improvements in GIN indexes. them: Awesome! me: But the values you may index need to be less than ~1500B. them: Bummer :-( me: Well, you can use GIST then. A good example of such header is dkim-signature which basically contains the whole message digitally signed with DKIM. The signature tends to be long and non-compressible, thanks to the signature. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and GIN improvements) will run into this. Maybe not immediately, but eventully they'll try to insert a jsonb with long value, and it will fail ... The jsonb_hash_ops operator class just stores a 32-bit integer hash value (it always sets the recheck flag, which only some of the other default GIN opclass' strategies do). It only supports containment, and not the full variety of operators that the default opclass supports, which is why it isn't the default. I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. OK, I'll look into the jsonb_hash_ops - that sounds more or less like what I was thinking about (and sure, storing hashes makes some operations impossible to support). The other thing I was thinking about is introducing some kind of upper limit for the value length - e.g. index just the first 1kB, or something like that. My experience is most values are way shorter, or actually differ in the first 1kB, so this should allow most decisions to be made. But I'm not really that familiar with how GIN works, so maybe this is nonsense. The more I think about it, the more inclined I am to lose GiST support entirely for the time being. It lets us throw out about 700 lines of C code, which is a very significant fraction of the total, removes the one open bug, and removes the least understood part of the code. The GiST opclass is not particularly compelling for this. I disagree with that. I see GiST as a simple fallback option for the cases I described. I wasn't able to create a GIN index because of exceeding the max item length, but GiST created just fine. It was considerably slower, but it worked. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote: I'm still not sure how would that look. Does that mean I'd have to create multiple GIN indexes - one for each possible key or something like that? Can you give an example? It could mean that you're obliged to create multiple indexes, yes. For an example, and to get a better sense of what I mean, look at the documentation in the patch. The idea that you're going to create one index on a jsonb, and it's going to be able to usefully index a lot of different queries doesn't seem practical for most use-cases. Mostly, people will have fairly homogeneous json documents, and they'll want to index certain nested fields common to all or at least a large majority of those documents. By indexing entire jsonb datums, do you hope to get much benefit out of the indexed values (as opposed to keys) being stored (in serialized form) in the GIN index? Because you *are* indexing a large nested structure as a value. Is that large nested structure going to appear in your query predicate, or are you just going to subscript the jsonb to get to the level that's of interest to query that? I'm pretty sure that people want the latter. Are you sure that your complaint isn't just that the default GIN opclass indexes values (as distinct from keys) that are large and unwieldy, and not terribly useful? I don't think expressional indexes are some kind of unfortunate work around for a jsonb limitation. I think that they're the natural way to approach indexing a nested structure in Postgres. MongoDB, for example, does not magically index everything. You're still required to make choices about indexing that consider the access patterns. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 12:10 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? Regular user may just index all keys. I mean, that json can contains keys, which are not searched, so it's not needed to index them and save index size. We probably could provide option in CREATE INDEX to specify what to index and what not index, but it require planner to know that information. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/12/2014 04:10 PM, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? The problem is when do you know they are interesting? One major use case for using treeish data types in the first place is that you don't know when you're designing the database exactly what shape the data will be. If you don't know that, then how are you supposed to know what in it will be interesting? It's somewhat analogous to full text indexing, where we don't know in advance what phrases or words will be interesting. Here, a key is the equivalent of a word and a key path or subpath is the equivalent of a phrase. Maybe I'm dreaming, since I have no idea how to go about this sort of indexing, but it's where I'd like to see lots of effort. I agree with Oleg that we need to be very creative about jsonb indexing. One of my hopes is that by going down the road we are on, we'll get much wider interest in this, and that both ideas and money might flow towards addressing it in a way that we probably wouldn't have seen otherwise. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Andrew, Peter: Just so I'm clear on the limits here, lemme make sure I understand this: a) GIN indexing is limited to ~~1500chars b) The value, which includes everything other than the top level set of keys, is one item as far as GIN is concerned. Therefore: we are limited to indexing JSON where nothing below a top-level key is more than 1500bytes? I'm asking for documentation purposes. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 1:37 PM, Andrew Dunstan and...@dunslane.net wrote: One major use case for using treeish data types in the first place is that you don't know when you're designing the database exactly what shape the data will be. If you don't know that, then how are you supposed to know what in it will be interesting? It's somewhat analogous to full text indexing, where we don't know in advance what phrases or words will be interesting. Here, a key is the equivalent of a word and a key path or subpath is the equivalent of a phrase. You don't know exactly how, but you have some idea. The major benefit is that you can add new things to new documents as the need arises, and that's not a big deal, nor does it require a migration with DDL. If we continue to take MongoDB as representative of how people will use jsonb, they pretty strongly encourage the idea that you have to have some structure or design. Google mongodb schema design to see what I mean - you'll find plenty. It has more to do with making querying the data possible than anything else. There is a limited amount you can do with a bunch of documents that share little in common in terms of their structure - what does a query (that can use an index just in principle) even look like there? The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/12/2014 04:58 PM, Peter Geoghegan wrote: In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. I quite agree, didn't mean to suggest otherwise. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 20:40, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote: I'm still not sure how would that look. Does that mean I'd have to create multiple GIN indexes - one for each possible key or something like that? Can you give an example? It could mean that you're obliged to create multiple indexes, yes. For an example, and to get a better sense of what I mean, look at the documentation in the patch. OK, will do. The idea that you're going to create one index on a jsonb, and it's going to be able to usefully index a lot of different queries doesn't seem practical for most use-cases. Mostly, people will have fairly homogeneous json documents, and they'll want to index certain nested fields common to all or at least a large majority of those documents. I think that's unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. But I think this is a feedback loop too - what if many users actually want that functionality, but realize that expression indexes are not sufficient for their needs and thus don't even try (and so we don't hear about them)? And my experience is that this is actualy one of the very cool hstore features - being able to index the whole structure and then do arbitrary queries over that. The only reason why I'm looking at jsonb is that it the improved support for data types (especially arrays). So I have my doubts about the claims that users have homogenous documents and only want to index some fields with expression indexes. By indexing entire jsonb datums, do you hope to get much benefit out of the indexed values (as opposed to keys) being stored (in serialized form) in the GIN index? Because you *are* indexing a large nested structure as a value. Is that large nested structure going to appear in your query predicate, or are you just going to subscript the jsonb to get to the level that's of interest to query that? I'm pretty sure that people want the latter. Are you sure that your complaint isn't just that the default GIN opclass indexes values (as distinct from keys) that are large and unwieldy, and not terribly useful? No, I don't expect a large nested structure to appear in the query. And I expect most people won't need that, although I can imagine queries @ doing that (not sure if that checks for equality or 'subset'). But I'm not sure I understand how's this related to my original post? All I was asking whether it wouldn't be enough to store a hash instead of the original value, i.e. instead of this: {from : j...@example.com, to : j...@example.com, content-type : text/plain; charset=us-ascii, dkim-signature : vry long value } this {129812 : 29382, 459821 : 1029381, 21083 : 102941, 111390 : 129010292} which would solve issues with the long values and might still support the queries (with recheck, of course). I don't know if that's what jsonb_hash_ops do or if it's even possible / compatible with GIN. I don't think expressional indexes are some kind of unfortunate work around for a jsonb limitation. I think that they're the natural way to approach indexing a nested structure in Postgres. MongoDB, for example, does not magically index everything. You're still required to make choices about indexing that consider the access patterns. For many usecases, expressional indexes are the right tool. But not for all and I see no reason to just throw some tools away. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 21:58, Peter Geoghegan wrote: The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. I think this very depends on the definition of full text search. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. +1 Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote: I think that's unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. Maybe, but what do you want me to do to help them? Indexing a typical jsonb field is a bad idea, unless you really do want something essentially equivalent to full text search (which could be justified), or unless you know ahead of time that your documents are not going to be heavily nested. The whole basis of your complaints seems to be that people won't know that at all. For many usecases, expressional indexes are the right tool. But not for all and I see no reason to just throw some tools away. If the tool you're talking about throwing away is the GiST opclass, I do not propose to throw that away. I don't think it's important enough to justify inclusion in our first cut at this, especially given the fact that the code has bugs, and is quite a bit more complex than GIN. What's wrong with those reasons? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 21:55, Josh Berkus wrote: Andrew, Peter: Just so I'm clear on the limits here, lemme make sure I understand this: a) GIN indexing is limited to ~~1500chars The exact message I get is this: ERROR: index row size 1944 exceeds maximum 1352 for index tmp_idx so it's 1352B. But IIRC this is closely related to block size, so with larger block sizes you'll get different limits. Also, this is a limit on compressed value, which makes it less user-friendly as it's difficult to predict whether the row is OK or not :-( And I just discovered this: create table tmp (val jsonb); create index tmp_gin_idx on tmp using gin (val); insert into tmp select ('{z : ' || repeat('z', 100) || '}')::jsonb; which tries to insert a well-compressible string ('z' repeated 1e6-times), and fails with this: ERROR: index row requires 11472 bytes, maximum size is 8191 So I think it's quite difficult to give simple and exact explanation in the docs, other than there are limits, but it's difficult to say when you hit them. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
* Tomas Vondra (t...@fuzzy.cz) wrote: So I think it's quite difficult to give simple and exact explanation in the docs, other than there are limits, but it's difficult to say when you hit them. Arrays have more-or-less the same issue... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 22:43, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote: I think that's unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. Maybe, but what do you want me to do to help them? Indexing a typical jsonb field is a bad idea, unless you really do want something essentially equivalent to full text search (which could be justified), or unless you know ahead of time that your documents are not going to be heavily nested. The whole basis of your complaints seems to be that people won't know that at all. Well, I would be quite happy with the GIN indexing without the limit I ran into. I don't think we need to invent something entirely new. You're right that the index is pretty futile with a condition matching field/value combination. But what if I'm doing a query with multiple such conditions, and the combination matches just a small fraction of rows? GIN index works with that (and the patches from Alexander improve this case tremendously, IIRC). I still don't understand how's this similar to fulltext - that seems pretty unsuitable for a treeish structure, assuming you can't flatten it. Which you can't, if the queries use paths to access just parts of the json value. For many usecases, expressional indexes are the right tool. But not for all and I see no reason to just throw some tools away. If the tool you're talking about throwing away is the GiST opclass, I do not propose to throw that away. I don't think it's important enough to justify inclusion in our first cut at this, especially given the fact that the code has bugs, and is quite a bit more complex than GIN. What's wrong with those reasons? Meh, I accidentally mixed two responses :-/ I have no problem with expression indexes, but it's not a good solution to all problems. I certainly can't use them to achieve what I'd like and I disagree with your assumptions that it doesn't make sense to index everything / non-interesting keys, or that the documents have well-defined structure. I can live with larger / less efficient indexes on all fields. Regarding GiST - I understand your concerns about complexity, and you may be right that not shipping it now is prefferable to shipping it with bugs. The thing is it doesn't have issues with the value lengths, which prevents me from using GIN, and although GiST is slower, it's at least some indexing. But maybe jsonb_hash_ops will work, I haven't tried yet. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 11, 2014 at 5:19 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Mar 10, 2014 at 4:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: Here it is. So it looks like what you have here is analogous to the other problems that I fixed with both GiST and GIN. That isn't surprising, and this does fix my test-case. I'm not terribly happy about the lack of explanation for the hashing in that loop, though. Why use COMP_CRC32() at all, for one thing? Why do this for non-primitive jsonb hashing? COMP_CRC32(stack-hash_state, PATH_SEPARATOR, 1); Where PATH_SEPARATOR is: #define PATH_SEPARATOR (\0) Actually, come to think of it, why not just use one hashing function everywhere? i.e., jsonb_hash(PG_FUNCTION_ARGS)? It's already very similar. Pretty much every hash operator support function 1 (i.e. a particular type's hash function) is implemented with hash_any(). Can't we just do the same here? In any case it isn't obvious why the requirements for those two things (the hashing mechanism used by the jsonb_hash_ops GIN opclass, and the hash operator class support function 1 hash function) cannot be the same thing. It's because CRC32 interface allows incremental calculation while hash_any requires single chunk of memory. I don't think that unfolding everything is good idea. But we could implement incremental interface for hash_any. -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb and nested hstore
Hi, I've spent a few hours stress-testing this a bit - loading a mail archive with ~1M of messages (with headers stored in a jsonb column) and then doing queries on that. Good news - no crashes or any such issues so far. The queries that I ran manually seem to return sane results. The only problem I ran into is with limited index row size with GIN indexes. I understand it's not a bug, but I admit I haven't realized I might run into it in this case ... The data I used for testing is just a bunch of e-mail messages, with headers stored as jsonb, so each row has something like this in headers column: { from : John Doe j...@example.com, to : [Jane Doe j...@example.com, Jack Doe j...@example.com], cc : ..., bcc : ..., ... various other headers ... } The snag is that some of the header values may be very long, exceeding the limit of 1352 bytes and causing errors like this: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx A good example of such header is dkim-signature which basically contains the whole message digitally signed with DKIM. The signature tends to be long and non-compressible, thanks to the signature. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and GIN improvements) will run into this. Maybe not immediately, but eventully they'll try to insert a jsonb with long value, and it will fail ... With btree indexes on text I would probably create an index on substr(column,0,1000) or something like that, but doing that with JSON seems a bit strange. I assume we need to store the actual values in the GIN index (so a hash is not sufficient), right? GIST indexes work, but with that I have to give up the significant performance gains that we got thanks to Alexander's GIN patches. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx All index AMs have similar restrictions. A good example of such header is dkim-signature which basically contains the whole message digitally signed with DKIM. The signature tends to be long and non-compressible, thanks to the signature. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and GIN improvements) will run into this. Maybe not immediately, but eventully they'll try to insert a jsonb with long value, and it will fail ... The jsonb_hash_ops operator class just stores a 32-bit integer hash value (it always sets the recheck flag, which only some of the other default GIN opclass' strategies do). It only supports containment, and not the full variety of operators that the default opclass supports, which is why it isn't the default. I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. The more I think about it, the more inclined I am to lose GiST support entirely for the time being. It lets us throw out about 700 lines of C code, which is a very significant fraction of the total, removes the one open bug, and removes the least understood part of the code. The GiST opclass is not particularly compelling for this. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote: I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. By which I mean: index nested items using an expressional GIN index. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 1:18 PM, Peter Geoghegan p...@heroku.com wrote: * The jsonb_hash_ops non-default GIN opclass is broken. It has its own idiosyncratic notion of what constitutes containment, that sees it only return, say, jsonb arrays that have a matching string as their leftmost element (if we ask it if it contains within it another array with the same string). Because of the limited number of indexable operators (only @), I'd put this opclass in the same category as GiST in terms of my willingness to forgo it for a release, even if it did receive a loud applause at pgConf.EU. Again, it might be some disparity between the opertors as they existed in hstore2 at one time, and as they exist in the core code now, but I doubt it, not least since the regression tests didn't pick this up, and it's such a basic thing. Perhaps Oleg and Teodor just need to explain this to me. I din't get comment about leftmost element. There is absolutely no distinguish between array elements. All elements are extracted into same keys independent of their indexes. It seems to have no change since I wrote hstore_hash_ops. Could you share test case to illustrate what you mean? -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 3:00 AM, Alexander Korotkov aekorot...@gmail.com wrote: I din't get comment about leftmost element. There is absolutely no distinguish between array elements. All elements are extracted into same keys independent of their indexes. It seems to have no change since I wrote hstore_hash_ops. Could you share test case to illustrate what you mean? I don't have time to post that at the moment, but offhand I *think* your confusion may be due to the fact that the json_hash_ops opclass (as I call it) was previously consistent with the behavior of the other GIN opclass (the default). The problem is that they (well, at least the default GIN and GiST opclasses) were inconsistent with how the containment operator behaved in respect of jsonb array elements generally. Here is the commit on our feature branch where I fixed the problem for the default GIN opclass: https://github.com/feodor/postgres/commit/6f5e4fe9fc34f9512919b1c8b6a54952ab288640s If it doesn't explain the problem, you may still wish to comment on the correctness of this fix. I am still waiting on feedback from Oleg and Teodor. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 3:19 AM, Peter Geoghegan p...@heroku.com wrote: I don't have time to post that at the moment, but offhand I *think* your confusion may be due to the fact that the json_hash_ops opclass (as I call it) was previously consistent with the behavior of the other GIN opclass (the default). The problem is that they (well, at least the default GIN and GiST opclasses) were inconsistent with how the containment operator behaved in respect of jsonb array elements generally. Sorry, I realize now that that must be incorrect. Still, please take a look at the commit linked to. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 3:21 AM, Peter Geoghegan p...@heroku.com wrote: Sorry, I realize now that that must be incorrect. Still, please take a look at the commit linked to. To be clear, I mean that my explanation of why this was missed before was incorrect, not my contention that it's a problem right now (for whatever reason). I fat-fingered the URL that linked to the GIN opclass bugfix commit: https://github.com/feodor/postgres/commit/6f5e4fe9fc34f9512919b1c8b6a54952ab288640 -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 2:19 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Mar 10, 2014 at 3:00 AM, Alexander Korotkov aekorot...@gmail.com wrote: I din't get comment about leftmost element. There is absolutely no distinguish between array elements. All elements are extracted into same keys independent of their indexes. It seems to have no change since I wrote hstore_hash_ops. Could you share test case to illustrate what you mean? I don't have time to post that at the moment, but offhand I *think* your confusion may be due to the fact that the json_hash_ops opclass (as I call it) was previously consistent with the behavior of the other GIN opclass (the default). The problem is that they (well, at least the default GIN and GiST opclasses) were inconsistent with how the containment operator behaved in respect of jsonb array elements generally. Here is the commit on our feature branch where I fixed the problem for the default GIN opclass: https://github.com/feodor/postgres/commit/6f5e4fe9fc34f9512919b1c8b6a54952ab288640s If it doesn't explain the problem, you may still wish to comment on the correctness of this fix. I am still waiting on feedback from Oleg and Teodor. Apparently, there is bug in calculation of hashes. Array elements were hashed incrementally while each of them should be hashed separately. That cause an effect of distinguishing array elements by their indexes. Not sure about when this bug was added. Fix is attached. -- With best regards, Alexander Korotkov. jsonb-hash-ops-fix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 3:47 AM, Alexander Korotkov aekorot...@gmail.com wrote: Fix is attached. Could you post a patch with regression tests, please? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 3:04 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Mar 10, 2014 at 3:47 AM, Alexander Korotkov aekorot...@gmail.com wrote: Fix is attached. Could you post a patch with regression tests, please? Here it is. -- With best regards, Alexander Korotkov. jsonb-hash-ops-fix-2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/10/2014 05:18 AM, Peter Geoghegan wrote: On Fri, Mar 7, 2014 at 9:00 AM, Bruce Momjian br...@momjian.us wrote: OK, it sounds like the adjustments are minimal, like not using the high-order bit. Attached patch is a refinement of the work of Oleg, Teodor and Andrew. Revisions are mostly my own, although Andrew contributed too. Changes include: * Extensive relocation, and moderate restructuring of code. Many comments added, while many existing comments were copy-edited. Nothing remains in contrib. jsonb is a distinct, in-core type with no user-visible relationship to hstore. There is no code dependency between the two. The amount of code redundancy this turned out to create (between jsonb and an unchanged hstore) is, in my estimation, quite acceptable. * B-Tree and hash operator classes for the core type are included. A GiST operator class, and two GIN operator classes are also included. Obviously this is where I spent most time by far. * Everything else that was in hstore in the last revision (the complement of the hstore2 opclasses) is removed entirely. The patch is much smaller. If we just consider code (excluding tests and documentation), the diffstat seems far more manageable: Thanks for your work on this. It's just occurred to me that we'll need to add hstore_to_jsonb functions and a cast to match the hstore_to_json functions and cast. That should be fairly simple - I'll work on that. It need not hold up progress with what's in this patch. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/10/2014 10:50 AM, Andrew Dunstan wrote: Thanks for your work on this. It's just occurred to me that we'll need to add hstore_to_jsonb functions and a cast to match the hstore_to_json functions and cast. That should be fairly simple - I'll work on that. It need not hold up progress with what's in this patch. Here's a patch sans docs for this, to be applied on top of Peter's patch. It's actually kinda useful as it demonstrates how non-jsonb code can construct jsonb values directy. cheers andrew diff --git a/contrib/hstore/Makefile b/contrib/hstore/Makefile index 43b7e5f..bf21c65 100644 --- a/contrib/hstore/Makefile +++ b/contrib/hstore/Makefile @@ -5,7 +5,8 @@ OBJS = hstore_io.o hstore_op.o hstore_gist.o hstore_gin.o hstore_compat.o \ crc32.o EXTENSION = hstore -DATA = hstore--1.2.sql hstore--1.1--1.2.sql hstore--1.0--1.1.sql \ +DATA = hstore--1.3.sql hstore--1.2--1.3.sql \ + hstore--1.2.sql hstore--1.1--1.2.sql hstore--1.0--1.1.sql \ hstore--unpackaged--1.0.sql REGRESS = hstore diff --git a/contrib/hstore/expected/hstore.out b/contrib/hstore/expected/hstore.out index 2114143..9749e45 100644 --- a/contrib/hstore/expected/hstore.out +++ b/contrib/hstore/expected/hstore.out @@ -1453,7 +1453,7 @@ select count(*) from testhstore where h = 'pos=98, line=371, node=CBA, indexe 1 (1 row) --- json +-- json and jsonb select hstore_to_json('a key =1, b = t, c = null, d= 12345, e = 012345, f= 1.234, g= 2.345e+4'); hstore_to_json - @@ -1472,6 +1472,24 @@ select hstore_to_json_loose('a key =1, b = t, c = null, d= 12345, e = 012 {b: true, c: null, d: 12345, e: 012345, f: 1.234, g: 2.345e+4, a key: 1} (1 row) +select hstore_to_jsonb('a key =1, b = t, c = null, d= 12345, e = 012345, f= 1.234, g= 2.345e+4'); + hstore_to_jsonb +- + {b: t, c: null, d: 12345, e: 012345, f: 1.234, g: 2.345e+4, a key: 1} +(1 row) + +select cast( hstore 'a key =1, b = t, c = null, d= 12345, e = 012345, f= 1.234, g= 2.345e+4' as jsonb); + jsonb +- + {b: t, c: null, d: 12345, e: 012345, f: 1.234, g: 2.345e+4, a key: 1} +(1 row) + +select hstore_to_jsonb_loose('a key =1, b = t, c = null, d= 12345, e = 012345, f= 1.234, g= 2.345e+4'); + hstore_to_jsonb_loose +--- + {b: true, c: null, d: 12345, e: 012345, f: 1.234, g: 23450, a key: 1} +(1 row) + create table test_json_agg (f1 text, f2 hstore); insert into test_json_agg values ('rec1','a key =1, b = t, c = null, d= 12345, e = 012345, f= 1.234, g= 2.345e+4'), ('rec2','a key =2, b = f, c = null, d= -12345, e = 012345.6, f= -1.234, g= 0.345e-4'); diff --git a/contrib/hstore/hstore--1.2--1.3.sql b/contrib/hstore/hstore--1.2--1.3.sql new file mode 100644 index 000..0a70560 --- /dev/null +++ b/contrib/hstore/hstore--1.2--1.3.sql @@ -0,0 +1,17 @@ +/* contrib/hstore/hstore--1.2--1.3.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use ALTER EXTENSION hstore UPDATE TO '1.3' to load this file. \quit + +CREATE FUNCTION hstore_to_jsonb(hstore) +RETURNS jsonb +AS 'MODULE_PATHNAME', 'hstore_to_jsonb' +LANGUAGE C IMMUTABLE STRICT; + +CREATE CAST (hstore AS jsonb) + WITH FUNCTION hstore_to_jsonb(hstore); + +CREATE FUNCTION hstore_to_jsonb_loose(hstore) +RETURNS jsonb +AS 'MODULE_PATHNAME', 'hstore_to_jsonb_loose' +LANGUAGE C IMMUTABLE STRICT; diff --git a/contrib/hstore/hstore--1.3.sql b/contrib/hstore/hstore--1.3.sql new file mode 100644 index 000..995ade1 --- /dev/null +++ b/contrib/hstore/hstore--1.3.sql @@ -0,0 +1,550 @@ +/* contrib/hstore/hstore--1.3.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use CREATE EXTENSION hstore to load this file. \quit + +CREATE TYPE hstore; + +CREATE FUNCTION hstore_in(cstring) +RETURNS hstore +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION hstore_out(hstore) +RETURNS cstring +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION hstore_recv(internal) +RETURNS hstore +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE FUNCTION hstore_send(hstore) +RETURNS bytea +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE; + +CREATE TYPE hstore ( +INTERNALLENGTH = -1, +INPUT = hstore_in, +OUTPUT = hstore_out, +RECEIVE = hstore_recv, +SEND = hstore_send, +STORAGE =
Re: [HACKERS] jsonb and nested hstore
On Mon, Mar 10, 2014 at 4:19 AM, Alexander Korotkov aekorot...@gmail.com wrote: Here it is. So it looks like what you have here is analogous to the other problems that I fixed with both GiST and GIN. That isn't surprising, and this does fix my test-case. I'm not terribly happy about the lack of explanation for the hashing in that loop, though. Why use COMP_CRC32() at all, for one thing? Why do this for non-primitive jsonb hashing? COMP_CRC32(stack-hash_state, PATH_SEPARATOR, 1); Where PATH_SEPARATOR is: #define PATH_SEPARATOR (\0) Actually, come to think of it, why not just use one hashing function everywhere? i.e., jsonb_hash(PG_FUNCTION_ARGS)? It's already very similar. Pretty much every hash operator support function 1 (i.e. a particular type's hash function) is implemented with hash_any(). Can't we just do the same here? In any case it isn't obvious why the requirements for those two things (the hashing mechanism used by the jsonb_hash_ops GIN opclass, and the hash operator class support function 1 hash function) cannot be the same thing. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 10:33 PM, Bruce Momjian br...@momjian.us wrote: On Thu, Mar 6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. OK, if we are going with an unchanged hstore in contrib and a new JSONB, there is no reason to wack around JSONB to be binary compatible with the old hstore format. What sacrifices did we need to make to have JSBONB be binary compatible with hstore, can those sacrifices be removed, and can that be done in time for 9.4? Also, *) what hstore2 features (if any) that are not already reflected in the jsonb type are going to be moved to josnb for 9.4? *) if the answer above is anything but 'nothing', what hstore-isms are going to be adjusted in the process of doing so? Presumably there would be same function name changes to put them in the jsonb style but also the hstore sytnax ('=') is going to be embedded in some of the search operators and possibly other things. Is that going change? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 11:33 PM, Bruce Momjian wrote: On Thu, Mar 6, 2014 at 09:50:56PM +0400, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. OK, if we are going with an unchanged hstore in contrib and a new JSONB, there is no reason to wack around JSONB to be binary compatible with the old hstore format. What sacrifices did we need to make to have JSBONB be binary compatible with hstore, can those sacrifices be removed, and can that be done in time for 9.4? IIRC The sacrifice was one bit in the header (i.e. in the first int after the varlena header). We could now repurpose that (for example if we ever decided to use a new format). Oleg and Teodor made most of the adjustments on the hstore(2) side (e.g. providing for scalar roots, providing for json typing of scalars so everything isn't just a string). Can the architecture be changed? No. If we think it's not good enough we would have to kiss jsonb goodbye for 9.4 and go back to the drawing board. But I haven't seen any such suggestion from anyone who has been reviewing it (e.g. Andres or Peter). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 7, 2014 at 11:35:41AM -0500, Andrew Dunstan wrote: IIRC The sacrifice was one bit in the header (i.e. in the first int after the varlena header). We could now repurpose that (for example if we ever decided to use a new format). Oleg and Teodor made most of the adjustments on the hstore(2) side (e.g. providing for scalar roots, providing for json typing of scalars so everything isn't just a string). Can the architecture be changed? No. If we think it's not good enough we would have to kiss jsonb goodbye for 9.4 and go back to the drawing board. But I haven't seen any such suggestion from anyone who has been reviewing it (e.g. Andres or Peter). We are going to be stuck with the JSONB binary format we ship in 9.4 so I am asking if there are things we should do to improve it, now that we know we don't need backward compatibility. If they can be done for 9.4, great, if not, we have to decide if these suboptimal cases are enough for us to delay the data type until 9.5. I don't know the answer, but I have to ask the question. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/07/2014 11:45 AM, Bruce Momjian wrote: On Fri, Mar 7, 2014 at 11:35:41AM -0500, Andrew Dunstan wrote: IIRC The sacrifice was one bit in the header (i.e. in the first int after the varlena header). We could now repurpose that (for example if we ever decided to use a new format). Oleg and Teodor made most of the adjustments on the hstore(2) side (e.g. providing for scalar roots, providing for json typing of scalars so everything isn't just a string). Can the architecture be changed? No. If we think it's not good enough we would have to kiss jsonb goodbye for 9.4 and go back to the drawing board. But I haven't seen any such suggestion from anyone who has been reviewing it (e.g. Andres or Peter). We are going to be stuck with the JSONB binary format we ship in 9.4 so I am asking if there are things we should do to improve it, now that we know we don't need backward compatibility. If they can be done for 9.4, great, if not, we have to decide if these suboptimal cases are enough for us to delay the data type until 9.5. I don't know the answer, but I have to ask the question. AFAIK, there is no sacrifice of optimality. hstore2 and jsonb were essentially two ways of spelling the same data, the domains were virtually identical (hstore might have been a bit more liberal about numeric input). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Fri, Mar 7, 2014 at 11:57:48AM -0500, Andrew Dunstan wrote: If they can be done for 9.4, great, if not, we have to decide if these suboptimal cases are enough for us to delay the data type until 9.5. I don't know the answer, but I have to ask the question. AFAIK, there is no sacrifice of optimality. hstore2 and jsonb were essentially two ways of spelling the same data, the domains were virtually identical (hstore might have been a bit more liberal about numeric input). OK, it sounds like the adjustments are minimal, like not using the high-order bit. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Mar 6, 2014, at 1:51 AM, Peter Geoghegan p...@heroku.com wrote: It's true for perl. Syntax of hstore is close to hash/array syntax and it's easy serialize/deserialize hstore to/from perl. Syntax of hstore was inspired by perl. I understand that. There is a module on CPAN called Pg::hstore that will do this; it appears to have been around since 2011. I don't use Perl, so I don't know a lot about it. Perhaps David Wheeler has an opinion on the value of Perl-like syntax, as a long time Perl enthusiast? HSTORE was inspired by the syntax of Perl hash declarations, but it is not compatible. Notably, HSTORE the HSTORE can have a value `NULL`, while in Perl hashes it’s `undef`. So you cannot simply `eval` an HSTORE to get a Perl hash unless you are certain there are no NULLs. Besides, string eval in Perl is considered unsafe. Parsing is *much* safer. In any case, Perl has excellent support for JSON, just like every other language - you are at no particular advantage in Perl by having a format that happens to more closely resemble the format of Perl hashes and arrays. I really feel that we should concentrate our efforts on one standardized format here. It makes the effort to integrate your good work, in a way that makes it available to everyone so much easier. I agree. I like HSTORE, but now that JSON is so standard (in fact, as of this week, a *real* standard! http://rfc7159.net/rfc7159), and its support is so much better than that of HSTORE, including in Perl, I believe that it should be priority over HSTORE. I’m happy if HSTORE has the same functionality as JSONB, but given the choice, all other things being equal, as a Perl hacker I will always choose JSONB. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Thank you for checking that. Teodor's goal was that new-hstore be 100% backwards-compatible with old-hstore. If we're breaking APIs, then it That's true. Binary format is fully compatible unless old hstore value has more than 2^28 key-value pairs (256 mln which is far from reachable by memory requirements). The single issue is a GiST index, GIN index should be recreated to utilize new features. doesn't really work to force users to upgrade the type, no? Teodor, are these output changes things that can be made consistent, or do we need separate hstore and hstore2 datatypes? Introducing types in hstore causes this incompatibility - but I don't think that's huge or even big problem. In most cases application does quoting (sets 1 instead of just 1) to preserve SQL-injection and to protect hstore-forbidden characters in hstore. Keys leaves untouched - it could be only a string. That's possible to introduce GUC variable for i/o functions which will control old bug-to-bug behavior. IMHO, this is much better option that stopping hstore development or split hstore to two branches. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: That's possible to introduce GUC variable for i/o functions which will control old bug-to-bug behavior. IMHO, this is much better option that stopping hstore development or split hstore to two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean-break in having to support the legacy hstore disk format strengthen the case for doing so too. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. It's true for perl. Syntax of hstore is close to hash/array syntax and it's easy serialize/deserialize hstore to/from perl. Syntax of hstore was inspired by perl. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 1:32 AM, Teodor Sigaev teo...@sigaev.ru wrote: It's true for perl. Syntax of hstore is close to hash/array syntax and it's easy serialize/deserialize hstore to/from perl. Syntax of hstore was inspired by perl. I understand that. There is a module on CPAN called Pg::hstore that will do this; it appears to have been around since 2011. I don't use Perl, so I don't know a lot about it. Perhaps David Wheeler has an opinion on the value of Perl-like syntax, as a long time Perl enthusiast? In any case, Perl has excellent support for JSON, just like every other language - you are at no particular advantage in Perl by having a format that happens to more closely resemble the format of Perl hashes and arrays. I really feel that we should concentrate our efforts on one standardized format here. It makes the effort to integrate your good work, in a way that makes it available to everyone so much easier. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: That's possible to introduce GUC variable for i/o functions which will control old bug-to-bug behavior. IMHO, this is much better option that stopping hstore development or split hstore to two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean-break in having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not to do an implusive decision about hstore2. I agree, that jsonb has a lot of facilities, but don't forget, that json(b) has to follow standard and in that sense it's more constrained than hstore, which we could further develop to support some interesting features, which will never be implemented in json(b). Also, it'd be a bit awkward after working on nested hstore and declaring it on several conferences (Engine Yard has sponsored part of our hstore work), suddenly break people expectation and say, that our work has moved to core to provide json some very cool features, good bye, hstore users :( I'm afraid people will not understand us. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: That's possible to introduce GUC variable for i/o functions which will control old bug-to-bug behavior. IMHO, this is much better option that stopping hstore development or split hstore to two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean-break in having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not to do an implusive decision about hstore2. I agree, that jsonb has a lot of facilities, but don't forget, that json(b) has to follow standard and in that sense it's more constrained than hstore, which we could further develop to support some interesting features, which will never be implemented in json(b). Also, it'd be a bit awkward after working on nested hstore and declaring it on several conferences (Engine Yard has sponsored part of our hstore work), suddenly break people expectation and say, that our work has moved to core to provide json some very cool features, good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 09:33:18AM -0500, Andrew Dunstan wrote: I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. Whatever we do, rest assured your work won't go to waste. OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc. If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application- visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Le jeudi 6 mars 2014 09:33:18 Andrew Dunstan a écrit : On 03/06/2014 08:16 AM, Oleg Bartunov wrote: On Thu, Mar 6, 2014 at 12:43 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:23 AM, Teodor Sigaev teo...@sigaev.ru wrote: That's possible to introduce GUC variable for i/o functions which will control old bug-to-bug behavior. IMHO, this is much better option that stopping hstore development or split hstore to two branches. A GUC that controls i/o functions is generally considered to be an unacceptable hack. In what sense are we really stopping hstore development if hstore2 lives as jsonb? I have a hard time imagining someone dealing with the incompatibility that a user-facing hstore2 would introduce, while still preferring hstore syntax over json syntax given the choice. There are very rich facilities for manipulating json available in every programming language. The same is not true of hstore. Having looked at the issue today, I think that the amount of redundant code between a hstore2 in core as jsonb and hstore1 will be acceptable. The advantages of making a clean-break in having to support the legacy hstore disk format strengthen the case for doing so too. Heh, let's not to do an implusive decision about hstore2. I agree, that jsonb has a lot of facilities, but don't forget, that json(b) has to follow standard and in that sense it's more constrained than hstore, which we could further develop to support some interesting features, which will never be implemented in json(b). Also, it'd be a bit awkward after working on nested hstore and declaring it on several conferences (Engine Yard has sponsored part of our hstore work), suddenly break people expectation and say, that our work has moved to core to provide json some very cool features, good bye, hstore users :( I'm afraid people will not understand us. Oleg, I hear you, and largely agree, as long as the compatibility issue is solved. If it's not, I think inventing a new hstore2 type is probably a lousy way to go. For good or ill, the world has pretty much settled on wanting to use json for lightweight treeish data. That's where we'll get the most impact. Virtually every programming language (including Perl) has good support for json. I'm not sure what the constraints of json that you might want to break are. Perhaps you'd like to specify. I haven't followed the whole thread, but json is really restrictive on the supported types: a hierarchical hstore could maybe support more types (timestamp comes to mind) as its values, which is not a valid data type in the json spec. Whatever we do, rest assured your work won't go to waste. cheers andrew -- Ronan Dunklau http://dalibo.com - http://dalibo.org signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 4:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application- visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. (not read up on the full details of the thread, sorry if I'm re-iterating something) I think we definitely want/need to maintain hstore compatibility. A completely separate hstore2 type that's not backwards compatible makes very little sense. However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] jsonb and nested hstore
Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 10:46 AM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; I'd have said it was wasted effort, but if the work is already done then that's moot. Clearly there are people who want it, or else they would not have sponsored the work. We seem to have an emerging consensus on the compatibility issue. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 05:46 PM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: However, if the new hstore type (compatible with the old one) is the wrapper around jsonb, rather than the other way around, I don't see any problem with it at all. Most future users are almost certainly going to use the json interfaces, but we don't want to leave upgraded users behind. (But of course it has to actually maintain backwards compatibility for that argument to hold) Yeah --- all of this turns on whether hstore improvements can be 100% upwards compatible or not. If they are, I don't object to including them; There are reasons for *not* wanting the new hstore2 functionality. If you don't want nesting, for example, with the new type you're going to need to add a constraint to forbid that. Ugh. Many applications are happy with the current functionality, a simple string key/value dictionary, and for them the new features are not an improvement. As an analogy, adding significant new functionality like nesting to the existing hstore type is like suddenly adding the time of day to the date datatype. It might be useful in many cases. And an existing application can leave the hour and minute fields zero, so it's backwards-compatible. But as soon as someone inserts a datum that uses the hour and minute fields, it will confuse the application that doesn't know about that. I haven't been following these discussions closely, but for those reasons, I thought hstore2 was going to be a separate type. I don't think there are very many applications that would be interested in upgrading from the current hstore to the new hstore2 type. More likely, the new data type is useful for many applications that couldn't have used hstore before because it didn't support nesting or was too loosely typed. And old applications that are already using hstore are perfectly happy with the status quo. Let's not mess with the existing hstore datatype. For what it does, it works great. Likewise, jsonb is significantly different from hstore2, so it should be a separate data type. Frankly I don't understand what the problem is with doing that. I don't have a problem with copy-pasting the common parts. BTW, now that I look at the nested hstore patch, I'm disappointed to see that it only supports a few hardcoded datatypes. Call me naive, but somehow I thought it would support *all* PostgreSQL datatypes, built-in or user-defined. I realize that's a tall order, but that's what I thought it did. Since it doesn't, color me unimpressed. It's really not any better than json, I don't see why anyone would prefer it over json. Not that I particularly like json, but it's a format a lot of people are familiar with. So here my opinion on what we should do: 1. Forget about hstore2 2. Add GIN and GIST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to next release. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 11:28 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So here my opinion on what we should do: 1. Forget about hstore2 2. Add GIN and GIST operator classes to jsonb, if they're ready for commit pretty darn soon. If not, punt them to next release. For #2, would we maintain the hstore syntax for the searching operators. For example, SELECT count(*) FROM jsonb_schema WHERE tabledata @ 'columns = {{column_name=total_time}}'; Note the hstore-ish = in the searching operator. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. As for the old hstore I think it'd be nice to add gin_hstore_hash_ops, so hstore users will benefit from 9.4 release. There is no compatibiliy issue, so I think this could be harmless. Oleg On Thu, Mar 6, 2014 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: OK, just to summarize: JSONB and everything it shares with hstore will be in core hstore-specific code stays in contrib hstore contrib will create an hstore type to call contrib and core code 9.4 hstore has some differences from pre-9.4 I've got a problem with the last part of that. AFAICS, the value proposition for hstore2 largely fails if it's not 100% upward compatible with existing hstore, both as to on-disk storage and as to application- visible behavior. If you've got to adapt your application anyway, why not switch to JSONB which is going to offer a lot of benefits in terms of available code you can work with? Although I've not looked at the patch, it was claimed upthread that there were changes in the I/O format for existing test cases, for example. IMO, that's an absolute dead no-go. The question is whether we change/improve hstore in 9.4, or create an hstore2 that is the improved hstore for 9.4 and keep hstore identical to pre-9.4. That last option looks an awful like the dreaded VARCHAR2. I think hstore2 as a separate type isn't likely to be a win either. The bottom line here is that hstore2 is more or less what we'd agreed to doing back at the last PGCon, but that decision has now been obsoleted by events in the JSON area. If jsonb gets in, I think we probably end up rejecting hstore2 as such. Or at least, that's what we should do IMO. contrib/hstore is now a legacy type and we shouldn't be putting additional work into it, especially not work that breaks backwards compatibility. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
I meant in Release Notes for 9.4 On Thu, Mar 6, 2014 at 10:26 PM, Andrew Dunstan and...@dunslane.net wrote: On 03/06/2014 12:50 PM, Oleg Bartunov wrote: Hi there, Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't normally credit sponsors in commits, but if I'm doing the commit I promise you guys would certainly get major credit as authors. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/06/2014 07:00 AM, Bruce Momjian wrote: What can we do to help people migrate to an hstore type that supports data types? Is there a function we can give them to flag possible problem data, or give them some function to format things the old way for migrations, etc. If they are going to have to rewrite all their old data, why bother with a backward-compatible binary format? Is it only the client applications that will need to be changed? How would we instruct users on the necessary changes? So, from what I've been able to check: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hstore2 supports the same limited data types as JSON does, and not any additional ones. This makes an hstore2 of dubious value unless the compatibility issues can be solved conclusively. Is that all correct? Have I missed something? On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 10:54 PM, Josh Berkus j...@agliodbs.com wrote: g? On 03/06/2014 09:50 AM, Oleg Bartunov wrote: Looks like consensus is done. I and Teodor are not happy with it, but what we can do :) One thing I want to do is to reserve our contribution to the flagship feature (jsonb), particularly, binary storage for nested structures and indexing. Their work was sponsored by Engine Yard. We don't generally credit companies in the release notes, since if we started, where would we stop? However, we *do* credit them in the press release, and I'll make a note of the EY sponsorship, especially since it's also good PR. I think press release is fine. We waited a long time for sponsorship of our work and EY help was crucial. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus j...@agliodbs.com wrote: The actual storage upgrade of hstore--hstore2 is fairly painless from the user perspective; they don't have to do anything. The problem is that the input/output strings are different, something which I didn't think to check for (and Peter did), and which will break applications relying on Hstore, since the drivers which support Hstore (like psycopg2) rely on string-parsing to convert it. I haven't regression-tested hstore2 against psycopg2 since I don't have a good test, but that would be a useful thing to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. FYI I have a trigger that highlights me the -hackers messages mentioning psycopg, so just mentioning it is enough for me to take a better look. But if you want a more active collaboration just ask. Thank you, -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 9:10 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo daniele.varra...@gmail.com wrote: Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. I don't think that'll be necessary. Any break in compatibility in the hstore format has been ruled a non-starter for having hstore support nested data structures. I believe on balance we're content to let hstore continue to be hstore. jsonb support would certainly be interesting, though. Cool, just let me know what you would expect a well-behaved client library to behave. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers