Re: [PERFORM] array size exceeds the maximum allowed (1073741823) when building a json
On Wed, Jun 8, 2016 at 1:04 AM, Tom Lane wrote:
> Michael Paquier writes:
>> On Tue, Jun 7, 2016 at 10:03 PM, Josh Berkus wrote:
>>> On 06/07/2016 08:42 AM, Nicolas Paris wrote:
>>>> Is this 1GB restriction supposed to increase in the near future?
>>>
>>> Not planned, no. Thing is, that's the limit for a field in general, not
>>> just JSON; changing it would be a fairly large patch. It's desirable,
>>> but AFAIK nobody is working on it.
>>
>> And there are other things to consider on top of that, like the
>> maximum allocation size for palloc, the maximum query string size,
>> COPY, etc. This is no small project, and the potential side-effects
>> should not be underestimated.
>
> It's also fair to doubt that client-side code would "just work" with
> no functionality or performance problems for such large values.
>
> I await with interest the OP's results on other JSON processors that
> have no issues with GB-sized JSON strings.

Yup. Most json libraries and tools are going to be disgusting memory
hogs or have exponential behaviors, especially when you consider you
are doing the transformation as well. Just prettifying json documents
over 1GB can be a real challenge.

Fortunately the workaround here is pretty easy. Keep your query exactly
as is, but remove the final aggregation step so that it returns a set.
Next, make a small application that runs this query and does the array
bits around each row (basically prepending the final result with [,
appending the final result with ], and putting , between rows).

It's essential that you use a client library that does not buffer the
entire result in memory before emitting results. This can be done in
psql (FETCH mode), java, libpq (single row mode), etc. I suspect the
node.js pg module can do this as well, and there certainly will be
others. The basic objective is that you want the rows to be streamed
out of the database without being buffered. If you do that, you should
be able to stream arbitrarily large datasets out of the database to a
json document, assuming the server can produce the query.

merlin
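A minimal sketch of the streaming approach described above, using a psql
cursor (FETCH mode). The table name and the row_to_json() wrapping below are
placeholders, not from this thread, and the client is still responsible for
writing '[' once at the start, ',' between rows, and ']' at the end:

    -- Hypothetical example: big_table is a stand-in for the real source query.
    BEGIN;

    DECLARE json_rows NO SCROLL CURSOR FOR
        SELECT row_to_json(t) FROM big_table t;  -- one JSON object per row, no json_agg()

    FETCH FORWARD 1000 FROM json_rows;  -- repeat until no rows come back,
                                        -- writing the array punctuation
                                        -- around each batch on the client
    CLOSE json_rows;
    COMMIT;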
Re: [PERFORM] array size exceeds the maximum allowed (1073741823) when building a json
2016-06-09 15:31 GMT+02:00 Merlin Moncure :
> On Wed, Jun 8, 2016 at 1:04 AM, Tom Lane wrote:
> > Michael Paquier writes:
> >> On Tue, Jun 7, 2016 at 10:03 PM, Josh Berkus wrote:
> >>> On 06/07/2016 08:42 AM, Nicolas Paris wrote:
> >>>> Is this 1GB restriction supposed to increase in the near future?
> >>>
> >>> Not planned, no. Thing is, that's the limit for a field in general, not
> >>> just JSON; changing it would be a fairly large patch. It's desirable,
> >>> but AFAIK nobody is working on it.
> >>
> >> And there are other things to consider on top of that, like the
> >> maximum allocation size for palloc, the maximum query string size,
> >> COPY, etc. This is no small project, and the potential side-effects
> >> should not be underestimated.
> >
> > It's also fair to doubt that client-side code would "just work" with
> > no functionality or performance problems for such large values.
> >
> > I await with interest the OP's results on other JSON processors that
> > have no issues with GB-sized JSON strings.
>
> Yup. Most json libraries and tools are going to be disgusting memory
> hogs or have exponential behaviors, especially when you consider you
> are doing the transformation as well. Just prettifying json documents
> over 1GB can be a real challenge.
>
> Fortunately the workaround here is pretty easy. Keep your query exactly
> as is, but remove the final aggregation step so that it returns a set.
> Next, make a small application that runs this query and does the array
> bits around each row (basically prepending the final result with [,
> appending the final result with ], and putting , between rows).

The problem is that prepending/appending means dealing with strings.
Transforming each value of the result set into a string implies escaping
the double quotes. Then:
row1 contains {"hello":"world"}
step 1 = prepend -> "[{\"hello\":\"world\"}"
step 2 = append -> "[{\"hello\":\"world\"},"
and so on; the json is corrupted. Hopefully I am just wrong about that.

> It's essential that you use a client library that does not buffer the
> entire result in memory before emitting results. This can be done in
> psql (FETCH mode), java, libpq (single row mode), etc. I suspect the
> node.js pg module can do this as well, and there certainly will be
> others.
>
> The basic objective is that you want the rows to be streamed out of the
> database without being buffered. If you do that, you should be able
> to stream arbitrarily large datasets out of the database to a json
> document, assuming the server can produce the query.
>
> merlin
Re: [PERFORM] slony rpm help slony1-95-2.2.2-1.rhel6.x86_64
> From: avi Singh
> To: pgsql-performance@postgresql.org
> Sent: Saturday, 4 June 2016, 0:03
> Subject: [PERFORM] slony rpm help slony1-95-2.2.2-1.rhel6.x86_64
>
> Hi All
>     Can anyone please point me to a location from where I can get the
> slony1-95-2.2.2-1.rhel5.x86_64 rpm. I'm upgrading a database from version
> 9.3 to 9.5. The current rpm version we are using is
> slony1-93-2.2.2-1.el5.x86_64, and the one that is available on the
> postgresql website for 9.5 is slony1-95-2.2.4-4.rhel5.x86_64, which is not
> compatible and throws an error when I test the upgrade. In the past I was
> able to find the 2.2.2-1 version rpm for previous versions on the postgres
> website, but not this time for postgresql 9.5.

What you'd be better off doing is installing Slony 2.2.4 on all your
servers (or better, 2.2.5) rather than trying to get the older version.
If you can't get a package you could compile Slony yourself.

The "not compatible" error you mention is most likely because you've
failed to update the Slony functions. See:
http://slony.info/documentation/2.2/stmtupdatefunctions.html

Glyn
Re: [PERFORM] Performance of LIKE/NOT LIKE when used in single query
On Tue, Jun 7, 2016 at 9:57 PM, Ed Felstein wrote:
> Hello,
> First time poster here. Bear with me.
> Using PostgreSQL 9.5
> I have a situation where I have a LIKE and a NOT LIKE in the same query to
> identify strings in a varchar field. Since I am using wildcards, I have
> created a GIN index on the field in question, which makes LIKE '%%'
> searches run very fast. The problem is the NOT LIKE phrases, which (as
> would be expected) force a sequential scan. Being that we're talking about
> millions of records, this is not desirable.
> Here's the question...
> Is there a way, using a single query, to emulate the process of running the
> LIKE part first, then running the NOT LIKE just on those results?

Just do it. In my hands, the planner is smart enough to figure it out
for itself.

explain analyze select * from stuff where synonym like '%BAT%'
    and synonym not like '%col not like%';

                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on stuff  (cost=16.10..63.08 rows=13 width=14) (actual time=9.465..10.642 rows=23 loops=1)
   Recheck Cond: (synonym ~~ '%BAT%'::text)
   Rows Removed by Index Recheck: 76
   Filter: (synonym !~~ '%col not like%'::text)
   Heap Blocks: exact=57
   ->  Bitmap Index Scan on integrity_synonym_synonym_idx  (cost=0.00..16.10 rows=13 width=0) (actual time=8.847..8.847 rows=99 loops=1)
         Index Cond: (synonym ~~ '%BAT%'::text)
 Planning time: 18.261 ms
 Execution time: 10.932 ms

So it is using the index for the positive match, and filtering those
results for the negative match, just as you wanted.

Cheers,

Jeff
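For reference, the GIN index mentioned above is typically a trigram index.
A minimal sketch, assuming the pg_trgm extension and reusing the table and
column names from the plan (the index name here is a placeholder):

    -- Assumes pg_trgm; the index name is hypothetical.
    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    CREATE INDEX stuff_synonym_trgm_idx
        ON stuff USING gin (synonym gin_trgm_ops);

    -- With this in place the planner can use the index for the positive
    -- LIKE pattern and apply the NOT LIKE as a filter on the matching rows,
    -- as the plan above shows.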
Re: [PERFORM] array size exceeds the maximum allowed (1073741823) when building a json
On Thu, Jun 9, 2016 at 8:43 AM, Nicolas Paris wrote:
>
> 2016-06-09 15:31 GMT+02:00 Merlin Moncure :
>>
>> On Wed, Jun 8, 2016 at 1:04 AM, Tom Lane wrote:
>> > Michael Paquier writes:
>> >> On Tue, Jun 7, 2016 at 10:03 PM, Josh Berkus wrote:
>> >>> On 06/07/2016 08:42 AM, Nicolas Paris wrote:
>> >>>> Is this 1GB restriction supposed to increase in the near future?
>> >>>
>> >>> Not planned, no. Thing is, that's the limit for a field in general,
>> >>> not just JSON; changing it would be a fairly large patch. It's
>> >>> desirable, but AFAIK nobody is working on it.
>> >
>> >> And there are other things to consider on top of that, like the
>> >> maximum allocation size for palloc, the maximum query string size,
>> >> COPY, etc. This is no small project, and the potential side-effects
>> >> should not be underestimated.
>> >
>> > It's also fair to doubt that client-side code would "just work" with
>> > no functionality or performance problems for such large values.
>> >
>> > I await with interest the OP's results on other JSON processors that
>> > have no issues with GB-sized JSON strings.
>>
>> Yup. Most json libraries and tools are going to be disgusting memory
>> hogs or have exponential behaviors, especially when you consider you
>> are doing the transformation as well. Just prettifying json documents
>> over 1GB can be a real challenge.
>>
>> Fortunately the workaround here is pretty easy. Keep your query exactly
>> as is, but remove the final aggregation step so that it returns a set.
>> Next, make a small application that runs this query and does the array
>> bits around each row (basically prepending the final result with [,
>> appending the final result with ], and putting , between rows).
>
> The problem is that prepending/appending means dealing with strings.
> Transforming each value of the result set into a string implies escaping
> the double quotes. Then:
> row1 contains {"hello":"world"}
> step 1 = prepend -> "[{\"hello\":\"world\"}"
> step 2 = append -> "[{\"hello\":\"world\"},"

Right. 3 rows contain {"hello":"world"}

before iteration: emit '['
before every row except the first, prepend ','
after iteration: emit ']'

you end up with:
[{"hello":"world"}
,{"hello":"world"}
,{"hello":"world"}]

...which is 100% valid json as long as each row of the set is a json
object.

in SQL, the technique is like this:

select ('[' || string_agg(j::text, ',') || ']')::json
from (select json_build_object('hello', 'world') j from generate_series(1,3)) q;

the difference is, instead of having the database do the string_agg
step, it's handled on the client during iteration over the output of
generate_series.

merlin
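The set-returning variant Merlin refers to, i.e. what the client would
iterate over, might look like this (a sketch; the row output shown in the
comments is approximate):

    -- Same example with the aggregation removed: the client adds the
    -- '[' / ',' / ']' itself while looping over the rows.
    select json_build_object('hello', 'world') as j
    from generate_series(1, 3);
    -- returns three rows, each one a complete json object, roughly:
    --   {"hello" : "world"}
    --   {"hello" : "world"}
    --   {"hello" : "world"}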
Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
Hi,

On 2016-06-02 14:18:26 +0300, Антон Бушмелев wrote:
> UP. repeat tests on local vm.. results are discouraging
>
> OS        PG     TPS         AVG latency
> Centos 7  9.5.3   23.711023  168.421
> Centos 7  9.5.3   26.609271  150.188
> Centos 7  9.5.3   25.220044  158.416
> Centos 7  9.5.3   25.598977  156.047
> Centos 7  9.4.8  278.572191   14.077
> Centos 7  9.4.8  247.237755   16.177
> Centos 7  9.4.8  240.007524   16.276
> Centos 7  9.4.8  237.862238   16.596

Could you provide profiles on 9.4 and 9.5? Which kernel did you have
enabled? Is /proc/sys/kernel/sched_autogroup_enabled 1 or 0?

Regards,

Andres