Re: [HACKERS] FDW and parallel execution
hello ...

did you check out Antonin Houska's patches? we basically already have code which can do that.

many thanks,

hans

On 04/02/2017 03:30 PM, Konstantin Knizhnik wrote:
> Hi hackers and personally Robert (you are the best expert in both areas).
> I want to ask one more question concerning parallel execution and FDW.
> Below are two plans for the same query (TPC-H Q5): one for normal tables, another for FDW to a vertical representation of the same data.
> The FDW supports the analyze function and is expected to produce statistics similar to those for the original tables.
>
> Query plans are the following:
>
> Normal tables:
>
> Sort  (cost=2041588.48..2041588.98 rows=200 width=48)
>   Sort Key: (sum((lineitem.l_extendedprice * ('1'::double precision - lineitem.l_discount)))) DESC
>   ->  Finalize GroupAggregate  (cost=2041335.76..2041580.83 rows=200 width=48)
>         Group Key: nation.n_name
>         ->  Gather Merge  (cost=2041335.76..2041568.33 rows=1400 width=48)
>               Workers Planned: 7
>               ->  Partial GroupAggregate  (cost=2040335.64..2040396.71 rows=200 width=48)
>                     Group Key: nation.n_name
>                     ->  Sort  (cost=2040335.64..2040345.49 rows=3938 width=40)
>                           Sort Key: nation.n_name
>                           ->  Hash Join  (cost=605052.97..2040100.48 rows=3938 width=40)
>                                 Hash Cond: ((orders.o_custkey = customer.c_custkey) AND (nation.n_nationkey = customer.c_nationkey))
>                                 ->  Hash Join  (cost=525126.37..1951647.85 rows=98414 width=52)
>                                       Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
>                                       ->  Hash Join  (cost=3802.22..1404473.37 rows=654240 width=52)
>                                             Hash Cond: (lineitem.l_suppkey = supplier.s_suppkey)
>                                             ->  Parallel Seq Scan on lineitem  (cost=0.00..1361993.36 rows=8569436 width=16)
>                                             ->  Hash  (cost=3705.97..3705.97 rows=7700 width=44)
>                                                   ->  Hash Join  (cost=40.97..3705.97 rows=7700 width=44)
>                                                         Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
>                                                         ->  Seq Scan on supplier  (cost=0.00..3090.00 rows=10 width=8)
>                                                         ->  Hash  (cost=40.79..40.79 rows=15 width=36)
>                                                               ->  Hash Join  (cost=20.05..40.79 rows=15 width=36)
>                                                                     Hash Cond: (nation.n_regionkey = region.r_regionkey)
>                                                                     ->  Seq Scan on nation  (cost=0.00..17.70 rows=770 width=40)
>                                                                     ->  Hash  (cost=20.00..20.00 rows=4 width=4)
>                                                                           ->  Seq Scan on region  (cost=0.00..20.00 rows=4 width=4)
>                                                                                 Filter: ((r_name)::text = 'ASIA'::text)
>                                       ->  Hash  (cost=484302.37..484302.37 rows=2256542 width=8)
>                                             ->  Seq Scan on orders  (cost=0.00..484302.37 rows=2256542 width=8)
>                                                   Filter: ((o_orderdate >= '1996-01-01'::date) AND (o_orderdate < '1997-01-01'::date))
>                                 ->  Hash  (cost=51569.64..51569.64 rows=1499864 width=8)
>                                       ->  Seq Scan on customer  (cost=0.00..51569.64 rows=1499864 width=8)
>
>
> Plan with FDW:
>
> Sort  (cost=2337312.28..2337312.78 rows=200 width=48)
>   Sort Key: (sum((lineitem_fdw.l_extendedprice * ('1'::double precision - lineitem_fdw.l_discount)))) DESC
>   ->  GroupAggregate  (cost=2336881.54..2337304.64 rows=200 width=48)
>         Group Key: nation.n_name
>         ->  Sort  (cost=2336881.54..2336951.73 rows=28073 width=40)
>               Sort Key: nation.n_name
>               ->  Hash Join  (cost=396050.65..2334807.39 rows=28073 width=40)
>                     Hash Cond: ((orders_fdw.o_custkey = customer_fdw.c_custkey) AND (nation.n_nationkey = customer_fdw.c_nationkey))
>                     ->  Hash Join  (cost=335084.53..2247223.46 rows=701672 width=52)
>                           Hash Cond: (lineitem_fdw.l_orderkey = orders_fdw.o_orderkey)
>                           ->  Hash Join  (cost=2887.07..1786058.18 rows=4607421 width=52)
>                                 Hash Cond: (lineitem_fdw.l_suppkey = supplier_fdw.s_suppkey)
Re: [HACKERS] remove checkpoint_warning
On 07/09/2016 11:12 PM, Tom Lane wrote:
> Alvaro Herrera writes:
>> the checkpoint_warning feature was added by commit 2986aa6a668bce3cfb836 in November 2002 when we didn't have any logging of checkpointing at all. I propose to remove it: surely anyone who cares about analyzing checkpointing behavior nowadays is using the log_checkpoint feature instead, which contains much more detail. The other one is just noise now, and probably ignored amidst the number of other warning traffic.
>
> Hmm, not sure. ISTM log_checkpoint is oriented to people who know what they are doing, whereas checkpoint_warning is more targeted to trying to help people who don't. Perhaps you could make an argument that checkpoint_warning is useless because the people whom it's meant to help won't notice the warning anyway --- but I doubt that it's been "superseded" by log_checkpoint, because the latter would only be enabled by people who already have a clue that checkpoint performance is something to worry about.
>
> Or in short, this may be a fine change to make, but I don't like your argument for it.
>
> regards, tom lane

i think tom is right here. log_checkpoints and checkpoint_warning are for totally different people.

there is just one thing we might want to do: state explicitly that the database cannot break down if this warning shows up. many people are scared to death that this warning somehow indicates that PostgreSQL is about to go up in flames, which is of course not true. maybe we could phrase it as "consider increasing to ensure good performance" or so ...

regards,

hans

--
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
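[Editorial note: to make the two settings being compared concrete, here is a minimal sketch of how each is switched on (run as a superuser); checkpoint_warning only emits a hint when checkpoints are triggered faster than the given interval, while log_checkpoints logs every checkpoint in detail.]

    ALTER SYSTEM SET log_checkpoints = on;        -- detailed log line for every checkpoint
    ALTER SYSTEM SET checkpoint_warning = '30s';  -- warn only if checkpoints come faster than this
    SELECT pg_reload_conf();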
Re: [HACKERS] WIP: Data at rest encryption
On 06/14/2016 09:59 PM, Jim Nasby wrote:
> On 6/12/16 2:13 AM, Ants Aasma wrote:
>> On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote:
>>> 1. Instead of doing the entire database files encryption, how about providing user an option to protect only some particular tables that wants the encryption at table/tablespace level. This not only provides an option to the user, it reduces the performance impact on tables that doesn't need any encryption. The problem with this approach is that every xlog record needs to validate to handle the encryption/decryption, instead of at page level.
>>
>> Is there a real need for this? The customers I have talked to want to encrypt the whole database and my goal is to make the feature fast enough to make that feasible for pretty much everyone. I guess switching encryption off per table would be feasible, but the key setup would still need to be done at server startup. Per record encryption would result in some additional information leakage though. Overall I thought it would not be worth it, but I'm willing to have my mind changed on this.
>
> I actually design with this in mind. Tables that contain sensitive info go into designated schemas, partly so that you can blanket move all of those to an encrypted tablespace (or safer would be to move things not in those schemas to an unencrypted tablespace). Since that can be done with an encrypted filesystem maybe that's good enough.
>
> (It's not really clear to me what this buys us over an encrypted FS, other than a feature comparison checkmark...)

the reason why this is needed is actually very simple: security guidelines and legal requirements ...

we have dealt with a couple of companies recently who explicitly demanded PostgreSQL-level encryption done in a transparent way to fulfill internal or legal requirements. this is especially true for financial stuff.

and yes, sure ... you can do a lot with filesystem encryption. the core idea of this entire thing is, however, to have a counterpart on the database level: if you don't have the key you cannot start the instance, and if you happen to get access to the filesystem you are still not able to fire up the DB. as I said: these are requirements of ever bigger companies.

as far as benchmarking is concerned: i did a quick test yesterday (not with the final AES implementation yet) and i got pretty good results. with a reasonably well cached database in a typical application I expect to lose around 10-20%. if everything fits in memory there is 0 loss, of course. the worst I got with standard AES (no hardware support used yet) was a loss of around 45% or so - but that requires a value as low as 32 MB of shared_buffers.

many thanks,

hans

--
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
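[Editorial note: for illustration, the schema/tablespace pattern Jim describes could look roughly like the sketch below. The object names and the mount point are made up; the tablespace directory is assumed to sit on an encrypted filesystem.]

    -- sensitive tables live in a dedicated schema and get moved wholesale
    CREATE TABLESPACE enc_space LOCATION '/mnt/encrypted/pg_tblspc';
    ALTER TABLE sensitive.customer_cards SET TABLESPACE enc_space;
    ALTER TABLE sensitive.salaries SET TABLESPACE enc_space;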
Re: [HACKERS] Proposal: Implement failover on libpq connect level.
> On 18 Aug 2015, at 11:19, Albe Laurenz wrote:
>
> Hans-Jürgen Schönig wrote:
>> in addition to that you have the “problem” of transactions. if you failover in the middle of a transaction, strange things might happen from the application point of view.
>>
>> the good thing, however, is that stupid middleware is sometimes not able to handle failed connections. however, overall i think it is more of a danger than a benefit.
>
> Maybe I misunderstood the original proposal, but my impression was that the alternative servers would be tried only at the time the connection is established, and there would be no such problems as you describe.

it would still leave the problem of having a read-only server on the other side unless you are using BDR or something similar.

regards,

hans

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Implement failover on libpq connect level.
> On 18 Aug 2015, at 10:32, Albe Laurenz wrote: > > Victor Wagner wrote: >> Rationale >> = >> >> Since introduction of the WAL-based replication into the PostgreSQL, it is >> possible to create high-availability and load-balancing clusters. >> >> However, there is no support for failover in the client libraries. So, only >> way to provide transparent for client application failover is IP address >> migration. This approach has some limitation, i.e. it requires that >> master and backup servers reside in the same subnet or may not be >> feasible for other reasons. >> >> Commercial RDBMS, such as Oracle, employ more flexible approach. They >> allow to specify multiple servers in the connect string, so if primary >> server is not available, client library tries to connect to other ones. >> >> This approach allows to use geographically distributed failover clusters >> and also is a cheap way to implement load-balancing (which is not >> possible with IP address migration). > > I wonder how useful this is at the present time. > > If the primary goes down and the client gets connected to the standby, > it would have read-only access there. Most applications wouldn't cope > well with that. > > Once we have multi-master replication that can be used for fail-over, > the picture will change. Then a feature like that would be very useful > indeed. > >>"host=main-server host=standby1 host=standby2 port=5432 dbname=database" > > It seems a bit arbitrary to require that all servers use the same port. > > Maybe parameters like host2, port2, host3, port3 etc. might be better. > > Yours, > Laurenz Albe i totally agree with laurenz. in addition to that you have the “problem” of transactions. if you failover in the middle of a transaction, strange things might happen from the application point of view. the good thing, however, is that stupid middleware is sometimes not able to handle failed connections. however, overall i think it is more of a danger than a benefit. regards, hans > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Small GIN optimizations (after 9.4)
i think there is one more thing which would be really good to have in GIN and which would solve a ton of issues.

atm GIN entries are sorted by item pointer. if we could sort them by a "column" instead, it would fix a couple of real-world issues such as ...

    SELECT ... FROM foo WHERE "tsearch_query" ORDER BY price DESC LIMIT 10

... or so. in many cases you want to search for, say, a product and find the cheapest / most expensive one. if the tsearch_query yields a high number of rows (which it often does) the subsequent sort will kill you.

many thanks,

hans

On Feb 6, 2014, at 12:36 PM, Heikki Linnakangas wrote:

> While hacking on the GIN patches, I've come up with a few different ideas for improving performance. It's too late for 9.4, but I'll list them here if someone wants to work on them later:
>
> * Represent ItemPointers as uint64's, to speed up comparisons. ginCompareItemPointers is inlined into only a few instructions, but it's still more expensive than a single-instruction 64-bit comparison. ginCompareItemPointers is called very heavily in a GIN scan, so even a small improvement there would make for a noticeable speedup. It might be an improvement in code clarity, too.
>
> * Keep the entry streams of a GinScanKey in a binary heap, to quickly find the minimum curItem among them.
>
> I did this in various versions of the fast scan patch, but then I realized that the straightforward way of doing it is wrong, because a single GinScanEntry can be part of multiple GinScanKeys. If an entry's curItem is updated as part of advancing one key, and the entry is in a heap of another key, updating the curItem can violate the heap property of the other entry's heap.
>
> * Build a truth table (or cache) of consistent-function's results, and use that instead of calling consistent for every item.
>
> * Deduce AND or OR logic from the consistent function. Or have the opclass provide a tree of AND/OR/NOT nodes directly, instead of a consistent function. For example, if the query is "foo & bar", we could avoid calling consistent function altogether, and only return items that match both.
>
> * Delay decoding segments during a scan. Currently, we decode all segments of a posting tree page into a single array at once. But with "fast scan", we might be able to skip over all entries in some of the segments. So it would be better to copy the segments into backend-private memory in compressed format, and decode them one segment at a time (or maybe even one item at a time), when needed. That would avoid the unnecessary decoding of segments that can be skipped over, and would also reduce memory usage of a scan.
>
> I'll add these to the TODO.
>
> - Heikki
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
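[Editorial note: a concrete version of the query pattern discussed above, using a hypothetical product table. The GIN index can find the matching rows quickly, but because entries are ordered by item pointer, all matches have to be fetched and sorted before the LIMIT can be applied.]

    SELECT id, title, price
    FROM products
    WHERE to_tsvector('english', title) @@ to_tsquery('english', 'iphone')
    ORDER BY price DESC
    LIMIT 10;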
Re: [HACKERS] Backup throttling
On Aug 21, 2013, at 10:57 AM, Andres Freund wrote: > On 2013-08-21 08:10:42 +0200, PostgreSQL - Hans-Jürgen Schönig wrote: >> >> On Aug 19, 2013, at 9:11 PM, Andres Freund wrote: >> >>> On 2013-08-19 20:15:51 +0200, Boszormenyi Zoltan wrote: >>>> 2013-08-19 19:20 keltezéssel, Andres Freund írta: >>>>> Hi, >>>>> >>>>> On 2013-07-24 09:20:52 +0200, Antonin Houska wrote: >>>>>> Hello, >>>>>> the purpose of this patch is to limit impact of pg_backup on running >>>>>> server. >>>>>> Feedback is appreciated. >>>>> Based on a quick look it seems like you're throttling on the receiving >>>>> side. Is that a good idea? Especially over longer latency links, TCP >>>>> buffering will reduce the effect on the sender side considerably. >>> >>>> Throttling on the sender side requires extending the syntax of >>>> BASE_BACKUP and maybe START_REPLICATION so both can be >>>> throttled but throttling is still initiated by the receiver side. >>> >>> Seems fine to me. Under the premise that the idea is decided to be >>> worthwile to be integrated. Which I am not yet convinced of. >> >> i think there is a lot of value for this one. the scenario we had a couple >> of times is pretty simple: >> just assume a weak server - maybe just one disk or two - and a slave. >> master and slave are connected via a 1 GB network. >> pg_basebackup will fetch data full speed basically putting those lonely >> disks out of business. >> we actually had a case where a client asked if "PostgreSQL is locked during >> base backup". of >> course it was just disk wait caused by a full speed pg_basebackup. > >> regarding the client side implementation: we have chosen this way because it >> is less invasive. >> i cannot see a reason to do this on the server side because we won't have 10 >> pg_basebackup-style tools making use of this feature anyway. > > The problem is that receiver side throttling over TCP doesn't always > work all that nicely unless you have a low rate of transfer and/or very > low latency . Quite often you will have OS buffers/the TCP Window being > filled in bursts where the sender sends at max capacity and then a > period where nothing happens on the sender. That's often not what you > want when you need to throttle. > > Besides, I can see some value in e.g. normal streaming replication also > being rate limited... > what would be a reasonable scenario where limiting streaming would make sense? i cannot think of any to be honest. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Backup throttling
On Aug 19, 2013, at 9:11 PM, Andres Freund wrote: > On 2013-08-19 20:15:51 +0200, Boszormenyi Zoltan wrote: >> 2013-08-19 19:20 keltezéssel, Andres Freund írta: >>> Hi, >>> >>> On 2013-07-24 09:20:52 +0200, Antonin Houska wrote: Hello, the purpose of this patch is to limit impact of pg_backup on running server. Feedback is appreciated. >>> Based on a quick look it seems like you're throttling on the receiving >>> side. Is that a good idea? Especially over longer latency links, TCP >>> buffering will reduce the effect on the sender side considerably. > >> Throttling on the sender side requires extending the syntax of >> BASE_BACKUP and maybe START_REPLICATION so both can be >> throttled but throttling is still initiated by the receiver side. > > Seems fine to me. Under the premise that the idea is decided to be > worthwile to be integrated. Which I am not yet convinced of. i think there is a lot of value for this one. the scenario we had a couple of times is pretty simple: just assume a weak server - maybe just one disk or two - and a slave. master and slave are connected via a 1 GB network. pg_basebackup will fetch data full speed basically putting those lonely disks out of business. we actually had a case where a client asked if "PostgreSQL is locked during base backup". of course it was just disk wait caused by a full speed pg_basebackup. regarding the client side implementation: we have chosen this way because it is less invasive. i cannot see a reason to do this on the server side because we won't have 10 pg_basebackup-style tools making use of this feature anyway. of course, if you got 20 disk and a 1 gbit network this is useless - but many people don't have that. regards, hans-jürgen schönig -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Combine non-recursive and recursive CTEs?
On Jun 16, 2012, at 8:27 AM, Magnus Hagander wrote:

> I'm not sure if this is something I don't know how to do, or if it's something we simply can't do, or if it's something we could do but the syntax can't handle :-)
>
> Basically, I'd like to combine a recursive and a non-recursive CTE in the same query. If I do it non-recursive, I can do something like:
>
> WITH t1(z) AS (
>    SELECT a FROM x
> ),
> t2 AS (
>    SELECT z FROM t1
> )
> SELECT * FROM t2;
>
>
> But what if I want t2 to be recursive?
>
> Trying something like:
> WITH t1 (z,b) AS (
>    SELECT a,b FROM x
> ),
> RECURSIVE t2(z,b) AS (
>    SELECT z,b FROM t1 WHERE b IS NULL
>    UNION ALL
>    SELECT z,b FROM t2 INNER JOIN t1 ON t2.b=t1.z
> )
>
> I get a syntax error on the RECURSIVE.
>
> Is there any other position in this query that I can put the RECURSIVE in order for it to get through?
>
> --
> Magnus Hagander
> Me: http://www.hagander.net/
> Work: http://www.redpill-linpro.com/
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

hm, this is interesting ...

cat /tmp/a.sql
WITH y AS (
    SELECT 1 AS n),
g AS (WITH RECURSIVE x(n) AS (
    SELECT (SELECT n FROM y) AS n
    UNION ALL
    SELECT n + 1 AS n
    FROM x
    WHERE n < 10))
SELECT * FROM g;

Hans-Jurgen-Schonigs-MacBook-Pro:sql hs$ psql test < /tmp/a.sql
ERROR:  syntax error at or near ")"
LINE 8: WHERE n < 10))

this gives a syntax error as well ... if my early morning brain is correct this should be a proper statement ...

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
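[Editorial note: for reference, the variant that does parse attaches RECURSIVE to the WITH keyword itself; it then covers the whole CTE list, so recursive and non-recursive CTEs can be mixed freely. A sketch based on Magnus's hypothetical table x:]

    WITH RECURSIVE t1(z, b) AS (
        SELECT a, b FROM x
    ),
    t2(z, b) AS (
        SELECT z, b FROM t1 WHERE b IS NULL
        UNION ALL
        SELECT t1.z, t1.b FROM t2 JOIN t1 ON t2.b = t1.z
    )
    SELECT * FROM t2;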
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On May 22, 2012, at 9:57 AM, Simon Riggs wrote: > On 22 May 2012 06:50, Tom Lane wrote: > >> Currently, the planner keeps paths that appear to win on the grounds of >> either cheapest startup cost or cheapest total cost. It suddenly struck >> me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor >> fast-start preference, etc) we could know a-priori that cheapest startup >> cost is not going to be interesting, and hence immediately discard any >> path that doesn't win on total cost. > > My experience is that most people don't provide a LIMIT explicitly > even when they know that's the desired behaviour. That's because > either they simply don't understand that SQL can return lots of rows, > or SQL knowledge isn't enough, or worse that people don't even know > that specifying it would alter query plans. > > Regrettably the current planning of LIMIT clauses causes more problems > so in many cases these have been explicitly removed from SQL by > developers that know how many rows they wish to see. > > I would have proposed a default-LIMIT parameter before now, but for > that last point. this sounds like a total disaster to me ... why in the world should we have a default LIMIT parameter? i guess if somebody is not able to use LIMIT he should better not touch the DB. we clearly cannot fix incompetence by adding parameters. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
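[Editorial note: to illustrate the distinction Tom draws between cheapest-startup-cost and cheapest-total-cost paths, a sketch with a hypothetical table and index: without a LIMIT only total cost matters, while a LIMIT makes a fast-start plan such as an ordered index scan attractive.]

    EXPLAIN SELECT * FROM big_table ORDER BY indexed_col;            -- cheapest total cost wins
    EXPLAIN SELECT * FROM big_table ORDER BY indexed_col LIMIT 10;   -- cheapest startup cost becomes relevant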
Re: [HACKERS] index-only scans
On Oct 7, 2011, at 8:47 PM, Joshua D. Drake wrote: > > On 10/07/2011 11:40 AM, Tom Lane wrote: >> Robert Haas writes: >>> Please find attached a patch implementing a basic version of >>> index-only scans. >> >> I'm making some progress with this, but I notice what seems like a >> missing feature: there needs to be a way to turn it off. Otherwise >> performance comparisons will be difficult to impossible. >> >> The most obvious solution is a planner control GUC, perhaps >> "enable_indexonlyscan". Anyone object, or want to bikeshed the name? > > enable_onlyindexscan > > I'm kidding. > > +1 on Tom's proposed name. +1 ... definitely an important thing to do. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
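[Editorial note: the intended use of the proposed switch for before/after performance comparisons would be something along these lines (sketch; foo is a hypothetical table with an index on id).]

    SET enable_indexonlyscan = off;
    EXPLAIN SELECT id FROM foo WHERE id = 10;   -- planner now has to cost the query without index-only scans
    RESET enable_indexonlyscan;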
Re: [HACKERS] What is known about PostgreSQL HP-UX support?
On Oct 10, 2011, at 4:21 AM, Alex Goncharov wrote: > [ Thanks all for the very productive discussion in the thread > "libpq, PQdescribePrepared -> PQftype, PQfmod, no PQnullable" > which I originated. Very useful. Now on something different. ] > > About two years ago, I had to research some PostgreSQL failures on > HP-UX on a lame PA-RISC box. Looking at the PostgreSQL source code > then, I got an impression that running PostgreSQL on HP-UX was an open > question -- HP-UX didn't seem like a seriously targeted platform. > > Was I wrong in my assessment? Does anybody have a good experience > running PostgreSQL on HP-UX? What version of both? PA-RISC? IA64? > > Thanks, hello, HPUX is not too common these days but it works like a charm. we have a couple of large IA64 servers running on HPUX at a major customer. things work without any problems. it compiled out of the box just like expected and we have not seen any failures or so for almost 2 years now. so, thumbs up ... nothing to be afraid of. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CUDA Sorting
On Sep 19, 2011, at 5:16 PM, Tom Lane wrote:

> Greg Stark writes:
>> That said, to help in the case I described you would have to implement the tapesort algorithm on the GPU as well.
>
> I think the real problem would be that we are seldom sorting just the key values. If you have to push the tuples through the GPU too, your savings are going to go up in smoke pretty quickly …

i would argue along a similar line.

to make GPU code fast it has to be pretty much tailored to do exactly one thing - otherwise you have no chance to get anywhere close to card bandwidth. if you look at two "similar" GPU codes which seem to do the same thing you might easily see that one is 10 times faster than the other - for bloody reasons such as memory alignment, memory transaction size or whatever.

this opens a bit of a problem: PostgreSQL sorting is so generic and so flexible that i would be really surprised if somebody could come up with a solution which really comes close to what the GPU can do. it would definitely be interesting to see a prototype, however.

btw, there is a handful of interesting talks / lectures about GPU programming provided by the university of chicago (just cannot find the link atm).

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks
On Sep 2, 2011, at 2:59 PM, Hannu Krosing wrote:

> On Fri, 2011-09-02 at 14:51 +0200, PostgreSQL - Hans-Jürgen Schönig wrote:
>> hello …
>>
>> the goal of the entire proxy thing is to make the right query go to the right node / nodes. we determine this by using a partitioning function and so on …
>> currently PL/proxy has only a handful of commands - one is RUN ON … which tells us where to put things.
>> assume you issue a select … some select will "fall out" on the target node. to restrict the data coming from the node you could add an additional constraint on the way …
>>
>> say:
>>     SELECT * FROM proxy_table WHERE a = 20;
>>
>> what you want to reach the node after a split is …
>>
>>     SELECT * FROM proxy_table WHERE a = 20 AND col = "filter the wrong half away"
>>
>> my idea is to add an additional command to the PL/proxy command set. it should call a function generating this additional filter. maybe somehow like that …
>>
>>     RUN ON hashtext($1)                                    -- this one already knows about the increased cluster
>>     GENERATE FILTER my_create_the_bloody_filter_func($1)   -- this one would "massage" the query going to the node.
>>
>> it would actually open the door for a lot of additional trickery. the function would tell the proxy what to append - and: this "what" would be under your full control.
>>
>> what do you think?
>
> Hmm, could work for simplest cases, but this has 2 main problems:
>
> 1) you need a full SQL parser to make this generally useful for plain SQL

i think everything beyond a simple case is pretty hard to achieve anyway. to me it looks pretty much impossible to solve this in a generic way without some insane amount of labor input - at least given the ideas that have come to me in the past.

and yes, functions are an issue. unless you have some sort of "virtually private database" thing it is close to impossible (unless you want to try some nightmare based on views / constraint exclusion on the partitions or so).

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
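[Editorial note: a concrete, purely hypothetical illustration of the kind of filter such a generator could append in the simple case - a cluster just split from two to four partitions, where each node keeps only the rows whose hashed key still maps to it.]

    -- query as sent by the application
    SELECT * FROM proxy_table WHERE a = 20;

    -- query as rewritten for partition 2 of the new 4-partition cluster
    -- (hashtext() being the same function used in the RUN ON clause)
    SELECT * FROM proxy_table
    WHERE a = 20
      AND hashtext(a::text) & 3 = 2;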
Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks
hello …

the goal of the entire proxy thing is to make the right query go to the right node / nodes. we determine this by using a partitioning function and so on …

currently PL/proxy has only a handful of commands - one is RUN ON … which tells us where to put things. assume you issue a select … some select will "fall out" on the target node. to restrict the data coming from the node you could add an additional constraint on the way …

say:

    SELECT * FROM proxy_table WHERE a = 20;

what you want to reach the node after a split is …

    SELECT * FROM proxy_table WHERE a = 20 AND col = "filter the wrong half away"

my idea is to add an additional command to the PL/proxy command set. it should call a function generating this additional filter. maybe somehow like that …

    RUN ON hashtext($1)                                    -- this one already knows about the increased cluster
    GENERATE FILTER my_create_the_bloody_filter_func($1)   -- this one would "massage" the query going to the node.

it would actually open the door for a lot of additional trickery. the function would tell the proxy what to append - and: this "what" would be under your full control.

what do you think? i got to think about it further but i can envision that this could be feasible ...

hans

On Sep 2, 2011, at 2:36 PM, Hannu Krosing wrote:

> On Fri, 2011-09-02 at 14:01 +0200, PostgreSQL - Hans-Jürgen Schönig wrote:
>> hello …
>>
>> i have been thinking about this issue for quite a while ...
>> given your idea i am not sure how this can work at all.
>>
>> consider:
>>     begin;
>>     insert 1
>>     insert 2
>>     commit
>>
>> assume this ends up in the same node,
>> now you split it into two …
>> 1 and 2 will have exactly the same visibility to any transaction.
>> i am not sure how you can get this right without looking at the data.
>
> It has to consider the data when determining visibility, that's the whole point of the plug-in.
>
> The idea is, that each row "belongs" to a certain partition, as determined by some function over its fields. Most often this function is hash of primary key OR-ed by a bitmap representing cluster size and AND-ed by bitmap for partition(s) stored in this database.
>
> when you split the partition, then some rows don't belong in the old partition database anymore (and if you did a full copy, then the other half don't belong to the new one), so they should be handled as invisible / deleted. As this can be only done by looking at the tuple data, this needs an additional visibility function. And as this is only needed for partitioned databases, it makes sense to implement it as a plugin, so it would not waste cycles on non-partitioned databases.
>
>> alternative idea: what if the proxy would add / generate a filter by looking at the data?
>> a quick idea would be that once you split you add a simple directive such as "FILTER GENERATOR $1" or so to the PL/proxy code.
>> it would then behind the scenes arrange the filter passed on.
>> what do you think?
>
> Hmm. I'm not sure I understand what you are trying to say. Can you elaborate a little?
>
>> regards,
>> hans
>>
>> On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:
>>
>>> Hallow hackers
>>>
>>> I have the following problem to solve and would like to get advice on the best way to do it.
>>>
>>> The problem:
>>>
>>> When growing a pl/proxy based database cluster, one of the main operations is splitting a partition. The standard flow is as follows:
>>>
>>> 1) make a copy of the partition's table(s) to another database
>>> 2) reconfigure pl/proxy to use 2 partitions instead of one
>>>
>>> The easy part is making a copy of all or half of the table to another database. The hard part is fast deletion (i mean milliseconds, comparable to TRUNCATE) of the data that should not be in a partition (so that RUN ON ALL functions will continue to return right results).
>>>
>>> It would be relatively easy, if we still had the RULES for select available for plain tables, but even then the eventual cleanup would usually mean at least 3 passes of disk writes (set xmax, write deleted flag, vacuum and remove)
>>>
>>> What I would like to have is possibility for additional visibility checks, which would run some simple C function
Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks
hello … i have been thinking about this issue for quite a while ... given your idea i am not sure how this can work at all. consider: begin; insert 1 insert 2 commit assume this ends up in the same node, now you split it into two … 1 and 2 will have exactly the same visibility to and transaction. i am not sure how you can get this right without looking at the data. alternative idea: what if the proxy would add / generate a filter by looking at the data? a quick idea would be that once you split you add a simple directive such as "FILTER GENERATOR $1" or so to the PL/proxy code. it would then behind the scene arrange the filter passed on. what do you think? regards, hans On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote: > Hallow hackers > > I have the following problem to solve and would like to get advice on > the best way to do it. > > The problem: > > When growing a pl/proxy based database cluster, one of the main > operations is splitting a partition. The standard flow is as follows: > > 1) make a copy of the partitions table(s) to another database > 2) reconfigure pl/proxy to use 2 partitions instead of one > > The easy part is making a copy of all or half of the table to another > database. The hard part is fast deletion (i mean milliseconds, > comparable to TRUNCATE) the data that should not be in a partition (so > that RUN ON ALL functions will continue to return right results). > > It would be relatively easy, if we still had the RULES for select > available for plain tables, but even then the eventual cleanup would > usually mean at least 3 passes of disk writes (set xmax, write deleted > flag, vacuum and remove) > > What I would like to have is possibility for additional visibility > checks, which would run some simple C function over tuple data (usually > hash(fieldval) + and + or ) and return visibility (is in this partition) > as a result. It would be best if this is run at so low level that also > vacuum would use it and can clean up the foreign partition data in one > pass, without doing the delete dance first. > > So finally the QUESTION : > > where in code would be the best place to check for this so that > > 1) both regular queries and VACUUM see it > 2) the tuple data (and not only system fields or just xmin/xmax) would > be available for the function to use > > > -- > --- > Hannu Krosing > PostgreSQL Unlimited Scalability and Performance Consultant > 2ndQuadrant Nordic > PG Admin Book: http://www.2ndQuadrant.com/books/ > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Caching Python modules
On Aug 17, 2011, at 2:19 PM, Jan Urbański wrote:

> On 17/08/11 14:09, PostgreSQL - Hans-Jürgen Schönig wrote:
>> CREATE OR REPLACE FUNCTION textprocess.add_to_corpus(lang text, t text) RETURNS float4 AS $$
>>
>>    from SecondCorpus import SecondCorpus
>>    from SecondDocument import SecondDocument
>>
>> i am doing some intense text mining here.
>> the problem is: is it possible to cache those imported modules from function to function call.
>> GD works nicely for variables but can this actually be done with imported modules as well?
>> the import takes around 95% of the total time so it is definitely something which should go away somehow.
>> i have checked the docs but i am not more clever now.
>
> After a module is imported in a backend, it stays in the interpreter's sys.modules dictionary and importing it again will not cause the module Python code to be executed.
>
> As long as you are using the same backend you should be able to call add_to_corpus repeatedly and the import statements should take a long time only the first time you call them.
>
> This simple test demonstrates it:
>
> $ cat /tmp/slow.py
> import time
> time.sleep(5)
>
> $ PYTHONPATH=/tmp/ bin/postgres -p 5433 -D data/
> LOG: database system was shut down at 2011-08-17 14:16:18 CEST
> LOG: database system is ready to accept connections
>
> $ bin/psql -p 5433 postgres
> Timing is on.
> psql (9.2devel)
> Type "help" for help.
>
> postgres=# select slow();
>  slow
> ------
>
> (1 row)
>
> Time: 5032.835 ms
>
> postgres=# select slow();
>  slow
> ------
>
> (1 row)
>
> Time: 1.051 ms
>
> Cheers,
> Jan

hello jan …

the code is actually like this …

the first function is called once per backend. it compiles some fairly fat in-memory stuff … this takes around 2 secs or so … but this is fine and not an issue.

-- setup the environment
CREATE OR REPLACE FUNCTION textprocess.setup_sentiment(pypath text, lang text)
RETURNS void AS $$
import sys

sys.path.append(pypath)
sys.path.append(pypath + "/external")

from SecondCorpus import SecondCorpus
import const

GD['path_to_classes'] = pypath
GD['corpus'] = SecondCorpus(lang)
GD['lang'] = lang
return;
$$ LANGUAGE 'plpythonu' STABLE;

this is called more frequently ...

-- add a document to the corpus
CREATE OR REPLACE FUNCTION textprocess.add_to_corpus(lang text, t text)
RETURNS float4 AS $$
from SecondCorpus import SecondCorpus
from SecondDocument import SecondDocument

doc1 = SecondDocument(GD['corpus'].senti_provider, lang, t)
doc1.create_sentences()
GD['corpus'].add_document(doc1)
GD['corpus'].process()
return doc1.total_score
$$ LANGUAGE 'plpythonu' STABLE;

the point here actually is: if i use the classes in a normal python command line program this routine does not look like an issue. creating the document object and doing the magic in there is not a problem actually …

on the SQL side this is already fairly heavy for some reason ...

 funcid | schemaname  |    funcname     | calls | total_time | self_time | ?column?
--------+-------------+-----------------+-------+------------+-----------+----------
 235287 | textprocess | setup_sentiment |    54 |     100166 |    100166 |     1854
 235288 | textprocess | add_to_corpus   |   996 |     438909 |    438909 |      440

looks like some afternoon with some more low level tools :(.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Caching Python modules
hello …

i have just fallen over a nasty problem (maybe missing feature) with PL/Pythonu …

consider:

-- add a document to the corpus
CREATE OR REPLACE FUNCTION textprocess.add_to_corpus(lang text, t text)
RETURNS float4 AS $$

    from SecondCorpus import SecondCorpus
    from SecondDocument import SecondDocument

i am doing some intense text mining here.
the problem is: is it possible to cache those imported modules from function to function call.
GD works nicely for variables but can this actually be done with imported modules as well?
the import takes around 95% of the total time so it is definitely something which should go away somehow.
i have checked the docs but i am not more clever now.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronized snapshots
On Aug 15, 2011, at 9:40 AM, Simon Riggs wrote:

> On Mon, Aug 15, 2011 at 2:31 AM, Joachim Wieland wrote:
>
>> In short, this is how it works:
>>
>> SELECT pg_export_snapshot();
>>  pg_export_snapshot
>> --------------------
>>  03A1-1
>> (1 row)
>>
>>
>> (and then in a different session)
>>
>> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '03A1-1');
>
> I don't see the need to change the BEGIN command, which is SQL Standard. We don't normally do that.
>
> If we have pg_export_snapshot() why not pg_import_snapshot() as well?
>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

i would definitely argue for a syntax like the one proposed by Joachim.
it could stay the same if this is turned into some sort of flashback implementation some day.

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
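[Editorial note: for clarity, the complete flow under the proposed interface would look roughly like this; the exporting transaction has to stay open while the second session attaches. Syntax as proposed in this thread, not necessarily what was committed in the end; some_table is a placeholder.]

    -- session 1
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();    -- returns an identifier such as '03A1-1'

    -- session 2, while session 1 is still open
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ (SNAPSHOT = '03A1-1');
    SELECT count(*) FROM some_table;    -- sees exactly the same data as session 1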
Re: [HACKERS] index-only scans
On Aug 12, 2011, at 10:03 PM, Heikki Linnakangas wrote: > On 11.08.2011 23:06, Robert Haas wrote: >> Comments, testing, review appreciated... > > I would've expected this to use an index-only scan: > > postgres=# CREATE TABLE foo AS SELECT generate_series(1,10) AS id; > SELECT 10 > postgres=# CREATE INDEX i_foo ON foo (id) WHERE id = 10; > CREATE INDEX > postgres=# VACUUM ANALYZE foo; > VACUUM > postgres=# EXPLAIN SELECT id FROM foo WHERE id = 10; > QUERY PLAN > - > Index Scan using i_foo on foo (cost=0.00..8.27 rows=1 width=4) > Index Cond: (id = 10) > (2 rows) > > If it's not a predicate index, then it works: > > postgres=# DROP INDEX i_foo; > DROP INDEX > postgres=# EXPLAIN SELECT id FROM foo WHERE id = 10; > QUERY PLAN > --- > Index Only Scan using i_foo2 on foo (cost=0.00..8.28 rows=1 width=4) > Index Cond: (id = 10) > (2 rows) is there any plan to revise the cost for index only scans compared to what it is now? many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Will switchover still need a checkpoint in 9.1 SR Hot Standby
On Aug 7, 2011, at 11:01 AM, Simon Riggs wrote:

> On Sun, Aug 7, 2011 at 8:57 AM, Hannu Krosing wrote:
>> In 9.0 (as in earlier versions) a former standby host has to do a full checkpoint before becoming available as an independent database instance in either switchover or failover scenarios.
>>
>> For most combinations of bigger than minimal shared buffers and non-memory-speed disks this can take from several seconds to tens of minutes on busy systems.
>
> For switchover, you issue a checkpoint first, to reduce this time as much as possible.
>
>> Is the pre-activation checkpoint still required in 9.1 ?
>
> Yes, but I've found a way to remove them in 9.2 and will be patching that soon.

hi simon,

this is highly interesting - it is an important issue for big iron. can you share the idea you have in mind?

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SYNONYMS (again)
On Jun 23, 2011, at 12:52 AM, Alvaro Herrera wrote: > Excerpts from Joshua D. Drake's message of mié jun 22 15:37:17 -0400 2011: >> Per: >> >> http://archives.postgresql.org/pgsql-hackers/2010-11/msg02043.php >> >> It seems we did come up with a use case in the procpid discussion. The >> ability to change the names of columns/databases etc, to handle the >> fixing of bad decision decisions during development over time. >> >> Thoughts? > > Let's start with what was discussed and supported in that thread, that > is, databases. It seems less clear that columns are widely believed to > be a good idea to have synonyms for. Besides, synonyms for databases > should be reasonably simple to implement, which is not something I would > say for columns. sorry, i missed the links: http://archives.postgresql.org/pgsql-patches/2006-03/msg00085.php many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SYNONYMS (again)
On Jun 23, 2011, at 12:52 AM, Alvaro Herrera wrote:

> Excerpts from Joshua D. Drake's message of mié jun 22 15:37:17 -0400 2011:
>> Per:
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-11/msg02043.php
>>
>> It seems we did come up with a use case in the procpid discussion. The ability to change the names of columns/databases etc, to handle the fixing of bad decisions during development over time.
>>
>> Thoughts?
>
> Let's start with what was discussed and supported in that thread, that is, databases. It seems less clear that columns are widely believed to be a good idea to have synonyms for. Besides, synonyms for databases should be reasonably simple to implement, which is not something I would say for columns.

yes, implementing synonyms is not too hard.
some time ago (3 or 4 years ago most likely) we already posted a patch providing support for synonyms. it was rejected because synonyms were said to be a bad design pattern which encourages app developers to do nasty things.
so, if you want to work on it, maybe this patch is the place to start.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
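[Editorial note: as an aside, the usual workaround today for the rename use case mentioned above is a view that keeps the old name alive; object names here are purely illustrative.]

    ALTER TABLE customer RENAME TO client;
    CREATE VIEW customer AS SELECT * FROM client;   -- legacy applications keep working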
Re: [HACKERS] bad posix_fadvise support causes initdb to exit ungracefully
hello ... 2.4? we know that some versions of 2.4 cause problems due to broken posix_fadvise. if i remember correctly we built some configure magic into PostgreSQL to check for this bug. what does this check do? many thanks, hans On Jun 15, 2011, at 6:12 PM, Merlin Moncure wrote: > Due to unfortunate environmental conditions (don't ask) I've been > trying to get postgres 9.0 up and running on a fairly ancient linux -- > redhat EL 3 which as kernel 2.4.21. initdb borks on the create > database step with the error message "child process exited with error > code 139". A bit of tracing revealed the exit was happening at the > pg_flush_data which basically wraps posix_fadvise. Disabling fadvise > support in pg_config_manual.h fixed the problem. > > Things brings up a couple of questions: > *) Are linuxes this old out of support? > *) Should configure be testing for working posix_fadvise? > > merlin > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] workaround for expensive KNN?
price has a problem :(. "iphone" can be a 20 cents bag or a sticker or a 900 euro thing signed by whoever ... so, words and the sort-number / price are not related in anyway. price is in this case no way to narrow down the problem (e.g. evaluate first or so). many thanks, hans On Apr 8, 2011, at 5:25 PM, Oleg Bartunov wrote: > Hans, > > what if you create index (price,title) ? > > > On Fri, 8 Apr 2011, PostgreSQL - Hans-J?rgen Sch?nig wrote: > >> hello ... >> >> i got that one ... >> >> "idx_product_t_product_titleprice" gist (to_tsvector('german'::regconfig, >> title), int_price) >> >> so, i have a combined index on text + number. >> to me the plan seems fine ... it looks like a prober KNN traversal. >> the difference between my plan and your plan seems to be the fact that i >> have, say, 1 mio rows which have "handy" or so in it (1 mio out of 11 mio or >> so). you are moving out from one specific place. >> >> my maths is like that: >> 11 mio in total >> 1 mio matching "iphone" >> cheapest / most expensive 10 out of this mio needed. >> >> operator classes are all nice and in place: >> >> SELECT 10 <-> 4 as distance; >> distance >> -- >> 6 >> (1 row) >> >> what does "buffers true" in your case say? >> >> many thanks, >> >> hans >> >> >> On Apr 8, 2011, at 3:22 PM, Oleg Bartunov wrote: >> >>> Probably, you miss two-columnt index. From my early post: >>> http://www.sai.msu.su/~megera/wiki/knngist >>> >>> =# CREATE INDEX spots_idx ON spots USING knngist (coordinates, >>> to_tsvector('french',address)); >>> =# SELECT id, address, (coordinates <-> >>> '(2.29470491409302,48.858263472125)'::point) AS dist FROM spots WHERE >>> coordinates >< '(2.29470491409302,48.858263472125)'::point AND >>> to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10; >>> id| address | >>>dist >>> -+-+- >>> 366096 | 1st Floor Tour Eiffel | Champs de Mars, Paris 75007, France | >>> 2.32488941293945e-05 >>> 4356328 | r Champ de Mars 75007 PARIS | >>> 0.00421854756964406 >>> 5200167 | Champ De Mars 75007 Paris | >>> 0.00453564562587288 >>> 9301676 | Champ de Mars, 75007 Paris, | >>> 0.00453564562587288 >>> 2152213 | 16, ave Rapp, Champ de Mars, Tour Eiffel, Paris, France | >>> 0.00624152097590896 >>> 1923818 | Champ de Mars Paris, France | >>> 0.00838214733539654 >>> 5165953 | 39 Rue Champ De Mars Paris, France | >>> 0.00874410234569529 >>> 7395870 | 39 Rue Champ De Mars Paris, France | >>> 0.00874410234569529 >>> 4358671 | 32 Rue Champ De Mars Paris, France | >>> 0.00876089659276339 >>> 1923742 | 12 rue du Champ de Mars Paris, France | >>> 0.00876764731845995 >>> (10 rows) >>> >>> Time: 7.859 ms >>> >>> =# EXPLAIN (COSTS OFF) SELECT id, address FROM spots WHERE coordinates >< >>> '(2.29470491409302,48.858263472125)'::point >>> AND to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10; >>> >>> QUERY PLAN >>> -- >>> Limit >>> -> Index Scan using spots_idx on spots >>>Index Cond: ((coordinates >< >>> '(2.29470491409302,48.858263472125)'::point) AND >>> (to_tsvector('french'::regconfig, address) @@ '''mar'''::tsquery)) >>> (3 rows) >>> >>> >>> On Fri, 8 Apr 2011, PostgreSQL - Hans-J?rgen Sch?nig wrote: >>> hello all ... given oleg's posting before i also wanted to fire up some KNN related question. let us consider a simple example. i got some million lines and i want all rows matching a tsquery sorted by price. 
i did some tests: test=# explain (analyze true, buffers true, costs true) SELECT id FROM product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', 'iphone') ORDER BY int_price <-> 0 LIMIT 10; QUERY PLAN - -- Limit (cost=0.00..41.11 rows=10 width=16) (actual time=36391.717..45542.590 rows=10 loops=1) Buffers: shared hit=9 read=5004 -> Index Scan using idx_product_t_product_titleprice on t_product (cost=0.00..13251.91 rows=3224 width=16) (actual time= 36391.715..45542.573 rows=10 loops=1) Index Cond: (to_tsvector('german'::regconfig, title) @@ '''iphon'''::
Re: [HACKERS] workaround for expensive KNN?
hello ... i got that one ... "idx_product_t_product_titleprice" gist (to_tsvector('german'::regconfig, title), int_price) so, i have a combined index on text + number. to me the plan seems fine ... it looks like a prober KNN traversal. the difference between my plan and your plan seems to be the fact that i have, say, 1 mio rows which have "handy" or so in it (1 mio out of 11 mio or so). you are moving out from one specific place. my maths is like that: 11 mio in total 1 mio matching "iphone" cheapest / most expensive 10 out of this mio needed. operator classes are all nice and in place: SELECT 10 <-> 4 as distance; distance -- 6 (1 row) what does "buffers true" in your case say? many thanks, hans On Apr 8, 2011, at 3:22 PM, Oleg Bartunov wrote: > Probably, you miss two-columnt index. From my early post: > http://www.sai.msu.su/~megera/wiki/knngist > > =# CREATE INDEX spots_idx ON spots USING knngist (coordinates, > to_tsvector('french',address)); > =# SELECT id, address, (coordinates <-> > '(2.29470491409302,48.858263472125)'::point) AS dist FROM spots WHERE > coordinates >< '(2.29470491409302,48.858263472125)'::point AND > to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10; > id| address | > dist > -+-+- > 366096 | 1st Floor Tour Eiffel | Champs de Mars, Paris 75007, France | > 2.32488941293945e-05 > 4356328 | r Champ de Mars 75007 PARIS | > 0.00421854756964406 > 5200167 | Champ De Mars 75007 Paris | > 0.00453564562587288 > 9301676 | Champ de Mars, 75007 Paris, | > 0.00453564562587288 > 2152213 | 16, ave Rapp, Champ de Mars, Tour Eiffel, Paris, France | > 0.00624152097590896 > 1923818 | Champ de Mars Paris, France | > 0.00838214733539654 > 5165953 | 39 Rue Champ De Mars Paris, France | > 0.00874410234569529 > 7395870 | 39 Rue Champ De Mars Paris, France | > 0.00874410234569529 > 4358671 | 32 Rue Champ De Mars Paris, France | > 0.00876089659276339 > 1923742 | 12 rue du Champ de Mars Paris, France | > 0.00876764731845995 > (10 rows) > > Time: 7.859 ms > > =# EXPLAIN (COSTS OFF) SELECT id, address FROM spots WHERE coordinates >< > '(2.29470491409302,48.858263472125)'::point > AND to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10; > >QUERY PLAN > -- > Limit > -> Index Scan using spots_idx on spots > Index Cond: ((coordinates >< > '(2.29470491409302,48.858263472125)'::point) AND > (to_tsvector('french'::regconfig, address) @@ '''mar'''::tsquery)) > (3 rows) > > > On Fri, 8 Apr 2011, PostgreSQL - Hans-J?rgen Sch?nig wrote: > >> hello all ... >> >> given oleg's posting before i also wanted to fire up some KNN related >> question. >> let us consider a simple example. i got some million lines and i want all >> rows matching a tsquery sorted by price. 
>> i did some tests: >> >> test=# explain (analyze true, buffers true, costs true) SELECT id FROM >> product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', >> 'iphone') ORDER BY int_price <-> 0 LIMIT 10; >> >> QUERY PLAN >> >> - >> -- >> Limit (cost=0.00..41.11 rows=10 width=16) (actual time=36391.717..45542.590 >> rows=10 loops=1) >> Buffers: shared hit=9 read=5004 >> -> Index Scan using idx_product_t_product_titleprice on t_product >> (cost=0.00..13251.91 rows=3224 width=16) (actual time= >> 36391.715..45542.573 rows=10 loops=1) >>Index Cond: (to_tsvector('german'::regconfig, title) @@ >> '''iphon'''::tsquery) >>Order By: (int_price <-> 0::bigint) >>Buffers: shared hit=9 read=5004 >> Total runtime: 45542.676 ms >> (7 rows) >> >> test=# explain (analyze true, buffers true, costs true) SELECT id FROM >> product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', >> 'handy') ORDER BY int_price <-> 0 LIMIT 10; >> >> QUERY PLAN >> >> - >> - >> Limit (cost=0.00..41.03 rows=10 width=16) (actual time=7243.526..10935.227 >> rows=10 loops=1)
[HACKERS] workaround for expensive KNN?
hello all ... given oleg's posting before i also wanted to fire up some KNN related question. let us consider a simple example. i got some million lines and i want all rows matching a tsquery sorted by price. i did some tests: test=# explain (analyze true, buffers true, costs true) SELECT id FROM product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', 'iphone') ORDER BY int_price <-> 0 LIMIT 10; QUERY PLAN - -- Limit (cost=0.00..41.11 rows=10 width=16) (actual time=36391.717..45542.590 rows=10 loops=1) Buffers: shared hit=9 read=5004 -> Index Scan using idx_product_t_product_titleprice on t_product (cost=0.00..13251.91 rows=3224 width=16) (actual time= 36391.715..45542.573 rows=10 loops=1) Index Cond: (to_tsvector('german'::regconfig, title) @@ '''iphon'''::tsquery) Order By: (int_price <-> 0::bigint) Buffers: shared hit=9 read=5004 Total runtime: 45542.676 ms (7 rows) test=# explain (analyze true, buffers true, costs true) SELECT id FROM product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', 'handy') ORDER BY int_price <-> 0 LIMIT 10; QUERY PLAN - - Limit (cost=0.00..41.03 rows=10 width=16) (actual time=7243.526..10935.227 rows=10 loops=1) Buffers: shared hit=3 read=2316 -> Index Scan using idx_product_t_product_titleprice on t_product (cost=0.00..29762.61 rows=7255 width=16) (actual time= 7243.524..10935.217 rows=10 loops=1) Index Cond: (to_tsvector('german'::regconfig, title) @@ '''handy'''::tsquery) Order By: (int_price <-> 0::bigint) Buffers: shared hit=3 read=2316 Total runtime: 10935.265 ms (7 rows) test=# explain (analyze true, buffers true, costs true) SELECT id FROM product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german', 'handy') ORDER BY int_price <-> 0 LIMIT 1; QUERY PLAN - --- Limit (cost=0.00..4.10 rows=1 width=16) (actual time=28.527..28.528 rows=1 loops=1) Buffers: shared hit=1 read=1577 -> Index Scan using idx_product_t_product_titleprice on t_product (cost=0.00..29762.61 rows=7255 width=16) (actual time= 28.525..28.525 rows=1 loops=1) Index Cond: (to_tsvector('german'::regconfig, title) @@ '''handy'''::tsquery) Order By: (int_price <-> 0::bigint) Buffers: shared hit=1 read=1577 Total runtime: 28.558 ms (7 rows) under any circumstances - there is no way to reduce the number of buffers needed for a query like that. if everything is cached this is still ok but as soon as you have to take a single block from disk you will die a painful random I/O death. is there any alternative which does not simply die when i try to achieve what i want? the use case is quite simple: all products with a certain word (10 cheapest or so). is there any alternative approach to this? i was putting some hope into KNN but it seems it needs too much random I/O :(. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
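To make the setup concrete, here is a minimal sketch of the kind of combined index being discussed. The table layout follows the post; the btree_gist extension is an assumption on my side (the thread used its own operator class for the integer column), so treat this purely as an illustration of the idea, not as the exact original setup:

CREATE EXTENSION btree_gist;   -- assumed: supplies a <-> distance operator for plain integer columns

CREATE TABLE t_product (
    id        bigserial PRIMARY KEY,
    title     text,
    int_price bigint
);

-- one combined GiST index covering both the full text condition and the price distance
CREATE INDEX idx_product_t_product_titleprice
    ON t_product
    USING gist (to_tsvector('german', title), int_price);

-- "the 10 cheapest products matching a word": the index is walked in KNN order,
-- so ideally only a handful of heap blocks have to be touched
SELECT id
FROM   t_product
WHERE  to_tsvector('german', title) @@ to_tsquery('german', 'iphone')
ORDER  BY int_price <-> 0::bigint
LIMIT  10;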
Re: [HACKERS] k-neighbourhood search in databases
hello ... i have put some research into that some time ago and as far as i have seen there is a 99% chance that no other database can do it the way we do it. it seems nobody comes even close to it (especially not in the flexibility-arena). oracle: disgusting workaround ... http://www.orafaq.com/usenet/comp.databases.oracle.misc/2005/11/03/0083.htm db2: disgusting workaround (no server side code it seems) sybase: disgusting workaround (no serverside code it seems) microsoft: there seems to be something coming out (or just out) but i have not seen anything working yet. regards, hans On Apr 8, 2011, at 2:21 PM, Oleg Bartunov wrote: > Hi there, > > I'm interesting if other databases provides built-in effective knn search ? > Google didn't help me. > > Regards, > Oleg > _ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: cross column correlation ...
>>> >> >> Still, having more data a user can probe would be nice. >> >> I wonder why everyone avoids Microsoft's approach to the subject. >> Apparently, they go in the 'auto-tune as much as possible' direction. >> And tests we did a while ago, involving asking team from Microsoft and a >> team from oracle to optimise set of queries for the same set of data >> (bookies data, loads of it) showed that the auto-tuning Microsoft has in >> their >> sql server performed much better than a team of over-sweating oracle dba's. > > I don't think *anyone* is avoiding that approach. There is almost > universal consensus here that auto-tuning is better than manual > tuning, even to the extent of being unwilling to add knobs to allow > manual tuning of settings we have no idea how to auto-tune and no > plans to auto-tune. > >> In my current work place/camp we have many deployments of the same system, >> over different types of machines, each with different customer data that >> vary so much that queries need to be rather generic. >> Postgresql shows its strength with planner doing a good job for different >> variants of data, however we do a very little tweaking to the configuration >> parameters. Just because it is just too hard to overlook all of them. >> I guess that the systems could behave much better, but no one is going to >> tweak settings for 50 different installations over 50 different type of data >> and 50 different sets of hardware. >> If there was even a tiny amount of automation provided in the postgresql, I >> would welcome it with open arms. > > What do you have in mind? > what we are trying to do is to explicitly store column correlations. so, a histogram for (a, b) correlation and so on. the planner code then goes through its restrictions in the query and finds the best / longest combination it can find and which has some statistics defined. it seems we can also do this for join selectivity and expressions. the planner code for "raw column correlation" without expression ( cos(id) or so) and joins is there (WIP, no ANALYZE support and so on so far). i think auto tuning is a good thing to have and the door to actually do it is wide open with our approach. all it takes is a mechanism to see which "conditions" are used how often and somebody could write a job which automatically tells the system which stats to collect / sample. i think for an "average" user this is the most simplistic thing then. but, to get there we have to get the bloody sampling and the rest of the planner code right in the first place. auto tuning in this area is still something which is far in the future - but at least the road to it is clear. some people suggested some approach dealing with effective_cache_size and so on ... there are many good approaches here but they don't address the actual problem of wrong size-estimates. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: cross column correlation ...
On Feb 24, 2011, at 2:09 AM, Josh Berkus wrote: > >> Personally, I think the first thing we ought to do is add a real, bona >> fide planner hint to override the selectivity calculation manually, >> maybe something like this: >> >> WHERE (x < 5 AND y = 1) SELECTIVITY (0.1); >> >> Then, having provided a method for the DBA to extinguish the raging >> flames of searing agony which are consuming them while a crocodile >> chews off their leg and their boss asks them why they didn't use >> Oracle, we can continue bikeshedding about the best way of fixing this >> problem in a more user-transparent fashion. > > Is there some way we can do that without adding the selectivity hint to > the query itself? That's the biggest issue with hints. > well, you could hide this hint in the system table - say; instead of decorating the query you could store the decoration in some system relation ... but, if you get it right, you call this decoration histogram ;). i think the patch with a multi-dim histogram is good (i have seen something similar for PostGIS). what is still needed in our patch is a.) multi-dim sampling (no idea how to get it right) and b.) investigating how to deal with joins and expressions (e.g. cos(id) ). hints into the right direction are highly welcome. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: cross column correlation ...
On Feb 23, 2011, at 3:46 PM, Robert Haas wrote: > On Wed, Feb 23, 2011 at 12:50 AM, Nathan Boley wrote: >>> Personally, I think the first thing we ought to do is add a real, bona >>> fide planner hint to override the selectivity calculation manually, >>> maybe something like this: >>> >>> WHERE (x < 5 AND y = 1) SELECTIVITY (0.1); >> >> If you're going to go that far, why not just collect statistics on >> that specific predicate? >> >> ie, ANALYZE SELECTIVITY ON tablename (x, y) WHERE (x < 5 AND y = 1); >> >> Then it won't fall subject to all of the pitfalls that Tom outlines below. >> >> Selectivities are easy to estimate if we know the predicate. They only >> become hard when they have to work for every possible predicate. > > Fair point. > > -- > Robert Haas basically we got the idea of allowing "expressions" in cross column stuff. i think this can be very useful. it would fix the problem of a query like that: SELECT * FROM table WHERE cos(field) = some_number; this takes a constant fraction of the table which is usually plain wrong as well (and the error tends to multiply inside the plan). i am just not sure if i have understood all corner cases of that already. ultimate goal: get it right for join estimates (this is why a syntax extension is definitely needed - you cannot track all of them automatically). many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
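For the cos() example there is already a partial workaround in core worth spelling out: an expression index makes ANALYZE gather statistics on the expression itself, so at least the single-table estimate stops being a flat default fraction. A small sketch (table and column names are mine):

CREATE TABLE measurements (field double precision);
INSERT INTO measurements SELECT random() * 10 FROM generate_series(1, 100000);
ANALYZE measurements;

-- without help, the function clause gets a default selectivity, not a data-driven one
EXPLAIN SELECT * FROM measurements WHERE cos(field) = 1.0;

-- an expression index makes ANALYZE collect statistics on cos(field) ...
CREATE INDEX measurements_cos_idx ON measurements (cos(field));
ANALYZE measurements;

-- ... so the same clause is now estimated from real data
EXPLAIN SELECT * FROM measurements WHERE cos(field) = 1.0;

This only covers the single-table expression case; it does nothing for the join estimates the mail is really after.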
Re: [HACKERS] WIP: cross column correlation ...
>>> >> >> cheapest and easiest solution if you run into this: add "fake" functions >> which the planner cannot estimate properly. >> use OR to artificially prop up estimates or use AND to artificially lower >> them. there is actually no need to redesign the schema to get around it but >> it is such an ugly solution that it does not even deserve to be called >> "ugly" ... >> however, fast and reliable way to get around it. > > We couldn't possibly design a hint mechanism that would be uglier or > less future-proof than this workaround (which, by the way, I'll keep > in mind for the next time I get bitten by this). > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > i think the main issue is: what we do is ugly because of despair and a lack of alternative ... what you proposed is ugly by design ;). overall: the workaround will win the ugliness contest, however ;). many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: cross column correlation ...
> Those are real problems, but I still want it. The last time I hit > this problem I spent two days redesigning my schema and adding > triggers all over the place to make things work. If I had been > dealing with a 30TB database instead of a 300MB database I would have > been royally up a creek. > > To put that another way, it's true that some people can't adjust their > queries, but also some people can. It's true that nonstandard stuff > sucks, but queries that don't work suck, too. And as for better > solutions, how many major release cycles do we expect people to wait > for them? Even one major release cycle is an eternity when you're > trying to get the application working before your company runs out of > money, and this particular problem has had a lot of cycles expended on > it without producing anything very tangible (proposed patch, which > like you I can't spare a lot of cycles to look at just now, possibly > excepted). cheapest and easiest solution if you run into this: add "fake" functions which the planner cannot estimate properly. use OR to artificially prop up estimates or use AND to artificially lower them. there is actually no need to redesign the schema to get around it but it is such an ugly solution that it does not even deserve to be called "ugly" ... however, fast and reliable way to get around it. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
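For readers who have never seen the trick spelled out, a sketch of what such a "fake" clause can look like (function, table and column names are all mine). A plpgsql function is opaque to the planner, so the extra clause gets a default selectivity guess instead of one derived from statistics, while the function always evaluates to a fixed value, so the query result itself is unchanged:

CREATE TABLE t (a int, b int);

-- never inlined, never estimated from statistics
CREATE FUNCTION planner_opaque(val boolean) RETURNS boolean AS
$$ BEGIN RETURN val; END $$ LANGUAGE plpgsql;

-- AND-ing an always-true opaque clause pushes the row estimate down
SELECT * FROM t WHERE a = 1 AND b = 2 AND planner_opaque(true);

-- OR-ing an always-false opaque clause props the row estimate up
SELECT * FROM t WHERE (a = 1 AND b = 2) OR planner_opaque(false);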
Re: [HACKERS] WIP: cross column correlation ...
On Feb 23, 2011, at 2:58 AM, Robert Haas wrote: > 2011/2/22 PostgreSQL - Hans-Jürgen Schönig : >> how does it work? we try to find suitable statistics for an arbitrary length >> list of conditions so that the planner can use it directly rather than >> multiplying all the selectivities. this should make estimates a lot more >> precise. >> the current approach can be extended to work with expressions and well as >> "straight" conditions. > > /me prepares to go down in flames. > > Personally, I think the first thing we ought to do is add a real, bona > fide planner hint to override the selectivity calculation manually, > maybe something like this: > > WHERE (x < 5 AND y = 1) SELECTIVITY (0.1); i thought there was an agreement that we don't want planner hints? as tom pointed out - many broken queries come out of some query generator where even the design to make the design is broken by design. personally i like query generators as long as other people use them ... telling people that this is the wrong way to go is actually financing my holiday next week ... ;). in general - hibernate and stuff like that is a no-go. personally i like the type of planner hints oleg and teodor came up with - i think we should do more of those hooks they are using but hiding it in some syntax is not a good idea. it does not change the query and it still gives a lot of room to toy around. it looks like a compromise. however, oleg's contrib module does not fix the core problem of cross column statistics because a hint is usually static but you want flexible selectivity. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP: cross column correlation ...
hello everybody, we have spent some time finally attacking cross column correlation, as this is an issue which has kept bugging us in a couple of applications (for some years now). this is a WIP patch which can do: special cross column correlation specific syntax: CREATE CROSS COLUMN STATISTICS ON tablename (field, ...); DROP CROSS COLUMN STATISTICS ON tablename (field, ...); we use specific syntax because we simply cannot keep track of all possible correlations in the DB, so the admin can take care of things explicitly. some distant day somebody might want to write a mechanism to derive the desired stats automatically but this is beyond the scope of our project for now. as far as the patch is concerned: it is patched nicely into clauselist_selectivity(), but has some rough edges; even when a cross-col stat is found, the single col selectivities are still counted ( = lowering the selectivity even more), this is a TODO. this patch adds the grammar and the start of planner integration with a static selectivity value for now, so the previous discussion about cross-column statistics can be continued and perhaps comes to fruition soon. how does it work? we try to find suitable statistics for an arbitrary length list of conditions so that the planner can use them directly rather than multiplying all the selectivities. this should make estimates a lot more precise. the current approach can be extended to work with expressions as well as "straight" conditions. goal: to make cross column correlation work for 9.2 ... the purpose of this mail is mostly to get the race for a patch going and to see if the approach as such is reasonable / feasible. many thanks, hans cross-column-v5.patch Description: Binary data -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
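For clarity, this is how the proposed grammar would be used. The syntax exists only in the attached patch, not in any released PostgreSQL, and the table is an invented example of mine:

CREATE TABLE orders (
    customer_id int,
    region_id   int,
    status      text
);

-- ask the system to keep combined statistics for this column pair, so that
-- clauselist_selectivity() can consult them instead of multiplying the
-- per-column selectivities
CREATE CROSS COLUMN STATISTICS ON orders (customer_id, region_id);

-- and remove them again when no longer wanted
DROP CROSS COLUMN STATISTICS ON orders (customer_id, region_id);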
[HACKERS] SELECT ... WHERE fti_query ORDER BY numeric_col LIMIT x - problem
I have toyed around with KNN a little and I am pretty impressed when it comes to the results we have seen in the GIS world. Given the infrastructure we have at the moment I wonder if KNN can help to speed up queries like that: SELECT ... WHERE fti_query ORDER BY numeric_col LIMIT x The use case is fairly simple: Give me all products matching a certain tsquery and order those products by price or so to show the top 10. KNN is supposed to give sorted output which works perfectly for points and so on, but can there theoretically be a sensible / reliable distance function for a (tsvector / numeric) combination? Some first testing gave me some "interesting" output. If there is no way of defining a reasonable distance function, performance is screwed up if fti_query returns a large list (it requires a complete sort of all prices then). Does anybody have some useful input here? many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
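For comparison, the plan shape being complained about looks roughly like this with a plain full-text index (assuming a t_product table like the one in the KNN thread earlier in this digest): the index finds the matching rows quickly, but every match still has to be fetched and sorted before the LIMIT can be applied.

CREATE INDEX t_product_fts_idx
    ON t_product
    USING gin (to_tsvector('german', title));

EXPLAIN
SELECT id
FROM   t_product
WHERE  to_tsvector('german', title) @@ to_tsquery('german', 'iphone')
ORDER  BY int_price
LIMIT  10;
-- typical shape: Limit -> Sort -> Bitmap Heap Scan -> Bitmap Index Scan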
Re: [HACKERS] plan time of MASSIVE partitioning ...
On Sep 3, 2010, at 4:40 PM, Tom Lane wrote: > PostgreSQL - Hans-Jürgen Schönig writes: >> imagine a system with, say, 1000 partitions (heavily indexed) or so. the >> time taken by the planner is already fairly heavy in this case. > > As the fine manual points out, the current scheme for managing > partitioned tables isn't intended to scale past a few dozen partitions. > > I think we'll be able to do better when we have an explicit > representation of partitioning, since then the planner won't > have to expend large amounts of effort reverse-engineering knowledge > about how an inheritance tree is partitioned. Before that happens, > it's not really worth the trouble to worry about such cases. > > regards, tom lane > thank you ... - the manual is clear here but we wanted to see if there is some reasonably low hanging fruit to get around this. it is no solution but at least a clear statement ... many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] plan time of MASSIVE partitioning ...
On Sep 3, 2010, at 2:04 PM, Stephen Frost wrote: > * PostgreSQL - Hans-Jürgen Schönig (postg...@cybertec.at) wrote: >> did anybody think of a solution to this problem. >> or more precisely: can there be a solution to this problem? > > Please post to the correct list (-performance) and provide information > like PG version, postgresql.conf, the actual table definition, the > resulting query plan, etc, etc... > > Thanks, > > Stephen hello stephen, this seems more like a developer question to me than a performance one. it is not related to the table structure at all - it is basically an issue with incredibly large inheritance lists. it applies to postgres 9 and most likely to everything before. postgresql.conf is not relevant at all at this point. the plan is pretty fine. the question is rather: does anybody see a chance to handle such lists more efficiently inside postgres? also, it is not the point if my data structure is sane or not. it is really more generic - namely a shortcut for this case inside the planning process. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] plan time of MASSIVE partitioning ...
hello everybody, we came across an issue which turned out to be more serious than previously expected. imagine a system with, say, 1000 partitions (heavily indexed) or so. the time taken by the planner is already fairly heavy in this case. i tried this one with 5000 unindexed tables (just one col): test=# \timing Timing is on. test=# prepare x(int4) AS select * from t_data order by id desc; PREPARE Time: 361.552 ms you will see similar or higher runtimes in case of 500 partitions and a handful of indexes. does anybody see a potential way to do a shortcut through the planner? a prepared query is no solution here as constraint exclusion would be dead in this case (making the entire thing an even bigger drama). did anybody think of a solution to this problem. or more precisely: can there be a solution to this problem? many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
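A rough reconstruction of the test setup, for anyone who wants to reproduce the numbers (the DO loop is mine; the original used 5000 single-column children as described):

CREATE TABLE t_data (id int4);

DO $$
DECLARE
    i int;
BEGIN
    FOR i IN 1..5000 LOOP
        EXECUTE 'CREATE TABLE t_data_' || i || ' () INHERITS (t_data)';
    END LOOP;
END
$$;

-- then, in psql:
-- \timing
-- PREPARE x(int4) AS SELECT * FROM t_data ORDER BY id DESC;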
Re: [HACKERS] Path question
hello tom, yeah, we have followed quite a lot of discussion as well ... and yes, no patches. as far as this problem is concerned: we are working on a patch and did some prototyping inside the planner already (attached). the code we have is pretty limited atm (such as checking for a sort clause explicitly and so on - it has no support for windowing related optimizations and so on so far). the cost model is not our problem - it is a lot easier to read than the code we are fighting with (it is a different level of complexity). i think costs can be handled. yes, this merging adds some costs for sure. we might see a hell of a lot of operator calls - but i think given a reasonable number of partitions it is still a lot cheaper than actually resorting ... and, it is a lot more space efficient. in my practical case i cannot resort because we would simply run out of space. an index scan is expensive but needs no additional sort space ... and, merge is O(n) which sort is clearly not. advice is highly appreciated. many thanks, hans push-down-sort-into-inh-2.patch Description: Binary data On Sep 1, 2010, at 5:00 PM, Tom Lane wrote: > PostgreSQL - Hans-Jürgen Schönig writes: >> On Sep 1, 2010, at 4:10 PM, Tom Lane wrote: >>> This is really premature, and anything you do along those lines now will >>> probably never get committed. > >> well, why non-overlapping? the idea is to make append smart enough to >> take the sorted lists from below and merge them which will give sorted >> output as well. > > Well, an extra merge step is going to change the cost comparisons quite > a bit; see Greg Stark's comments. But in any case, my point wasn't that > this is something we should never do; it was that it makes more sense to > wait till something has happened with explicit partitioning. > >>> The project direction is that we are going to add some explicit >>> representation of partitioned tables. > >> can you outline some ideas here and maybe point to some useful discussion >> here? > > There's been boatloads of discussion of partitioning, and at least one > submitted patch, over the past year or so ... > > regards, tom lane > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Path question
On Sep 1, 2010, at 4:10 PM, Tom Lane wrote: > Boszormenyi Zoltan writes: >> we are experimenting with modifying table partitioning >> so the ORDER BY clause can be pushed down to >> child nodes on the grounds that: > > This is really premature, and anything you do along those lines now will > probably never get committed. The problem is that the transformation > you propose is wrong unless the planner can prove that the different > child tables contain nonoverlapping ranges of the sort key. Now you > might be intending to add logic to try to prove that from inspection of > constraints, but I don't believe that reverse-engineering such knowledge > on the fly is a sane approach: it will be hugely expensive and will add > that cost even in many situations where the optimization fails to apply. > well, why non-overlapping? the idea is to make append smart enough to take the sorted lists from below and merge them which will give sorted output as well. my original idea was what you described but given Martijn van Oosterhout's posting we were pretty confident that we can get along without non-overlapping partitions. > The project direction is that we are going to add some explicit > representation of partitioned tables. After that, the planner can just > know immediately that a range-partitioned sort key is amenable to this > treatment, and at that point it'll make sense to work on it. > can you outline some ideas here and maybe point to some useful discussion here? many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to construct an exact plan
hello ... here is the ultimate revelation of planner hints in postgres ... let us praise oleg and teodor for solving a practical problem for practical people ... http://www.sai.msu.su/~megera/wiki/plantuner try this one ... it is excellent and definitely helpful for many many people out there. unfortunately this code is not too well known. many thanks, hans On Aug 30, 2010, at 9:41 PM, Pavel Stehule wrote: > 2010/8/30 Pei He : >> Hi, >> I am hacking postgresql 8.2.5. a) and b) do not work for me. >> >> The situation is that I made a join operator, and a scan operator. >> And, The join operator requires the scan operator as the inner. So, I >> need to have the full control of the join plan. >> >> I am not ready to provide the optimization support for the two new >> operators. And, I want to run some performance tests before working on >> the optimization part. >> >> So, I want to know if it is possible to directly create a path or a >> plan, and do a unit test for the operators. >> > > yes, it is possible - but it isn't simple. I thing, so better is > simple implementation of all parts and then runtime blocking some (for > you not interesting) buildin methods via SET enable_ to off. > > Regards > > Pavel Stehule > >> >> Thanks >> -- >> Pei >> >> On Mon, Aug 30, 2010 at 1:59 PM, Josh Berkus wrote: >>> I have developed a new operators, and I want to do some tests on it. I do not want the optimizer to choose the plan for me, and I need to construct a plan as exact as I want. Can anyone provide me a way to achieve that? >>> >>> a) easy: choose a simple enough query that its plan is always predictable. >>> >>> b) moderate: choose a query whose plan is predictable if you manipulate >>> the enable_* configuration settings >>> >>> c) hard: hack the PostgreSQL planner to choose a specific execution >>> plan, and recompile Postgres. >>> >>> -- >>> -- Josh Berkus >>> PostgreSQL Experts Inc. >>> http://www.pgexperts.com >>> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >> > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
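For completeness, the built-in variant mentioned in the quoted exchange (option b) looks roughly like this; plantuner goes a step further and can hide individual indexes from the planner. The table names below are invented, and the plantuner GUC name is quoted from memory of the linked wiki page, so double-check it there:

BEGIN;
SET LOCAL enable_seqscan   = off;
SET LOCAL enable_hashjoin  = off;
SET LOCAL enable_mergejoin = off;   -- what is left is nested loop + index scans
EXPLAIN SELECT * FROM t_a JOIN t_b USING (id) WHERE t_a.id < 100;
ROLLBACK;

-- with the module loaded, plantuner lets you blacklist a specific index, e.g.:
-- LOAD 'plantuner';
-- SET plantuner.forbid_index = 'some_index_name';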
Re: [HACKERS] "micro bucket sort" ...
as tom pointed out - this is not possible. there is no limit 20 in my case - i just used it to indicate that limiting does not make the index scan possible which it does in some other cases. the partial sort thing simon pointed out is what is needed at this point. many thanks, hans On Aug 11, 2010, at 5:29 PM, Alvaro Herrera wrote: > Excerpts from Hans-Jürgen Schönig's message of mié ago 11 08:21:10 -0400 2010: > >> same with limit ... >> >> >> test=# explain analyze select * from t_test order by x, y limit 20; > > But if you put the limit in a subquery which is ordered by the > known-indexed condition, it is very fast: > > alvherre=# explain analyze select * from (select * from t_test order by x > limit 20) f order by x, y; > QUERY PLAN > > ─ > Sort (cost=1.24..1.29 rows=20 width=8) (actual time=0.252..0.296 rows=20 > loops=1) > Sort Key: t_test.x, t_test.y > Sort Method: quicksort Memory: 26kB > -> Limit (cost=0.00..0.61 rows=20 width=8) (actual time=0.051..0.181 > rows=20 loops=1) > -> Index Scan using idx_a on t_test (cost=0.00..30408.36 > rows=100 width=8) (actual time=0.046..0.098 rows=20 loops=1) > Total runtime: 0.425 ms > (6 filas) > > > I guess it boils down to being able to sort a smaller result set. > > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] non-overlapping, consecutive partitions
On Jul 25, 2010, at 11:56 AM, Martijn van Oosterhout wrote: > On Fri, Jul 23, 2010 at 10:04:00PM +0200, Hans-Jürgen Schönig wrote: >> create table foo ( x date ); >> create table foo_2010 () INHERITS (foo) >> create table foo_2009 () INHERITS (foo) >> create table foo_2008 () INHERITS (foo) >> >> now we add constraints to make sure that data is only in 2008, 2009 and 2010. >> we assume that everything is indexed: >> >> SELECT * FROM foo ORDER BY bar will now demand an ugly sort for this data. >> this is not an option if you need more than a handful of rows ... > > I think the right way to approach this is to teach the planner about > merge sorts. This is, if the planner has path to foo_* all ordered by > the same key (because they have the same indexes) then it has a path to > the UNION of those tables simply by merging the results of those paths. > > This would be fairly straight forward to implement I think, you may > even be able to reuse the merge sort in the normal sort machinery. > (You'll need to watch out for UNION vs UNION ALL.) > > The real advantage of this approach is that you no longer have to prove > anything about the constraints or various datatypes and it is more > general. Say you have partitioned by start_date but you want to sort by > end_date, simple index scanning won't work while a merge sort will work > beautifully. > > You're also not limited to how the partitioning machinery will > eventually work. > > Hope this helps, i think this is excellent input. i will do some research going into that direction. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
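The schema sketched above, written out; the CHECK constraints and indexes are my additions. On releases that grew the Merge Append node (9.1 and later), and given non-trivial amounts of data in the children, the final query can be answered by merging per-child ordered index scans instead of sorting the whole inheritance tree, which is exactly the merge idea discussed here:

CREATE TABLE foo (x date);
CREATE TABLE foo_2008 (CHECK (x >= '2008-01-01' AND x < '2009-01-01')) INHERITS (foo);
CREATE TABLE foo_2009 (CHECK (x >= '2009-01-01' AND x < '2010-01-01')) INHERITS (foo);
CREATE TABLE foo_2010 (CHECK (x >= '2010-01-01' AND x < '2011-01-01')) INHERITS (foo);

CREATE INDEX foo_2008_x_idx ON foo_2008 (x);
CREATE INDEX foo_2009_x_idx ON foo_2009 (x);
CREATE INDEX foo_2010_x_idx ON foo_2010 (x);

EXPLAIN SELECT * FROM foo ORDER BY x;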
Re: [HACKERS] cross column correlation revisited
hello ... look at the syntax i posted in more detail: >> ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2) > it says X and Y ... the selectivity of joins is what i am most interested in. cross correlation of columns within the same table is just a byproduct. the core thing is: how can i estimate the number of rows returned from a join? an example would be: you have email accounts + messages. you know that each row will match in a join as you can assume that every account will have a message. what we need is a syntax which covers the join case and the case where columns inside the same table correlate. and the fact that an index cannot cover two tables leads me to the conclusion that stats on an index are not the solution to the join problem. many thanks, hans On Jul 14, 2010, at 4:22 PM, Tom Lane wrote: > PostgreSQL - Hans-Jürgen Schönig writes: >> i think that having stats on an index is a problem by itself for 2 reasons - >> for cross column correlation at least: > >> a.) joins cannot be covered by an index on two tables - we would fix >> "inside a table correlation problems" but not joins. > > Your proposed command didn't cover the two-table case either, and anyway > what the heck do you mean by cross-correlation across tables? > Cross-correlation is about the correlation between values in the same > row. > >> b.) who says that there is actually an index in place? > > If the combination of columns is actually interesting, there might well > be an index in place, or the DBA might be willing to create it. For > that matter, have you considered the idea of examining the index > contents to derive the statistics? Might work better than trying to get > numbers via ANALYZE. > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cross column correlation revisited
hello tom, i think that having stats on an index is a problem by itself for 2 reasons - for cross column correlation at least: a.) joins cannot be covered by an index on two tables - we would fix "inside a table correlation problems" but not joins. b.) who says that there is actually an index in place? assume you are doing some big seq scan to do analytics. you don't want it to be indexed for 10 different types of queries. i think it is pretty hard to determine automatically what to collect because we cannot know which permutations of cross-column magic people will use. i was thinking along the lines of having it automatic as well but i could not figure out how to do it. i think we can suggest additional stats to the user and we can write tools to figure out somehow what would be useful but personally i cannot see anything which is better than a command here. many thanks, hans On Jul 14, 2010, at 4:01 PM, Tom Lane wrote: > Heikki Linnakangas writes: >> On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote: >>> maybe somehow like this ... >>> ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4) > >> +1 is my general feeling, it's good if you can tell the system to >> collect additional statistics where needed. > > The previous discussions about this went in the direction of > "automatically collect stats if there is an index on that combination of > columns". Do we really need a command? > >> However, the problem is how to represent and store the >> cross-correlation. > > Yes, whatever the triggering mechanism is for collecting cross-column > stats, actually doing something useful is the hard part. > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cross column correlation revisited
On Jul 14, 2010, at 12:40 PM, Heikki Linnakangas wrote: > On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote: >> hello everybody, >> >> we are currently facing some serious issues with cross correlation issue. >> consider: 10% of all people have breast cancer. we have 2 genders (50:50). >> if i select all the men with breast cancer, i will get basically nobody - >> the planner will overestimate the output. >> this is the commonly known problem ... >> >> this cross correlation problem can be quite nasty in many many cases. >> underestimated nested loops can turn joins into a never ending nightmare and >> so on and so on. >> >> my ideas is the following: >> what if we allow users to specifiy cross-column combinations where we keep >> separate stats? >> maybe somehow like this ... >> >> ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4) >> >> or ... >> >> ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = >> y.id2) >> >> clearly we cannot store correlation for all combinations of all columns so >> we somehow have to limit it. >> >> what is the general feeling about something like that? > > +1 is my general feeling, it's good if you can tell the system to collect > additional statistics where needed. And once you have that, you can write an > agent or something to detect automatically which extra statistics might be > useful. > it seems i can leave my bunker where i was hiding for cover when i was waiting for a reply ;). yes, my idea was to have an agent as well - but this is just some follow up problem. > However, the problem is how to represent and store the cross-correlation. For > fields with low cardinality, like "gender" and boolean "breast-cancer-or-not" > you can count the prevalence of all the different combinations, but that > doesn't scale. Another often cited example is zip code + street address. > There's clearly a strong correlation between them, but how do you represent > that? we could play the same story with a table storing people including their home country and the color of their skin. obviously we will have more black people in african countries.. > > For scalar values we currently store a histogram. I suppose we could create a > 2D histogram for two columns, but that doesn't actually help with the zip > code + street address problem. > i think we might go for a second relation here specifically for this issue and a boolean flag in the current stats table indicating that additional correlation stats exist (to avoid an additional lookup unless really necessary). do you have a useful syntax in mind? the thing is: this issue can be isolated inside a table (e.g. WHERE a.id = a.id2 AND a.id3 = a.id4) or it might span two tables with an arbitrary number of fields. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] cross column correlation revisited
hello everybody, we are currently facing some serious issues with cross column correlation. consider: 10% of all people have breast cancer. we have 2 genders (50:50). if i select all the men with breast cancer, i will get basically nobody - the planner will overestimate the output. this is the commonly known problem ... this cross correlation problem can be quite nasty in many many cases. underestimated nested loops can turn joins into a never ending nightmare and so on and so on. my idea is the following: what if we allow users to specify cross-column combinations where we keep separate stats? maybe somehow like this ... ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4) or ... ALTER TABLE x SET CORRELATION STATISTICS FOR (x.id = y.id AND x.id2 = y.id2) clearly we cannot store correlation for all combinations of all columns so we somehow have to limit it. what is the general feeling about something like that? many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
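To make the problem tangible, here is a toy version of the breast cancer example (data and numbers are invented): the planner multiplies the two single-column selectivities, roughly 50% * 10% = 5% of the table, although the true overlap is zero.

CREATE TABLE person (
    gender        text,
    breast_cancer boolean
);

-- ~10% of all rows have breast_cancer = true, and all of those are female
INSERT INTO person
SELECT CASE WHEN i % 2 = 0 THEN 'male' ELSE 'female' END,
       (i % 2 = 1 AND i % 5 = 0)
FROM generate_series(1, 100000) AS s(i);

ANALYZE person;

-- estimated at roughly 5000 rows (0.5 * 0.1 of the table), actually returns 0
EXPLAIN ANALYZE SELECT * FROM person WHERE gender = 'male' AND breast_cancer;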
Re: [HACKERS] ECPG FETCH readahead
On Jun 24, 2010, at 2:13 PM, Michael Meskes wrote: >> I think, yes, it does make sense. Because we are talking >> about porting a whole lot of COBOL applications. > > COBOL??? > yes, COBOL :). it is much more common than people think. it is not the first COBOL request for PostgreSQL hitting my desk. in our concrete example we are using a C module written with ECPG which is magically attached to tons of COBOL code ... >> The ESQL/C or ECPG connector was already written >> the Informix quirks in mind, so it fetches only one record >> at a time passing it to the application. And similar performance >> is expected from ECPG - which excpectation is not fulfilled >> currently because libecpg doesn't do the same caching as >> ESQL/C does. > > Eh, you are talking about a program you wrote for your customer or they wrote > themselves, right? I simply refuse to add this stuff only to fix this > situation > for that one customer of yours if it only hits them. Now the thing to discuss > is how common is this situation. > > Michael i think that this cursor issue is a pretty common thing for many codes. people are usually not aware of the fact that network round trips and parsing which are naturally related to "FETCH 1" are a lot more expensive than fetching one row somewhere deep inside the DB engine. out there there are many applications which fetch data row by row. if an app fetches data row by row in PostgreSQL it will be A LOT slower than in, say, Informix because most commercial database clients will cache data inside a cursor behind the scenes to avoid the problem we try to solve. currently we are talking about a performance penalty of factor 5 or so. so - it is not a small thing; it is a big difference. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
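To spell out the round-trip argument in plain SQL terms (cursor and table names are mine): row-by-row code pays one client/server round trip per row, which is what the COBOL/ESQL code ends up doing, while a readahead cache in the ECPG library would pull larger batches behind the scenes and hand them out one at a time to the application.

BEGIN;
DECLARE cur CURSOR FOR SELECT * FROM orders ORDER BY id;

FETCH 1 FROM cur;      -- what row-by-row ESQL/ECPG code effectively issues, once per row
FETCH 100 FROM cur;    -- what a readahead cache would issue behind the scenes

CLOSE cur;
COMMIT;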