Re: [HACKERS] psql feature thought
On May 16, 2006, at 14:17 , Tom Lane wrote: [1] Mostly. There's the infamous continued-string-literal construct... Are you referring to this? est=# SELECT 'foo''bar'; -- invalid ERROR: syntax error at or near "'bar'" at character 17 LINE 1: SELECT 'foo''bar'; ^ test=# SELECT 'foo' test-# 'bar'; -- valid ?column? -- foobar (1 row) I've been exploiting that (mis)feature for line-wrapping comments after seeing it used that way somewhere. Very useful in that case, but very strange, I'll give you that. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Compression and on-disk sorting
> > Given that any time that happens we end up caring much less about CPU > > usage and much more about disk IO, for any of these cases that use > > non-random access, compressing the data before sending it to disk would > > potentially be a sizeable win. > > Note however that what the code thinks is a spill to disk and what > actually involves disk I/O are two different things. If you think > of it as a spill to kernel disk cache then the attraction is a lot > weaker... Yes, that is very true. However it would also increase the probability that spill to disk is not needed, since more data fits in RAM. It would probably need some sort of plugin architecture, since the fastest compression algorithms (LZO) that also reach good ratios are gpl. LZO is proven to increase physical IO write speed with low CPU overhead. Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Compression and on-disk sorting
> > Personally, I believe it would be worth it - but only to a few. And > > these most of these few are likely using Oracle. So, no gain unless > > you can convince them to switch back... :-) > > We do know that the benefit for commercial databases that use raw and > file system storage is that raw storage is only a few percentage > points faster. Imho it is really not comparable because they all use direct or async IO that bypasses the OS buffercache even when using filesystem files for storage. A substantial speed difference is allocation of space for restore (no format of fs and no file allocation needed). I am not saying this to advocate moving in that direction however. I do however think that there is substantial headroom in reducing the number of IO calls and reducing on disk storage requirements. Especially in concurrent load scenarios. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
On Tue, May 16, 2006 at 09:15:13AM +0530, Gurjeet Singh wrote: >I don't think Cristiano is asking for the schema_name in the > EXPLAIN o/p. The request is for the table ALIASes to be shown in the > o/p, which makes more sense than schema_name+table_name, since the > same table can be used in the same query more than once. As has been pointed out, aliases ave always been displayed. The OPs problem was that he was using schema.tablename everywhere and explain didn't distinguish between schema1.mytable and schema2.mytable. It was suggested that he use aliases instead to make it work. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Creating a Pseudocolumn
I think Jonah is referring to the the 'START WITH ... CONNECT BY' clause feature from Oracle. Am I right Jonah? For such queries, Oracle introduces a pseudocolumn LEVEL, that holds the value of the indentation level of the current rusultant row. In Oracle, the LEVEL column returns 0 for the root node of a hierarchy, 1 for it's children, 2 for their children, and so forth. LEVEL is commonly used to indent hierarchical results. LEVEL might not be a part of the standard, but it is very handy when dealing with hierarchical queries. The chapter 1 (http://www.oreilly.com/catalog/sqlpr/chapter/ch01.pdf) of book 'SQL Pocket Guide' elaborates more on it. Gurjeet. On 5/16/06, Tom Lane <[EMAIL PROTECTED]> wrote: Perhaps you should start by explaining what the heck you're talking about ;-). I can find nothing in the SQL2003 spec that sounds anything like a LEVEL function or pseudocolumn. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Wrong plan for simple join with index on FK
Hello I test using index on foreign key. I found situation, when planner choose worse plan. create table f1(pk serial primary key); create table f2(fk integer references f1(pk)); insert into f1 select a from generate_series(1,1) a; insert into f2 select (random()*)::int+1 from generate_series(1,14); vacuum analyze; create index xxx on f2(fk); \timing postgres=> select count(*) from f1 join f2 on pk=fk; count 14 (1 row) Time: 538,254 ms drop index xxx; postgres=> select count(*) from f1 join f2 on pk=fk; count 14 (1 row) Time: 311,580 ms Plans: postgres=> explain select count(*) from f1 join f2 on pk=fk; QUERY PLAN -- Aggregate (cost=7788.00..7788.01 rows=1 width=0) -> Hash Join (cost=170.00..7438.00 rows=14 width=0) Hash Cond: (f2.fk = f1.pk) -> Seq Scan on f2 (cost=0.00..2018.00 rows=14 width=4) -> Hash (cost=145.00..145.00 rows=1 width=4) -> Seq Scan on f1 (cost=0.00..145.00 rows=1 width=4) (6 rows) postgres=> explain select count(*) from f1 join f2 on pk=fk; QUERY PLAN Aggregate (cost=6631.75..6631.76 rows=1 width=0) -> Merge Join (cost=0.00..6281.75 rows=14 width=0) Merge Cond: (f1.pk = f2.fk) -> Index Scan using f1_pkey on f1 (cost=0.00..187.00 rows=1 width=4) -> Index Scan using xxx on f2 (cost=0.00..4319.77 rows=14 width=4) (5 rows) PostgreSQL 8.1, Linux Regards Pavel Stehule _ Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. http://www.msn.cz/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Creating a Pseudocolumn
On Tue, May 16, 2006 at 03:18:19PM +0530, Gurjeet Singh wrote: >LEVEL might not be a part of the standard, but it is very handy > when dealing with hierarchical queries. > > The chapter 1 (http://www.oreilly.com/catalog/sqlpr/chapter/ch01.pdf) > of book 'SQL Pocket Guide' elaborates more on it. Looking at that text, it describes how to generate a LEVEL value using the WITH RECURSIVE method, it seems to me you could apply the same method to what you're doing. Just like how on UPDATE and DELETE queries an invisible "ctid" column is added, LEVEL would be an attribute of the tuple being passed up. I'm assuming that what's actually being implemented is the SQL standard method with the Oracle alternative being another way of specifying the same thing? Have a ncie day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Wrong plan for simple join with index on FK
On Tue, May 16, 2006 at 11:52:05AM +0200, Pavel Stehule wrote: > Hello > > I test using index on foreign key. I found situation, when planner choose > worse plan. Can we seen an EXPLAIN ANALYZE output to see where the miscalculation lies. Is it underestimating the cost of the index scan, or overestimating the cost of the hash join. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Wrong plan for simple join with index on FK
Can we seen an EXPLAIN ANALYZE output to see where the miscalculation lies. Is it underestimating the cost of the index scan, or overestimating the cost of the hash join. postgres=> explain analyze select count(*) from f1 join f2 on pk=fk; QUERY PLAN --- Aggregate (cost=6631.75..6631.76 rows=1 width=0) (actual time=2433.700..2433.703 rows=1 loops=1) -> Merge Join (cost=0.00..6281.75 rows=14 width=0) (actual time=0.055..1916.815 rows=14 loops=1) Merge Cond: (f1.pk = f2.fk) -> Index Scan using f1_pkey on f1 (cost=0.00..187.00 rows=1 width=4) (actual time=0.025..45.635 rows=1 loops=1) -> Index Scan using xxx on f2 (cost=0.00..4319.77 rows=14 width=4) (actual time=0.011..812.661 rows=14 loops=1) Total runtime: 2433.859 ms (6 rows) postgres=> explain analyze select count(*) from f1 join f2 on pk=fk; QUERY PLAN - Aggregate (cost=7788.00..7788.01 rows=1 width=0) (actual time=2216.490..2216.493 rows=1 loops=1) -> Hash Join (cost=170.00..7438.00 rows=14 width=0) (actual time=80.296..1712.505 rows=14 loops=1) Hash Cond: (f2.fk = f1.pk) -> Seq Scan on f2 (cost=0.00..2018.00 rows=14 width=4) (actual time=0.031..493.614 rows=14 loops=1) -> Hash (cost=145.00..145.00 rows=1 width=4) (actual time=80.201..80.201 rows=1 loops=1) -> Seq Scan on f1 (cost=0.00..145.00 rows=1 width=4) (actual time=0.025..37.587 rows=1 loops=1) Total runtime: 2216.730 ms (7 rows) Regards Pavel _ Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci. http://messenger.msn.cz/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
On 5/16/06, Martijn van Oosterhout wrote: On Tue, May 16, 2006 at 09:15:13AM +0530, Gurjeet Singh wrote: As has been pointed out, aliases ave always been displayed. The OPs problem was that he was using schema.tablename everywhere and explain didn't distinguish between schema1.mytable and schema2.mytable. It was suggested that he use aliases instead to make it work. I should subscribe to the -bugs mailing list too I didn't get the complete picture there. I don't think adding a schema_name to EXPLAIN's output, is really a good idea!! The ALIAS feature of the SQL language caters to this very need of assigning unambiguous names to tables. And we should be careful when adding any special code for EXPLAIN [ ANALYZE ]. For an example what would happen if we do that, consider this: If a big JOIN query takes N minutes on MS SQL Server, and I wish to see the plan why it is taking that long, one would expect me to enable 'Show Execution Plan' in the SQL Query Analyzer (similar to PG's EXPLAIN ANALYZE). And when I do that, the query now takes more than 2*N minutes to come back. I assume this extra delay is caused by the overhead of extra processing it does at row-source level (like how many rows passed through this row-source, average row-size, etc.). ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
On Tue, May 16, 2006 at 04:36:36PM +0530, Gurjeet Singh wrote: >If a big JOIN query takes N minutes on MS SQL Server, and I wish > to see the plan why it is taking that long, one would expect me to > enable 'Show Execution Plan' in the SQL Query Analyzer (similar to > PG's EXPLAIN ANALYZE). And when I do that, the query now takes more > than 2*N minutes to come back. I assume this extra delay is caused by > the overhead of extra processing it does at row-source level (like how > many rows passed through this row-source, average row-size, etc.). I posted a patch last week on -patches which should dramatically cut the overhead of EXPLAIN ANALYZE. Perhaps you could try that and report your experience. http://archives.postgresql.org/pgsql-patches/2006-05/msg00158.php BTW, just showing the plan takes no time at all, just use EXPLAIN without the ANALYZE. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Wrong plan for simple join with index on FK
On Tue, May 16, 2006 at 12:54:58PM +0200, Pavel Stehule wrote: > > > >Can we seen an EXPLAIN ANALYZE output to see where the miscalculation > >lies. Is it underestimating the cost of the index scan, or > >overestimating the cost of the hash join. > postgres=> explain analyze select count(*) from f1 join f2 on pk=fk; First query (merge join): Apart from the apparent overestimation of the cost of a full index scan over xxx by about 30%, there seems to be a significant underestimation of the cost of the merge join. > postgres=> explain analyze select count(*) from f1 join f2 on pk=fk; Second query (hash join): Here the estimates seem to be fine, except for an apparent underestimation of the cost of the aggregate. These are all minor abberations though, on the whole the estimates are pretty good. Perhaps you need to tweak the values of random_page_cost and similar variables. Have a ncie day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Wrong plan for simple join with index on FK
These are all minor abberations though, on the whole the estimates are pretty good. Perhaps you need to tweak the values of random_page_cost and similar variables. Thank You, It's general problem or only mine? I have "100%" standard current PC. Pavel _ Najdete si svou lasku a nove pratele na Match.com. http://www.msn.cz/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Wrong plan for simple join with index on FK
> >These are all minor abberations though, on the whole the estimates are > >pretty good. Perhaps you need to tweak the values of random_page_cost > >and similar variables. > > Thank You, It's general problem or only mine? I have "100%" > standard current PC. The default random_page_cost assumes some concurrent activity. If your PC does nothing else concurrently, the performance of a seq scan will be underestimated. Try to do the statement with some concurrent disk load and you will most likely see that the 1. plan is faster. (assuming the tables are not fully cached) Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Wrong plan for simple join with index on FK
> >These are all minor abberations though, on the whole the estimates are > >pretty good. Perhaps you need to tweak the values of random_page_cost > >and similar variables. > > Thank You, It's general problem or only mine? I have "100%" > standard current PC. The default random_page_cost assumes some concurrent activity. If your PC does nothing else concurrently, the performance of a seq scan will be underestimated. Try to do the statement with some concurrent disk load and you will most likely see that the 1. plan is faster. (assuming the tables are not fully cached) Andreas ok. I tested it with pgbench and it's true. With -c 50 merge_join is faster. I didn't expect it. Thank You Pavel _ Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com. http://www.msn.cz/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] psql feature thought
Michael Glaesemann <[EMAIL PROTECTED]> writes: > On May 16, 2006, at 14:17 , Tom Lane wrote: >> [1] Mostly. There's the infamous continued-string-literal >> construct... > Are you referring to this? > ^ > test=# SELECT 'foo' > test-# 'bar'; -- valid Yeah. It seems weird that a newline should be *required* there. I think I understand why the SQL committee did that: they thought SELECT 'a' 'b' was much too likely to be a syntax error and shouldn't be made into a valid construct. But it's weird nonetheless, and awkward to implement. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Creating a Pseudocolumn
Martijn van Oosterhout writes: > I'm assuming that what's actually being implemented is the SQL standard > method with the Oracle alternative being another way of specifying the > same thing? What's being implemented should be the standard. Full stop. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Help!!!!
Hi to all!I am a student in Computer Science, and in Databases' Technologycourse I must to writea report on physical structures of DBMS's. I choosed PostgreSQL; Ilooked the sources, andI have understood the block structure. And relations? And databases?How are structured?Thanks!!! Yahoo! Mail: gratis 1GB per i messaggi, antispam, antivirus, POP3
Re: [HACKERS] Compression and on-disk sorting
> > Compressed-filesystem extension (like e2compr, and I think either > Fat or NTFS) can do that. > Windows (NT/2000/XP) can compress individual directories and files under NTFS; new files in a compressed directory are compressed by default. So if the 'spill-to-disk' all happened in its own specific directory, it would be trivial to mark that directory for compression. I don't know enough Linux/Unix to know if it has similar capabilities. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Creating a Pseudocolumn
On 5/16/06, Martijn van Oosterhout wrote: I'm assuming that what's actually being implemented is the SQL standard method with the Oracle alternative being another way of specifying the same thing? I'm working on both versions of hierarchical queries; CONNECT BY for EnterpriseDB and WITH [RECURSIVE] for PostgreSQL. Currently, they're completely separate implementations, but in a few months I should be able to translate CONNECT BY into actual WITH syntax. Anyway, I'm working on the CONNECT BY version and am looking at how best to implement the level pseudocolumn. In the pseudocolumn respect, this is somewhat similar to the rownum discussion. I've added a system attribute to handle level, and it does work. But, I'm thinking there's going to be a couple gotcha's hidden in there somewhere. -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Help!!!!
On 5/16/06, winlinchu <[EMAIL PROTECTED]> wrote: I must to write a report on physical structures of DBMS's. I choosed PostgreSQL; I looked the sources, and I have understood the block structure. And relations? And databases? How are structured? Is there a question in there? If you're looking for database and relation structures, see the headers. There you'll find all the info you need for the database structures used. Otherwise, see the READMEs in the code or http://www.postgresql.org/docs/8.1/static/internals.html Sorry, but giving you any more info seems like we're doing the work for ya. -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
Gurjeet Singh wrote: > I don't think Cristiano is asking for the schema_name in the > EXPLAIN o/p. In fact I'm requesting for schema_name in EXPLAIN o/p. > The request is for the table ALIASes to be shown in the > o/p, which makes more sense than schema_name+table_name, since the > same table can be used in the same query more than once. SQL table aliases doesn't help locating the real place where the table is. If I have a table named "test" at the schema "place" and I do: "EXPLAIN SELECT * FROM place.test mytest" I will get: "Seq Scan on test mytest" With this output I know that "mytest" is an alias to "test", and that's great, much helpful than aliases only, but, where is "test"? Explain didn't provide me will the location: I know the table name but I don't know where it is. I don't see too much harm if the output was: "Seq Scan on place.test mytest" And that's what I'm asking for. Regards, Cristiano ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Compression and on-disk sorting
Bort, Paul wrote: Compressed-filesystem extension (like e2compr, and I think either Fat or NTFS) can do that. Windows (NT/2000/XP) can compress individual directories and files under NTFS; new files in a compressed directory are compressed by default. So if the 'spill-to-disk' all happened in its own specific directory, it would be trivial to mark that directory for compression. I don't know enough Linux/Unix to know if it has similar capabilities. Or would want to ... I habitually turn off all compression on my Windows boxes, because it's a performance hit in my experience. Disk is cheap ... cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Compression and on-disk sorting
On Tue, 2006-05-16 at 11:53 -0400, Andrew Dunstan wrote: > Bort, Paul wrote: > >> Compressed-filesystem extension (like e2compr, and I think either > >> Fat or NTFS) can do that. > >> > >> > > > > Windows (NT/2000/XP) can compress individual directories and files under > > NTFS; new files in a compressed directory are compressed by default. > > > > So if the 'spill-to-disk' all happened in its own specific directory, it > > would be trivial to mark that directory for compression. > > > > I don't know enough Linux/Unix to know if it has similar capabilities. > Or would want to ... > > I habitually turn off all compression on my Windows boxes, because it's > a performance hit in my experience. Disk is cheap ... Disk storage is cheap. Disk bandwidth or throughput is very expensive. -- ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Compression and on-disk sorting
Rod Taylor wrote: I habitually turn off all compression on my Windows boxes, because it's a performance hit in my experience. Disk is cheap ... Disk storage is cheap. Disk bandwidth or throughput is very expensive. Sure, but in my experience using Windows File System compression is not a win here. Presumably if it were an unqualified win they would have it turned on everywhere. The fact that there's an option is a good indication that it isn't in many cases. It is most commonly used for files like executables that are in effect read-only - but that doesn't help us. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] .pgpass file and unix domain sockets
hackers - any opinions? The biggest downside would be that a pgpass file would be version specific for this feature. The badness of this is somewhat mitigated by the ability we now have to specify an alternative pgpassfile location. cheers andrew Original Message Subject:Re: [PATCHES] .pgpass file and unix domain sockets Date: Tue, 16 May 2006 12:16:53 -0400 From: Tom Lane <[EMAIL PROTECTED]> To: Andrew Dunstan <[EMAIL PROTECTED]> CC: Joachim Wieland <[EMAIL PROTECTED]>, pgsql-patches@postgresql.org References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> Andrew Dunstan <[EMAIL PROTECTED]> writes: Tom Lane wrote: Personally I wouldn't object to making it match "localhost" in all cases. That's what the documentation says, and the use-case for doing something more complicated seems pretty thin. I almost agree. If anything, I'd prefer to provide for an explicit entry covering all Unix Domain sockets - it took me by some surprise to find a while back that "localhost" covers that case - it seems a mismatch with how pg_hba.conf works. Well, that'd break existing .pgpass files (unless we match localhost too, which seems to defeat the purpose). But maybe it's worth doing for consistency's sake. I think we should bring it up on a more widely read list than -patches if you want to propose a non-backwards-compatible change ... ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] .pgpass file and unix domain sockets
Uh, why wouldn't we allow "localhost" to match the default unix domain socket name, as well as an empty hostname? If you specify a non-default location, you then have to specify the full path. --- Andrew Dunstan wrote: > > hackers - any opinions? > > The biggest downside would be that a pgpass file would be version > specific for this feature. The badness of this is somewhat mitigated by > the ability we now have to specify an alternative pgpassfile location. > > cheers > > andrew > > Original Message > Subject: Re: [PATCHES] .pgpass file and unix domain sockets > Date: Tue, 16 May 2006 12:16:53 -0400 > From: Tom Lane <[EMAIL PROTECTED]> > To: Andrew Dunstan <[EMAIL PROTECTED]> > CC: Joachim Wieland <[EMAIL PROTECTED]>, pgsql-patches@postgresql.org > References: <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> > <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> > > > > Andrew Dunstan <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> Personally I wouldn't object to making it match "localhost" in all > >> cases. That's what the documentation says, and the use-case for > >> doing something more complicated seems pretty thin. > > > I almost agree. If anything, I'd prefer to provide for an explicit entry > > covering all Unix Domain sockets - it took me by some surprise to find a > > while back that "localhost" covers that case - it seems a mismatch with > > how pg_hba.conf works. > > Well, that'd break existing .pgpass files (unless we match localhost > too, which seems to defeat the purpose). But maybe it's worth doing > for consistency's sake. I think we should bring it up on a more > widely read list than -patches if you want to propose a > non-backwards-compatible change ... > > > > > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly > -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] audit table containing Select statements submitted
Doug, > But what if the user calls the access function, sees the data, then > issues a ROLLBACK? The audit record would be rolled back as well (as > Tom pointed out earlier). > > You could use dblink to log to a separate audit database, I suppose. Or just write to some other non-transational resource, like a text file. That would require the use of an external "untrusted" PL, though (like PL/PerlU, PL/sh or PL/PythonU) -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 09:24:38AM +0200, Zeugswetter Andreas DCP SD wrote: > > > > Given that any time that happens we end up caring much less about > CPU > > > usage and much more about disk IO, for any of these cases that use > > > non-random access, compressing the data before sending it to disk > would > > > potentially be a sizeable win. > > > > Note however that what the code thinks is a spill to disk and what > > actually involves disk I/O are two different things. If you think > > of it as a spill to kernel disk cache then the attraction is a lot > > weaker... > > Yes, that is very true. However it would also increase the probability > that spill to disk is not needed, since more data fits in RAM. That's a pretty thin margin though, depending on how good the compression is. This also assumes that you have a compression algorithm that supports random access. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 12:27:42PM -0400, Andrew Dunstan wrote: > Rod Taylor wrote: > >>I habitually turn off all compression on my Windows boxes, because it's > >>a performance hit in my experience. Disk is cheap ... > >> > > > >Disk storage is cheap. Disk bandwidth or throughput is very expensive. Hey, that's my line! :P > Sure, but in my experience using Windows File System compression is not > a win here. Presumably if it were an unqualified win they would have it > turned on everywhere. The fact that there's an option is a good > indication that it isn't in many cases. It is most commonly used for > files like executables that are in effect read-only - but that doesn't > help us. The issue with filesystem level compression is that it has to support things like random access, which isn't needed for on-disk sorting (not sure about other things like hashing, etc). In any case, my curiousity is aroused, so I'm currently benchmarking pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do some benchmarks with pg_tmp compressed. Does anyone have time to hack some kind of compression into the on-disk sort code just to get some benchmark numbers? Unfortunately, doing so is beyond my meager C abilitiy... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
On Tue, May 16, 2006 at 04:36:36PM +0530, Gurjeet Singh wrote: >I don't think adding a schema_name to EXPLAIN's output, is really > a good idea!! The ALIAS feature of the SQL language caters to this > very need of assigning unambiguous names to tables. That's assuming that the query was written with aliases. Otherwise you have to go back and manually add them, which can be a royal pain for a large, complex query. >And we should be careful when adding any special code for EXPLAIN > [ ANALYZE ]. For an example what would happen if we do that, consider > this: > >If a big JOIN query takes N minutes on MS SQL Server, and I wish > to see the plan why it is taking that long, one would expect me to > enable 'Show Execution Plan' in the SQL Query Analyzer (similar to > PG's EXPLAIN ANALYZE). And when I do that, the query now takes more > than 2*N minutes to come back. I assume this extra delay is caused by > the overhead of extra processing it does at row-source level (like how > many rows passed through this row-source, average row-size, etc.). How does that have anything to do with adding query names to EXPLAIN output?? The only argument against this that makes any sense to me is that EXPLAIN is plenty verbose as it is, and we don't need to be making it worse. But that's a non-issue if showing the schema names is optional. One way to address this would be to add a verbosity level to EXPLAIN. Right now, EXPLAIN VERBOSE is pretty useless to users, but there is additional information that would be handy to get from explain at different levels: "side effect" timing, such as time spent in triggers, constraints, etc. This is there in HEAD for triggers. Information about what other plans were considered. More explicit naming information. Information about statements that ran inside a function (ie: EXPLAIN SELECT function_name() is pretty useless today). Having a means to specify a verbosity level would allow for adding these kind of features without needlessly cluttering up a run-of-the-mill EXPLAIN. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
I agree... VERBOSE option can be made parameterised to include additional information in the EXPLAIN's output. I also agree that adding the schema name wouldn't add any overhead, and I support Tom's suggestion: 'Possibly a reasonable compromise would be for EXPLAIN to act like rule reverse-listing does,' But one should be wary of adding any other option that itself might cause an overhead, especially when doing the ANALYZE. For example, from the ones you suggested, 'showing other plans considered by the optimizer' seems a bit of an overhead. As the number of JOINed tables increase, so does the number of join permutations, and trying to keep the plans (in any form) till we send the results to client, would block-up considerable amount of resources. On the other hand, we can add these options and keep a note in docs saying that the presence of these particular parameters (to VERBOSE) will affect performance, and if used in conjunction with ANALYZE, ANALYZE might not give you the correct picture! On 5/16/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: How does that have anything to do with adding query names to EXPLAIN output?? The only argument against this that makes any sense to me is that EXPLAIN is plenty verbose as it is, and we don't need to be making it worse. But that's a non-issue if showing the schema names is optional. One way to address this would be to add a verbosity level to EXPLAIN. Right now, EXPLAIN VERBOSE is pretty useless to users, but there is additional information that would be handy to get from explain at different levels: ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [BUGS] BUG #2429: Explain does not report object's schema
Cristiano Duarte wrote: > SQL table aliases doesn't help locating the real place where the table is. > If I have a table named "test" at the schema "place" and I do: > > "EXPLAIN SELECT * FROM place.test mytest" > > I will get: > > "Seq Scan on test mytest" > > With this output I know that "mytest" is an alias to "test", and that's > great, much helpful than aliases only, but, where is "test"? Since you created the mytest alias, you sure know where it's pointing to. In fact I'd argue that this should instead display Seq Scan on mytest > I don't see too much harm if the output was: > > "Seq Scan on place.test mytest" Not much harm there, but there will be plenty harm on other node types where the output is already too wide. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] psql feature thought
Joshua D. Drake wrote: > Hello, > > I was dinking around wand came across something that may (or may not be > useful). > > What if single line statements that were seperated by ; within psql were > implicitly within a transaction? > > E.g; > > postgres=# select * from foo; update foo set bar = 'baz'; delete from bing; > > Would be a single transaction ? The begin/commit would be implicit. > > Joshua D. Drake > > > In psql (and in bash, etc) I like to type multiple commands separated by ';' if I want them all to run but I don't want to wait for the first to complete before typing the second, third, etc. (Possibly I would rather go get a coffee while the commands are running.) So, I might type: psql# ; and let it fly. It would violate the principle of least surprise if doing this on one line caused the semantics to change. mark ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: > In any case, my curiousity is aroused, so I'm currently benchmarking > pgbench on both a compressed and uncompressed $PGDATA/base. I'll also do > some benchmarks with pg_tmp compressed. Results: http://jim.nasby.net/bench.log As expected, compressing $PGDATA/base was a loss. But compressing pgsql_tmp and then doing some disk-based sorts did show an improvement, from 366.1 seconds to 317.3 seconds, an improvement of 13.3%. This is on a Windows XP laptop (Dell Latitude D600) with 512MB, so it's somewhat of a worst-case scenario. On the other hand, XP's compression algorithm appears to be pretty aggressive, as it cut the size of the on-disk sort file from almost 700MB to 82MB. There's probably gains to be had from a different compression algorithm. > Does anyone have time to hack some kind of compression into the on-disk > sort code just to get some benchmark numbers? Unfortunately, doing so is > beyond my meager C abilitiy... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: > Does anyone have time to hack some kind of compression into the on-disk > sort code just to get some benchmark numbers? Unfortunately, doing so is > beyond my meager C abilitiy... I had a look at this. At first glance it doesn't seem too hard, except the whole logtape process kinda gets in the way. If it wern't for the mark/restore it'd be trivial. Might take a stab at it some time, if I can think of a way to handle the seeking... -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 11:46:15PM +0200, Martijn van Oosterhout wrote: > On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: > > Does anyone have time to hack some kind of compression into the on-disk > > sort code just to get some benchmark numbers? Unfortunately, doing so is > > beyond my meager C abilitiy... > > I had a look at this. At first glance it doesn't seem too hard, except > the whole logtape process kinda gets in the way. If it wern't for the > mark/restore it'd be trivial. Might take a stab at it some time, if I > can think of a way to handle the seeking... Oh, do we need to randomly seek? Is that how we switch from one tape to another? It might be easier to switch to giving each tape it's own file... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 04:50:22PM -0500, Jim C. Nasby wrote: > > I had a look at this. At first glance it doesn't seem too hard, except > > the whole logtape process kinda gets in the way. If it wern't for the > > mark/restore it'd be trivial. Might take a stab at it some time, if I > > can think of a way to handle the seeking... > > Oh, do we need to randomly seek? Is that how we switch from one tape to > another? Not seek, mark/restore. As the code describes, sometimes you go back a tuple. The primary reason I think is for the final pass, a merge sort might read the tuples multiple times, so it needs to support it there. > It might be easier to switch to giving each tape it's own file... I don't think it would make much difference. OTOH, if this turns out to be a win, the tuplestore could have the same optimisation. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Compression and on-disk sorting
Martijn van Oosterhout writes: > Not seek, mark/restore. As the code describes, sometimes you go back a > tuple. The primary reason I think is for the final pass, a merge sort > might read the tuples multiple times, so it needs to support it there. However it'd be possible to tell logtape in advance whether a particular tape needs to support that, and only apply compression when not; it would work all the time for intermediate merge passes, and with the recent executor changes to pass down you-need-to-support-mark flags, it'd work for the output pass in a lot of cases too. If you're just trying to get some quick and dirty numbers: do compression, replace Seek/Tell with PANICs, and only test on plain sorts no joins ;-) regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Compression and on-disk sorting
Martijn van Oosterhout writes: > > It might be easier to switch to giving each tape it's own file... > > I don't think it would make much difference. OTOH, if this turns out to > be a win, the tuplestore could have the same optimisation. Would giving each tape its own file make it easier to allow multiple temporary sort areas and allow optimizing to avoid seeking when multiple spindles area available? -- greg ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] PL/pgSQL 'i = i + 1' Syntax
Hellow PostgreSQL hackers, Quick question. Why does the 'i = i + 1' syntax work in this PL/pgSQL function? try=# CREATE OR REPLACE FUNCTION inc_by_two( try(#upfrom int, try(#upto int try(# ) RETURNS SETOF INT AS $$ try$# BEGIN try$# FOR i IN upfrom..upto LOOP try$# RETURN NEXT i; try$# i = i + 1; try$# END LOOP; try$# END; try$# $$ LANGUAGE 'plpgsql'; CREATE FUNCTION try=# select * from inc_by_two(1,10); inc_by_two 1 3 5 7 9 (5 rows) Someone posted a PL/pgSQL function in my blog with this syntax, which is how I know about it, but I couldn't find it documented anywhere. Is it a special exception for loop variables, perhaps? Thanks, David ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
David Wheeler wrote: Hellow PostgreSQL hackers, Quick question. Why does the 'i = i + 1' syntax work in this PL/pgSQL function? try=# CREATE OR REPLACE FUNCTION inc_by_two( try(#upfrom int, try(#upto int try(# ) RETURNS SETOF INT AS $$ try$# BEGIN try$# FOR i IN upfrom..upto LOOP try$# RETURN NEXT i; try$# i = i + 1; try$# END LOOP; try$# END; try$# $$ LANGUAGE 'plpgsql'; CREATE FUNCTION try=# select * from inc_by_two(1,10); inc_by_two 1 3 5 7 9 (5 rows) Someone posted a PL/pgSQL function in my blog with this syntax, which is how I know about it, but I couldn't find it documented anywhere. Is it a special exception for loop variables, perhaps? It ought to be illegal to modify the loop control variable anyway, IMNSHO - it certainly is in Ada, the distant antecedent of pl/pgsql. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
David Wheeler wrote: > Hellow PostgreSQL hackers, > > Quick question. Why does the 'i = i + 1' syntax work in this PL/pgSQL > function? > > try=# CREATE OR REPLACE FUNCTION inc_by_two( > try(#upfrom int, > try(#upto int > try(# ) RETURNS SETOF INT AS $$ > try$# BEGIN > try$# FOR i IN upfrom..upto LOOP > try$# RETURN NEXT i; > try$# i = i + 1; > try$# END LOOP; > try$# END; > try$# $$ LANGUAGE 'plpgsql'; > CREATE FUNCTION > try=# select * from inc_by_two(1,10); > inc_by_two > > 1 > 3 > 5 > 7 > 9 > (5 rows) > > Someone posted a PL/pgSQL function in my blog with this syntax, which > is how I know about it, but I couldn't find it documented anywhere. Is > it a special exception for loop variables, perhaps? > > Thanks, > > David > > ---(end of broadcast)--- > TIP 2: Don't 'kill -9' the postmaster > The syntax for assignment is: i := i + 1 what you are doing is merely comparison. Since you are not using the results of the comparison, it is a no-op. mark ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 16:30, Andrew Dunstan wrote: It ought to be illegal to modify the loop control variable anyway, IMNSHO - it certainly is in Ada, the distant antecedent of pl/pgsql. I agree, but I must say that it's incredibly useful to be able to increment by two as I go through a loop: FOR i IN 1 + offset .. 11 + offset LOOP total := total + substring(ean, i, 1)::INTEGER; i = i + 1; END LOOP; Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
Mark Dilger wrote: > David Wheeler wrote: > >>Hellow PostgreSQL hackers, >> >>Quick question. Why does the 'i = i + 1' syntax work in this PL/pgSQL >>function? >> >>try=# CREATE OR REPLACE FUNCTION inc_by_two( >>try(#upfrom int, >>try(#upto int >>try(# ) RETURNS SETOF INT AS $$ >>try$# BEGIN >>try$# FOR i IN upfrom..upto LOOP >>try$# RETURN NEXT i; >>try$# i = i + 1; >>try$# END LOOP; >>try$# END; >>try$# $$ LANGUAGE 'plpgsql'; >>CREATE FUNCTION >>try=# select * from inc_by_two(1,10); >>inc_by_two >> >> 1 >> 3 >> 5 >> 7 >> 9 >>(5 rows) >> >>Someone posted a PL/pgSQL function in my blog with this syntax, which >>is how I know about it, but I couldn't find it documented anywhere. Is >>it a special exception for loop variables, perhaps? >> >>Thanks, >> >>David >> >>---(end of broadcast)--- >>TIP 2: Don't 'kill -9' the postmaster >> > > > The syntax for assignment is: > > i := i + 1 > > what you are doing is merely comparison. Since you are not using the results > of > the comparison, it is a no-op. > > mark So I don't know why it works for you. I wrote the following, and it also increments the variable: CREATE OR REPLACE FUNCTION weird () RETURNS SETOF INT AS $$ DECLARE i integer; BEGIN i := 0; return next i; i = i + 1; return next i; i = i + 1; return next i; return; END; $$ LANGUAGE plpgsql; So I don't think it has anything to do with loop variables, specifically. mark ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 16:42, Mark Dilger wrote: So I don't know why it works for you. I wrote the following, and it also increments the variable: CREATE OR REPLACE FUNCTION weird () RETURNS SETOF INT AS $$ DECLARE i integer; BEGIN i := 0; return next i; i = i + 1; return next i; i = i + 1; return next i; return; END; $$ LANGUAGE plpgsql; So I don't think it has anything to do with loop variables, specifically. Indeed. It appears that, contrary to what I previously thought, := also works: CREATE OR REPLACE FUNCTION inc_by_two( upfrom int, upto int ) RETURNS SETOF INT AS $$ BEGIN FOR i IN upfrom..upto LOOP RETURN NEXT i; i := i + 1; END LOOP; END; $$ LANGUAGE 'plpgsql'; try=# select * from inc_by_two(1,11); inc_by_two 1 3 5 7 9 11 (6 rows) Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 16:53, Mark Dilger wrote: Sorry, I meant to say that it should only be a no-op according to the language specification, as I understand it. The fact that it works suggests to me that the implementation of PL/pgsql has been modified (or broken?). Whether the change is a bug or an intentional feature change, I don't know. Ah, yes, I agree, and that's why I posted my query. mark P.S., Looking through the documentation, I don't immediately see the spec for how a regular '=' is supposed to work, but assignment is documented as using the ':=' syntax and says nothing about '='. Exactly. But I use = all the time for comparision: IF FOUND = TRUE THEN END IF Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
David Wheeler wrote: > On May 16, 2006, at 16:53, Mark Dilger wrote: > >> Sorry, I meant to say that it should only be a no-op according to the >> language >> specification, as I understand it. The fact that it works suggests >> to me that >> the implementation of PL/pgsql has been modified (or broken?). >> Whether the >> change is a bug or an intentional feature change, I don't know. > > > Ah, yes, I agree, and that's why I posted my query. > >> mark >> >> P.S., Looking through the documentation, I don't immediately see the >> spec for >> how a regular '=' is supposed to work, but assignment is documented >> as using the >> ':=' syntax and says nothing about '='. > > > Exactly. But I use = all the time for comparision: > > IF FOUND = TRUE THEN > > END IF > > Best, > > David It seems this has been answered before, by Tom Lane: http://archives.postgresql.org/pgsql-novice/2006-04/msg00138.php ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 17:02, Mark Dilger wrote: It seems this has been answered before, by Tom Lane: http://archives.postgresql.org/pgsql-novice/2006-04/msg00138.php Ah, cool, then it *is* known. Thanks, David ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Compression and on-disk sorting
On Tue, May 16, 2006 at 12:31:07PM -0500, Jim C. Nasby wrote: > On Tue, May 16, 2006 at 12:27:42PM -0400, Andrew Dunstan wrote: > > Rod Taylor wrote: > > >>I habitually turn off all compression on my Windows boxes, because it's > > >>a performance hit in my experience. Disk is cheap ... > > > > > >Disk storage is cheap. Disk bandwidth or throughput is very expensive. > > Sure, but in my experience using Windows File System compression is not > > a win here. Presumably if it were an unqualified win they would have it > Does anyone have time to hack some kind of compression into the on-disk > sort code just to get some benchmark numbers? Unfortunately, doing so is > beyond my meager C abilitiy... Folks, first of all, I'm in no way an expert on data compression in RDBMSs, but other databases DO include data compression features and claim it as a SIGNIFICANT win in I/O reduction. Looking at performance of the Windows File System compression, etc., doesn't make too much sense when there are actual RDBMS compression implementations to compare to, on the commerical market, in open source code, and in the academic literature. Oracle has included "table compression" since 9iR2. They report table size reductions of 2x to 4x as typical, with proportional reductions in I/O, and supposedly, usually low to negligible overhead for writes: http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14211/build_db.htm#sthref289 Decision Speed: Table Compression In Action by Meikel Poess and Hermann Baer (2003): http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/poess_tablecomp.html Compressing Data for Space and Speed by Sanjay Mishra (2004): http://www.oracle.com/technology/oramag/oracle/04-mar/o24tech_data.html Order For Maximum Compression: http://oramossoracle.blogspot.com/2005/11/table-compression-order-for-maximum.html I don't remember whether the current (Open Source) MonetDB includes table compression or not, but they've published papers with lots of interesting detail on the compression and other high performance OLAP features in their latest (not released) "X100" MoneyDB research codebase: http://monetdb.cwi.nl/ http://homepages.cwi.nl/~mk/MonetDB/ http://sourceforge.net/projects/monetdb/ ftp://ftp.research.microsoft.com/pub/debull/A05june/issue1.htm Now, the docs and papers above are all focused on query performance, they say nothing directly about using using compression for on-disk sorts. But, I'd bet that similar rules of thumb will apply in both cases. The main tricks seem to be: One, EXTREMELY lightweight compression schemes - basically table lookups designed to be as cpu friendly as posible. Two, keep the data compressed in RAM as well so that you can also cache more of the data, and indeed keep it the compressed until as late in the CPU processing pipeline as possible. A corrolary of that is forget compression schemes like gzip - it reduces data size nicely but is far too slow on the cpu to be particularly useful in improving overall throughput rates. Note, I have not really tested ANY of the above myself, your mileage may well vary from what I recall from those various articles... -- Andrew Piskorski <[EMAIL PROTECTED]> http://www.piskorski.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On 5/16/06, David Wheeler <[EMAIL PROTECTED]> wrote: On May 16, 2006, at 16:30, Andrew Dunstan wrote: > It ought to be illegal to modify the loop control variable anyway, > IMNSHO - it certainly is in Ada, the distant antecedent of pl/pgsql. I agree, but I must say that it's incredibly useful to be able to increment by two as I go through a loop: FOR i IN 1 + offset .. 11 + offset LOOP total := total + substring(ean, i, 1)::INTEGER; i = i + 1; END LOOP; Best, David i have posted a patch to add a BY clause to the for statement (integer version), with the BY clause you can specify an increment value... it's in the unapplied patches list waiting for review... http://candle.pha.pa.us/mhonarc/patches/msg3.html -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 17:51, Jaime Casanova wrote: i have posted a patch to add a BY clause to the for statement (integer version), with the BY clause you can specify an increment value... it's in the unapplied patches list waiting for review... http://candle.pha.pa.us/mhonarc/patches/msg3.html Ah, /me likes! Any chance that'll get in for 8.2? Best, David ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_dump and backslash escapes
I have seen no reply to my suggestion below, so I assume it is the way people want to go for 7.3, 7.4, and 8.0. --- Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian writes: > > > -- Set escape environment for possible loading into version >= 8.2. > > > -- If variables are not supported, suppress error messages. > > > SET client_min_messages = panic; > > > SET log_min_messages = log; > > > SET log_min_error_statement = panic; > > > SET escape_string_warning = off; > > > SET standard_conforming_strings = off; > > > RESET log_min_error_statement; > > > RESET log_min_messages; > > > RESET client_min_messages; > > > > Thrashing about with the message level settings like that is useless. > > Either the command will work or it won't. And we've not bothered to > > try to suppress warnings for any of the other SET commands pg_dump > > issues. AFAICS all you've accomplished here is to make the dump > > Well, the issue with back branches is there is no way to know if the > dump is going to be loaded into the same back-branch, or a newer one, > so I figured we would have to suppress any SET because in the existing > branch, it would fail. > > We have discussed the idea of coding the PostgreSQL version number in > the dump somehow so the backend could adjust its behavior based on that > version. For example, you do SET pg_dump_version = 7.3 and sql standard > strings and backslash warnings are turned off. That could be > backpatched, I guess. > > > dependent on even more GUC variables than it needs to be (consider > > what will happen if we remove/redefine the log level variables in > > future). > > True. > > > I don't particularly like the way that pg_dump is behaving at the > > moment, ie cluttering the output with E'' strings. That makes it > > Well, it should only do that if there is a backslash in the string. I > tested a CHECK string and that is how it behaved. > > > unnecessarily hard to use the output to load into other databases > > or older PG versions. What I'd like to do is SET > > True. > > > standard_conforming_strings appropriately (this probably has to be > > a command line option, since it'll depend on where you want to use > > the output) and then not use E'' strings at all. > > Yes, we could do that, but are you saying it is a pg_dump flag, and then > you have to make sure you load into something that has the same > behavior? That seems quite error-prone. Having the SET in the dump > seems easier, and it would eliminate the need for E'' in the pg_dump > file. > > What if we do something like SET NO VALIDATOR var='val' and if the SET is > not understood, no error is generated? > > -- > Bruce Momjian http://candle.pha.pa.us > EnterpriseDBhttp://www.enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly > -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
Andrew Dunstan <[EMAIL PROTECTED]> writes: > It ought to be illegal to modify the loop control variable anyway, > IMNSHO - it certainly is in Ada, the distant antecedent of pl/pgsql. Distant ancestors aren't particularly relevant here. What plpgsql tries to be is a ripoff^H^H^H^H^H^Hsincere flattery of Oracle's PL/SQL. If modifying the loop variable is disallowed in PL/SQL, I'm all for disallowing it in plpgsql, otherwise not. Anyone have a recent copy of Oracle to try it on? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] audit table containing Select statements submitted
Do we have any plans of introducing 'AUTONOMOUS TRANSACTION' like feature? Again, it might not be a part of the standard but it is very helpful in situations like these!!! You can run a trigger with an autonomous transaction attached to it, which guarantees that the work done by trigger persists even though the calling transaction rolls back (potentially a hacker trying to cover his tracks)!!! (http://asktom.oracle.com/~tkyte/autonomous/index.html) Gurjeet. On 5/16/06, Josh Berkus wrote: Doug, > But what if the user calls the access function, sees the data, then > issues a ROLLBACK? The audit record would be rolled back as well (as > Tom pointed out earlier). ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 19:52, Tom Lane wrote: Distant ancestors aren't particularly relevant here. What plpgsql tries to be is a ripoff^H^H^H^H^H^Hsincere flattery of Oracle's PL/SQL. If modifying the loop variable is disallowed in PL/SQL, I'm all for disallowing it in plpgsql, otherwise not. Even if PL/SQL disallows it, why would you not allow it in PL/pgSQL? So that it's easier to migrate from PostgreSQL to Oracle? If you only care about Oracle to PostgreSQL (and who wouldn't?), then it in fact seems desirable for PL/pgSQL to be a superset of PL/SQL. But that's must MYH. Best, David ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_dump and backslash escapes
Bruce Momjian writes: > I have seen no reply to my suggestion below, so I assume it is the way > people want to go for 7.3, 7.4, and 8.0. I'm not particularly for it, if that's what you meant, and certainly not for hacking up old branches that way. For one thing, you can't retroactively cause servers that are already out there to not spit errors for GUC variables they've not heard of; and even if you had such a time-travel machine at hand, it's far from clear that it'd be a good idea. The pg_dump philosophy for cross-version updates is generally that the dump should load if you are willing to ignore errors and press on. Not that there will never be errors. See for example our previous handling of the without_oids business, or search_path, or tablespaces. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] audit table containing Select statements submitted
"Gurjeet Singh" <[EMAIL PROTECTED]> writes: > Do we have any plans of introducing 'AUTONOMOUS TRANSACTION' like > feature? No. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Compression and on-disk sorting
Andrew Piskorski <[EMAIL PROTECTED]> writes: > The main tricks seem to be: One, EXTREMELY lightweight compression > schemes - basically table lookups designed to be as cpu friendly as > posible. Two, keep the data compressed in RAM as well so that you can > also cache more of the data, and indeed keep it the compressed until > as late in the CPU processing pipeline as possible. > > A corrolary of that is forget compression schemes like gzip - it > reduces data size nicely but is far too slow on the cpu to be > particularly useful in improving overall throughput rates. There are some very fast decompression algorithms: http://www.oberhumer.com/opensource/lzo/ I think most of the mileage from "lookup tables" would be better implemented at a higher level by giving tools to data modellers that let them achieve denser data representations. Things like convenient enum data types, 1-bit boolean data types, short integer data types, etc. -- greg ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
David Wheeler <[EMAIL PROTECTED]> writes: > Even if PL/SQL disallows it, why would you not allow it in PL/pgSQL? Um ... design sanity and consistency, maybe? Not that I think PL/SQL is any paragon of those virtues, but surely "we'll invent any feature we feel like whether it's sane or not" is not a recipe for a maintainable language. (No, I'm not particularly in favor of the BY feature mentioned upthread, either.) regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_dump and backslash escapes
Tom Lane wrote: > Bruce Momjian writes: > > I have seen no reply to my suggestion below, so I assume it is the way > > people want to go for 7.3, 7.4, and 8.0. > > I'm not particularly for it, if that's what you meant, and certainly not > for hacking up old branches that way. For one thing, you can't > retroactively cause servers that are already out there to not spit > errors for GUC variables they've not heard of; and even if you had such > a time-travel machine at hand, it's far from clear that it'd be a good > idea. > > The pg_dump philosophy for cross-version updates is generally that the > dump should load if you are willing to ignore errors and press on. Not > that there will never be errors. See for example our previous handling > of the without_oids business, or search_path, or tablespaces. So, we should SET the variables and allow people to get the errors on load? And not supress them from the client and server logs? Is that better than suppressing them? -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
On May 16, 2006, at 20:51, Tom Lane wrote: Um ... design sanity and consistency, maybe? Not that I think PL/SQL is any paragon of those virtues, but surely "we'll invent any feature we feel like whether it's sane or not" is not a recipe for a maintainable language. Yes, sanity is important, I agree. (No, I'm not particularly in favor of the BY feature mentioned upthread, either.) Pity. I thought it was a good idea. Best, David ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Compression and on-disk sorting
Greg Stark <[EMAIL PROTECTED]> writes: > Andrew Piskorski <[EMAIL PROTECTED]> writes: >> A corrolary of that is forget compression schemes like gzip - it >> reduces data size nicely but is far too slow on the cpu to be >> particularly useful in improving overall throughput rates. > There are some very fast decompression algorithms: AFAICS the only sane choice here is to use src/backend/utils/adt/pg_lzcompress.c, on the grounds that (1) it's already in the backend, and (2) data compression in general is such a minefield of patents that we'd be foolish to expose ourselves in more than one direction. Certainly, if you can't prototype a convincing performance win using that algorithm, it's unlikely to be worth anyone's time to look harder. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
(No, I'm not particularly in favor of the BY feature mentioned upthread, either.) regards, tom lane mmm... and why is that? i mean, many languages have some way to increment the for variable by different values... call it STEP, BY or even i+=number -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax
"Jaime Casanova" <[EMAIL PROTECTED]> writes: >> (No, I'm not particularly in favor of the BY feature mentioned upthread, >> either.) > mmm... and why is that? Essentially because it's not in the upstream language. Oracle could come out with the same feature next week, only they use STEP or some other syntax for it, and then we'd have a mess on our hands. If the feature were sufficiently compelling use-wise then I'd be willing to risk that, but it doesn't seem to me to be more than a marginal notational improvement. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] audit table containing Select statements submitted
Some users of PL/Java make use of a non-default connection from within a Trigger in order to do this. In essence, they load the client JDBC package into the backend to let the backend as such become a client. The second connection is then maintained for the lifetime of the first. Perhaps not the most efficient way of doing it but it works. Regards, Thomas Hallgren Gurjeet Singh wrote: Do we have any plans of introducing 'AUTONOMOUS TRANSACTION' like feature? Again, it might not be a part of the standard but it is very helpful in situations like these!!! You can run a trigger with an autonomous transaction attached to it, which guarantees that the work done by trigger persists even though the calling transaction rolls back (potentially a hacker trying to cover his tracks)!!! (http://asktom.oracle.com/~tkyte/autonomous/index.html) Gurjeet. On 5/16/06, Josh Berkus wrote: Doug, > But what if the user calls the access function, sees the data, then > issues a ROLLBACK? The audit record would be rolled back as well (as > Tom pointed out earlier). ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] audit table containing Select statements submitted
Thomas Hallgren <[EMAIL PROTECTED]> writes: > Some users of PL/Java make use of a non-default connection from within a > Trigger in order to do this. In essence, they load the client JDBC package > into > the backend to let the backend as such become a client. The second connection > is then maintained for the lifetime of the first. Perhaps not the most > efficient way of doing it but it works. And you can do the same thing with any of the PL languages that have database drivers like Perl or Python. It might be a little less inefficient using one of them -- and probably a lot less code. You should be aware of the risk of deadlocks if you touch the same resources. Because the database is unaware that your main transaction is waiting for this other session to complete it won't be able to detect any deadlocks that depend on this hidden dependency. -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match