Re: [HACKERS] HOT and INSERT/DELETE
On 8/9/07, Bruce Momjian [EMAIL PROTECTED] wrote: I was wondering about HOT behavior. Will an INSERT reuse rows no longer visible caused by an UPDATE or DELETE, or if an UPDATE will reuse rows expired by an invisible DELETE? It seems both of these would be possible and useful. As the patch stands today, we don't take any pains to update the FSM information after pruning and defraging the page. So FSM would not consider the page for either INSERT or COLD UPDATE. Of course, the page can still be used for INSERT if relation-rd_targetblock is somehow set to this page. OTOH UPDATE will always reuse the dead space of either expired updated rows or deleted rows or even aborted rows. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] GUC for default heap fillfactor
On 8/8/07, Simon Riggs [EMAIL PROTECTED] wrote: I see why you think that, but it is really just the first update that is cold. All further updates on that block will be able to re-use the dead row left by the first update. If they can't they will spread out to other blocks where they will eventually have sufficiently reduced contention to allow hot updates. That is mostly true, but may not work if the tuple length changes with each update. The space freed up the previous update may not be enough to accommodate the new tuple. Also we should take the unusable space left by redirected line pointer. My take is that it would do nothing for longer term performance on heavily updated tables and have a negative effect on tables that are seldom updated. Overall, thats a loss, for me. OK. I wish to have a config parameter so that DBA can set the system level default and then override that (if required) for each table. It will also be useful for experimenting with different fillfactor. The only way today is either to recompile your sources or change the parameter in every CREATE TABLE statement. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] HOT and INSERT/DELETE
Pavan Deolasee wrote: On 8/9/07, Bruce Momjian [EMAIL PROTECTED] wrote: I was wondering about HOT behavior. Will an INSERT reuse rows no longer visible caused by an UPDATE or DELETE, or if an UPDATE will reuse rows expired by an invisible DELETE? It seems both of these would be possible and useful. As the patch stands today, we don't take any pains to update the FSM information after pruning and defraging the page. So FSM would not consider the page for either INSERT or COLD UPDATE. Of course, the page can still be used for INSERT if relation-rd_targetblock is somehow set to this page. OTOH UPDATE will always reuse the dead space of either expired updated rows or deleted rows or even aborted rows. Well, that is very good news. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] default_text_search_config and expression indexes
Oleg Bartunov wrote: Yea, seems more work than is necessary. If we require the configuration to be always supplied, and document that mismatches are a problem, I think we are in good shape. We should agree that all you describe is only for DUMMY users. From authors point of view I dislike your approach to treat text searching as a very limited tool. But I understand that we should preserve people from stupid errors. I want for beginners easy setup and error-prone functionality, but leaving experienced users to develop complex search engines. Can we have separate safe interface for text searching and explicitly recommend it for beginners ? I am glad we are moving this interface discussion forward. It seems Heikki has similar concerns about the interface being error-prone. It would be nice to have a novice and advanced interface, but we would have to document both, and then that is going to be confusing for users. As I see it, specifying the configuration name in every function call is the novice interface, and avoids the most common errors. I can see defaulting the interface name as being an advanced interface, but I don't think it has enough of a feature to be worth documenting and implementing. If we figure out something better in 8.4 we can implement it, but at this point I can't think of any good solution to not specifying the configuration name every time. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT patch, missing things
On 8/8/07, Simon Riggs [EMAIL PROTECTED] wrote: So pruning removes dead hot updated tuples, while defragging will remove dead cold updated tuples and deletes, as well as rearranging space. No, pruning removes all dead tuples, irrespective of whether they are HOT or COLD updated and whether they are heap-only or not. It handles line pointer redirection and marks all dead tuples as ~LP_USED. Defragging just repairs the fragmentation by rearranging LP_USED tuples. It's easily possible that 100% of the cold updates and deletes are removed by defragging because of HOT updates. It would be pointless to trigger a VACUUM when it might find nothing to clear up. Its also possible that the deletes are at one end of the table and the updates at the other. So we really need to keep track of the effects of defragging dead cold updates and deletes, so they can be subtracted from the cold update + deletes. The argument for including HOT updates is weaker, but I'm still thinking that they should eventually cause a VACUUM to take place. A very slowly updated table with spread out updates might bloat a table to 200 times its size, even with HOT, in the worst case. Perhaps we should count each HOT update as 1/5th of a row for autovacuum purposes, or some other discounting factor? I can't see how a table might bloat to 200 times its size assuming only HOT updates and without long running transactions. And even if the table is really bloating that way, it must be because of COLD updates and they are discounted in triggering autovac. Am I missing something ? Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] GUC for default heap fillfactor
On Thu, 2007-08-09 at 12:17 +0530, Pavan Deolasee wrote: OK. I wish to have a config parameter so that DBA can set the system level default and then override that (if required) for each table. It will also be useful for experimenting with different fillfactor. The only way today is either to recompile your sources or change the parameter in every CREATE TABLE statement. Why not just put an extra file into your test script that issues ALTER TABLE statements on appropriate tables? If you run this between CREATE and COPY you won't have to edit a thing. You'll be able to more easily control the tables you want to set, so you'll be in less danger of having a gain on one table wiped out by a loss on another table. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] HOT patch, missing things
On Thu, 2007-08-09 at 12:39 +0530, Pavan Deolasee wrote: No, pruning removes all dead tuples, irrespective of whether they are HOT or COLD updated and whether they are heap-only or not. It handles line pointer redirection and marks all dead tuples as ~LP_USED. Defragging just repairs the fragmentation by rearranging LP_USED tuples. It's easily possible that 100% of the cold updates and deletes are removed by defragging because of HOT updates. It would be pointless to trigger a VACUUM when it might find nothing to clear up. Its also possible that the deletes are at one end of the table and the updates at the other. So we really need to keep track of the effects of defragging dead cold updates and deletes, so they can be subtracted from the cold update + deletes. Whether I got the exact details of frugging depruning correct or not: if a tuple version is removed, then VACUUM doesn't need to remove it later, so any non-VACUUM removal of rows must defer a VACUUM. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Unexpected VACUUM FULL failure
On Wed, 2007-08-08 at 23:23 -0400, Tom Lane wrote: I wrote: ... Since we've whacked the tqual.c logic around recently, the problem might actually lie there... In fact, I bet this is a result of the async-commit patch. The places where vacuum.c bleats HEAP_MOVED_OFF was expected are all places where it is looking at a tuple not marked XMIN_COMMITTED; it expects that after its first pass over the table, *every* tuple is either XMIN_COMMITTED or one that it moved. Async commit changed tqual.c so that tuples that are in fact known committed might not get marked XMIN_COMMITTED right away. The patch tries to prevent this from happening within VACUUM FULL by means of /* * VACUUM FULL assumes that all tuple states are well-known prior to * moving tuples around --- see comment known dead in repair_frag(), * as well as simplifications in tqual.c. So before we start we must * ensure that any asynchronously-committed transactions with changes * against this table have been flushed to disk. It's sufficient to do * this once after we've acquired AccessExclusiveLock. */ XLogAsyncCommitFlush(); but I bet lunch that that's not good enough. I still haven't reproduced it, but I'm thinking that the inexact bookkeeping that we created for clog page LSNs allows tuples to not get marked if the right sort of timing of concurrent transactions happens. Not sure about the best solution for this. Good hunch. I plugged this hole earlier, but on further inspection I can see the plug wasn't wide enough. XLogAsyncCommitFlush() is good enough, but HeapTupleSatisfiesVacuum() still allowed the inexact bookkeeping to sometimes skip hint bit setting, when executed with concurrent transactions touching other tables. ISTM that if we call HeapTupleSatisfiesVacuum() with an additional boolean parameter, force, we can tell VF to always set the hint bits in every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT. Patch enclosed, but a little crufty. Gotta run now, talk later. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com Index: src/backend/commands/vacuum.c === RCS file: /projects/cvsroot/pgsql/src/backend/commands/vacuum.c,v retrieving revision 1.354 diff -c -r1.354 vacuum.c *** src/backend/commands/vacuum.c 1 Aug 2007 22:45:08 - 1.354 --- src/backend/commands/vacuum.c 9 Aug 2007 07:24:41 - *** *** 1384,1390 tuple.t_len = ItemIdGetLength(itemid); ItemPointerSet((tuple.t_self), blkno, offnum); ! switch (HeapTupleSatisfiesVacuum(tuple.t_data, OldestXmin, buf)) { case HEAPTUPLE_DEAD: tupgone = true; /* we can delete the tuple */ --- 1384,1390 tuple.t_len = ItemIdGetLength(itemid); ItemPointerSet((tuple.t_self), blkno, offnum); ! switch (HeapTupleSatisfiesVacuumFull(tuple.t_data, OldestXmin, buf)) { case HEAPTUPLE_DEAD: tupgone = true; /* we can delete the tuple */ *** *** 1998,2004 break; } /* must check for DEAD or MOVED_IN tuple, too */ ! nextTstatus = HeapTupleSatisfiesVacuum(nextTdata, OldestXmin, nextBuf); if (nextTstatus == HEAPTUPLE_DEAD || --- 1998,2004 break; } /* must check for DEAD or MOVED_IN tuple, too */ ! nextTstatus = HeapTupleSatisfiesVacuumFull(nextTdata, OldestXmin, nextBuf); if (nextTstatus == HEAPTUPLE_DEAD || Index: src/backend/utils/time/tqual.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/time/tqual.c,v retrieving revision 1.103 diff -c -r1.103 tqual.c *** src/backend/utils/time/tqual.c 1 Aug 2007 22:45:09 - 1.103 --- src/backend/utils/time/tqual.c 9 Aug 2007 07:24:43 - *** *** 77,82 --- 77,85 TransactionId RecentGlobalXmin = InvalidTransactionId; /* local functions */ + static HTSV_Result HeapTupleSatisfiesVacuumInternal(HeapTupleHeader tuple, + TransactionId OldestXmin, Buffer buffer, bool locked); + static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot); *** *** 1045,1054 * deleted by XIDs = OldestXmin are deemed recently dead; they might * still be visible to some open transaction, so we can't remove them, * even if we see that the deleting transaction has committed. */ HTSV_Result ! HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin, ! Buffer buffer) { /* * Has inserting transaction committed? --- 1048,1060 * deleted by XIDs = OldestXmin are deemed recently dead; they might * still be visible to some open transaction, so we can't remove them, * even if we see that the deleting transaction has committed. + * + * If the heap we are checking is exclusively locked, we can skip
[HACKERS] Compilation of pg 7.4.17 fails on HP-UX
Hello, I have just tried to compile postgresql 7.4.17 on a HP-UX 11.11 box (PA-RISC) and the compliation fails in gist. make[4]: Entering directory `/home/adrian/postgresql-7.4.17/src/backend/access/gist' gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations -I../../../../src/include -D_XOPEN_SOURCE_EXTENDED -c -o gist.o gist.c /var/tmp//ccxYASP0.s: Assembler messages: /var/tmp//ccxYASP0.s:1528: Error: Field not properly aligned [8] (52). /var/tmp//ccxYASP0.s:1528: Error: Invalid operands /var/tmp//ccxYASP0.s:1531: Error: Field not properly aligned [8] (44). /var/tmp//ccxYASP0.s:1531: Error: Invalid operands /var/tmp//ccxYASP0.s:1534: Error: Field not properly aligned [8] (60). /var/tmp//ccxYASP0.s:1534: Error: Invalid operands make[4]: *** [gist.o] Error 1 make[4]: Leaving directory `/home/adrian/postgresql-7.4.17/src/backend/access/gist' make[3]: *** [gist-recursive] Error 2 make[3]: Leaving directory `/home/adrian/postgresql-7.4.17/src/backend/access' make[2]: *** [access-recursive] Error 2 make[2]: Leaving directory `/home/adrian/postgresql-7.4.17/src/backend' make[1]: *** [all] Error 2 make[1]: Leaving directory `/home/adrian/postgresql-7.4.17/src' make: *** [all] Error 2 This doesn't happen in HEAD. $ uname -a HP-UX hpdev B.11.11 U 9000/800 822196241 unlimited-user license $ gcc --version gcc (GCC) 3.3.3 Cheers, Adrian Maier ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] HOT patch, missing things
On 8/7/07, Heikki Linnakangas [EMAIL PROTECTED] wrote: There's three things clearly missing in the patch: 1. HOT updates on tables with expression or partial indexes. Hasn't been done yet because it should be pretty straightforward and we've had more important things to do. Though not critical, should be finished before release in my opinion. I started with this. ISTM to support partial indexes, we must check the old and new tuple against partiality match. HOT update is feasible iff - old and new tuples, both match the partiality condition OR - old and new tuples, both don't match the condition In either case, we either had an index entry which can serve for the new tuple OR we did not have an index entry for the old tuple, but neither the new tuple needs it. Of course, we still need to apply all other criteria to finally decide whether to do HOT or COLD update. For functional index, we should apply the function to the old and new tuple and compare the outcome. If the results are same, HOT update is feasible. We still need to think about the best way to do this without any modularity invasion and least possible overhead, but can anybody see any issue with the broader approach ? Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] default_text_search_config and expression indexes
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Thu, Aug 09, 2007 at 02:36:41AM -0400, Bruce Momjian wrote: Oleg Bartunov wrote: Yea, seems more work than is necessary. If we require the configuration to be always supplied, and document that mismatches are a problem, I think we are in good shape. We should agree that all you describe is only for DUMMY users. From authors point of view I dislike your approach to treat text searching as a very limited tool [...] [...] I am glad we are moving this interface discussion forward. It seems Heikki has similar concerns about the interface being error-prone. It would be nice to have a novice and advanced interface, but we would have to document both, and then that is going to be confusing for users. As I see it, specifying the configuration name in every function call is the novice interface, and avoids the most common errors. I can see defaulting the interface name as being an advanced interface, but I don't think it has enough of a feature to be worth documenting and implementing. If we figure out something better in 8.4 we can implement it, but at this point I can't think of any good solution to not specifying the configuration name every time. Maybe I'm missing something, but it seems to me that the configuration is more attached to a column/index thatn to the whole database. If there's a default in an expression, I'd rather expect this default to be drawn from the index involved than from a global value (like a functional index does now). Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFGuuAoBcgs9XrR2kYRAqiiAJsFL+Iu/b/xYaLza5ozmi839Qh5awCeOp+f SZHKDPUHZ3u99XzLBn2ZKjw= =twEt -END PGP SIGNATURE- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] comunication protocol
Merlin Moncure wrote: On 8/8/07, Omar Bettin [EMAIL PROTECTED] wrote: Hi, I have installed postgresql to my client as a server for a progam (libpq.dll/VCL based) that I wrote for them. Every is working fine, but I noted some general slowness, compared with an older database system (iAnywhere ADS) due (I think) to the text based communication protocol. you should maybe report a couple of specific things (explain analyze, etc) for analysis and make sure your expectations are reasonable. It is possible simple configuration issues or query changes might be the answer here, then again, maybe not. I think this is not necessary because I have used the same application with two different servers and you can see the speed differences just using it. I know there is the possibility to adopt a ssl compressed connection but I think a pure compressed connections could be better. I think you are looking in the wrong direction here. So, I have studied the postgresql sources and I have tried to implement some compression between the backend and the frontend, using pglz_compress/pglz_decompress on be-secure.c and fe-secure.c. At the moment is working good on a local configuration, got some problems on a remote connection due I think a different way to communicate. AFAIK, the fastest possible way to get data off the server, skipping all data and text processing is to write a SPI routine, and stream the data out locally to the server. I am doing exactly this in a particular problem that requires high performance and I can tell you that SPI is fast. http://developer.postgresql.org/pgdocs/postgres/spi-examples.html That way you will bypass the protocol completely. On my computer, I get roughly 300k records/sec raw read performance using libpq and about 1.3m records sec using a hacked SPI and streaming to disk. This may not be helpful for your application but if you are exploring ways to bypass protocol overhead this is where I would start. Interesting,do you thing that is possible to implement some send() in the interface? By the way, your problem might be the VCL driver you are using to access the database. The highest performance driver I have used (which wraps libpq) is the Zeos library which is very fast. I have tried ZeosLib and for me is unusable (too slow), I use a strong modified PostgresDAC. There are other ways (other than be-secure and fe-secure) with which the backend comunicate with the frontend? And, do you think this solution could speed up something? Once again, I would start by looking at your application and posting here to make sure you are looking at the right bottlenecks (you _suspect_ the protocol is the problem, but is it really?). this means: * explain analyze/queries (w/how fast you think it should be going) * relevant .conf settings * time measurements from the app merlin I just switch form Application1 (IAnywhere Ads) to Application2 (Postgresql) optimizing the VCL (strong modifications to PostgresDAC sources) and sow the results. My application needs to full open some tables and with this protocol is like to download a long text file. omar ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] comunication protocol
Gustavo Tonini wrote: Em Quarta 08 Agosto 2007 12:02, Omar Bettin escreveu: Hi, I have installed postgresql to my client as a server for a progam (libpq.dll/VCL based) that I wrote for them. Borland VCL? What component are you using? Gustavo. I use a strong modified PostgresDAC component. Omar ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] default_text_search_config and expression indexes
On Thu, 9 Aug 2007, [EMAIL PROTECTED] wrote: Maybe I'm missing something, but it seems to me that the configuration is more attached to a column/index thatn to the whole database. If there's a default in an expression, I'd rather expect this default to be drawn from the index involved than from a global value (like a functional index does now). I'm tired to repeat - index itself doesn't know about configuration ! It just index tsvector data type. tsvector in turn can be obtained using various ways: 1. manually 2. to_tsvector tsvector can be stored in a separate attribute or be fully virtual like in expressional index. Moreover, tsvector can be obtained using various configurations depending on your application. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] HOT patch, missing things
On 8/9/07, Simon Riggs [EMAIL PROTECTED] wrote: Whether I got the exact details of frugging depruning correct or not: if a tuple version is removed, then VACUUM doesn't need to remove it later, so any non-VACUUM removal of rows must defer a VACUUM. ISTM that you are worried about the cases where a tuple is HOT updated and hence can be pruned/defragged, but only if we revisit the page at a later time. What if we just track the amount of potentially dead space in the relation (somebody had suggested that earlier in the thread) ? Every committed UPDATE/DELETE and aborted UPDATE/INSERT would increment the dead space. Whenever page fragmentation is repaired, either during normal operation or during vacuum, the dead space is reduced by the amount of reclaimed space. Autovacuum triggers whenever the percentage of dead space increases beyond a threshold. We can some fine tuning to track the space consumed by redirect-dead line pointers. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] HOT patch, missing things
On Thu, 2007-08-09 at 15:46 +0530, Pavan Deolasee wrote: On 8/9/07, Simon Riggs [EMAIL PROTECTED] wrote: Whether I got the exact details of frugging depruning correct or not: if a tuple version is removed, then VACUUM doesn't need to remove it later, so any non-VACUUM removal of rows must defer a VACUUM. ISTM that you are worried about the cases where a tuple is HOT updated and hence can be pruned/defragged, but only if we revisit the page at a later time. What if we just track the amount of potentially dead space in the relation (somebody had suggested that earlier in the thread) ? Every committed UPDATE/DELETE and aborted UPDATE/INSERT would increment the dead space. Whenever page fragmentation is repaired, either during normal operation or during vacuum, the dead space is reduced by the amount of reclaimed space. Autovacuum triggers whenever the percentage of dead space increases beyond a threshold. We can some fine tuning to track the space consumed by redirect-dead line pointers. Sounds great. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Function structure in formatting.c
On 8/9/07, Jaime Casanova [EMAIL PROTECTED] wrote: take your time, this seems like it will be for 8.4 anyway I hear you, unfortunately taking my time usually means I forget about it for eight months and by the time I come back to it I've forgotten what I was doing =) I wasn't really expecting this to make it into 8.3. I just need to get it done so I can free up the headspace for other projects. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] comunication protocol
On 8/9/07, Omar Bettin [EMAIL PROTECTED] wrote: Merlin Moncure wrote: AFAIK, the fastest possible way to get data off the server, skipping all data and text processing is to write a SPI routine, and stream the data out locally to the server. I am doing exactly this in a particular problem that requires high performance and I can tell you that SPI is fast. Interesting,do you thing that is possible to implement some send() in the interface? SPI is an interface which allows you to make sql calls from C code (PostgreSQL allows you to link C code compiled as a .so to the server and call -- see numerous examples in contrib). The routine you need to exploit is SPI_getbinval which gives you Datum (essentially a variant) pointing to the internal binary representation of your field. In theory you could collect the data into a buffer and send() it off although thats a lot of work IMO. Also, I would only advise this for fast dumps from a single table (no joins, etc). I have tried ZeosLib and for me is unusable (too slow), I use a strong modified PostgresDAC. I'm suprised -- I know the ZeosLib internals and it's a very thin layer over libpq. Here is what I suggest: * turn on statement logging in the server (set log_min_duration_statement) * make some operations in the app which you suggest are slow -- they will show up in the log * 'explain analyze' the query from the psql console make note of the times and post back (maybe move this thread to the -performance list) merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HOT patch, missing things
Pavan Deolasee [EMAIL PROTECTED] writes: HOT update is feasible iff - old and new tuples, both match the partiality condition OR - old and new tuples, both don't match the condition ... For functional index, we should apply the function to the old and new tuple and compare the outcome. If the results are same, HOT update is feasible. This is debatable. We could compare the columns used in the partial condition expression or function expression directly. If they're the same then the expression or function must return the same value. If the function is quite expensive then that might be cheaper. On the other hand if it's not expensive and the columns change frequently but the results don't then we might be doing a lot of work for nothing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] crypting prosrc in pg_proc
hello everybody, one of our customers wants to store the code of interpreted procedures (PL/pgSQL, PL/Perl) and so in an encrypted way. so the idea we had to add one more column to pg_proc telling us whether prosrc is encrypted or not. people could chose then whether to crypt codes there or not (speed of decryption can be an issue). should not be hard to implement ... what do people think about this feature? many thanks, hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 03:39:06PM +0200, Hans-Juergen Schoenig wrote: so the idea we had to add one more column to pg_proc telling us whether prosrc is encrypted or not. people could chose then whether to crypt codes there or not (speed of decryption can be an issue). should not be hard to implement ... what do people think about this feature? how would you decrypt the source before using the function? depesz -- quicksil1er: postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV! :) http://www.depesz.com/ - blog dla ciebie (i moje CV) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] crypting prosrc in pg_proc
Hans-Juergen Schoenig wrote: hello everybody, one of our customers wants to store the code of interpreted procedures (PL/pgSQL, PL/Perl) and so in an encrypted way. so the idea we had to add one more column to pg_proc telling us whether prosrc is encrypted or not. people could chose then whether to crypt codes there or not (speed of decryption can be an issue). should not be hard to implement ... what do people think about this feature? Perhaps you could give us a justification for it. Are you intending to have stored procs contain security sensitive information? Or is this an attempt to hide closed source code from prying eyes? Where would the encryption keys be stored? And how would it work with pg_dump? This doesn't sound very well thought out, frankly. cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] crypting prosrc in pg_proc
the idea is basically to hide codes - many companies want that and ask for it again and again. i would suggest keys to reside in $PGDATA. we do this for SSL and so already. initdb could create such keys so that they are unique to every database instance. decrypting could be avoided as much as possible basically we should just decrypt on first all and when it changes. for pg_dump i would suggest two options: a.) pass the keys to dump in a decrypted way b.) dump in encrypted way. i would think that this is a quite valuable features. would be nice to have it. maybe we can agree on a nice mechanism here which will be implemented then. hans On Aug 9, 2007, at 3:57 PM, Andrew Dunstan wrote: Hans-Juergen Schoenig wrote: hello everybody, one of our customers wants to store the code of interpreted procedures (PL/pgSQL, PL/Perl) and so in an encrypted way. so the idea we had to add one more column to pg_proc telling us whether prosrc is encrypted or not. people could chose then whether to crypt codes there or not (speed of decryption can be an issue). should not be hard to implement ... what do people think about this feature? Perhaps you could give us a justification for it. Are you intending to have stored procs contain security sensitive information? Or is this an attempt to hide closed source code from prying eyes? Where would the encryption keys be stored? And how would it work with pg_dump? This doesn't sound very well thought out, frankly. cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] crypting prosrc in pg_proc
the idea is basically to hide codes - many companies want that and ask for it again and again. Hide code from who (or is that whom?)? The PL compiler(s) will need to decrypt the code. If a compiler can decrypt it, then anyone can decrypt it (because the compilers are open-source). And the problem is that any user that can run a function must be able to compile that function, and therefore, any user that can run a function must have the decryption key for that function. So, I'm not sure you've secured the source code from any user that can run the function. Of course, if your goal is to hide the code from someone snooping through the pg_proc relation (on disk), then encryption will certainly help (provided the key is properly protected). Note: I think source-code encryption is a useful idea, it just seems tricky to implement in an open environment. -- Korry ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] HOT patch, missing things
On Thu, Aug 09, 2007 at 01:25:14PM +0100, Gregory Stark wrote: Pavan Deolasee [EMAIL PROTECTED] writes: HOT update is feasible iff - old and new tuples, both match the partiality condition OR - old and new tuples, both don't match the condition ... For functional index, we should apply the function to the old and new tuple and compare the outcome. If the results are same, HOT update is feasible. This is debatable. We could compare the columns used in the partial condition expression or function expression directly. If they're the same then the expression or function must return the same value. If the function is quite expensive then that might be cheaper. On the other hand if it's not expensive and the columns change frequently but the results don't then we might be doing a lot of work for nothing. If we're going to get this into 8.3 I think we should be leaning towards whatever is the simplest way to do it... -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpPF8E4Vwj64.pgp Description: PGP signature
Re: [HACKERS] GUC for default heap fillfactor
On Thu, Aug 09, 2007 at 09:57:48AM +0900, ITAGAKI Takahiro wrote: If HOT gets into 8.3, we might need a GUC to set database wide heap fillfactor to an appropriate value. I have no objection to do that, but we will need other default values soon, something like default_[heap|btree|hash|gist|gin]_fillfactor. Some of us might feel it is mess to add random guc variables. I think we'd want to handle indexes with a different mechanism, probably one that makes changes to pg_am. In any case, how important is it to do this before 8.3? We were supposed to release this month, after all. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpkDxwB7dNwA.pgp Description: PGP signature
Re: [HACKERS] crypting prosrc in pg_proc
On Aug 9, 2007, at 4:47 PM, korry.douglas wrote: the idea is basically to hide codes - many companies want that and ask for it again and again. Hide code from who (or is that whom?)? the code should be hidden from the guy who is actually executing the function. so: some user is doing: select func(); the backend loads the keys from PGDATA, decrypts the codes executes them. as a normal user cannot look into the backend the code is safe. the keys are only visible to the sysadmis but not at SQL level. The PL compiler(s) will need to decrypt the code. no, the backend will pass the decrypted codes to the call handler. there is no need for Perl, Python or so to be aware of this issue. If a compiler can decrypt it, then anyone can decrypt it (because the compilers are open-source). And the problem is that any user that can run a function must be able to compile that function, and therefore, any user that can run a function must have the decryption key for that function. So, I'm not sure you've secured the source code from any user that can run the function. why that? the backend is doing the job. the user does not pass the keys. it is a database internal thing. the only idea is to make sure that pg_proc does not contain user readable code. Of course, if your goal is to hide the code from someone snooping through the pg_proc relation (on disk), then encryption will certainly help (provided the key is properly protected). no, somebody who has access to the filesystem is not relevant. just think of shipping some database inside some router or inside some other stuff. the vendor just wants to make sure that other people don't fully understand the magic going on. hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] crypting prosrc in pg_proc
On Aug 9, 2007, at 4:34 PM, Peter Eisentraut wrote: Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig: the idea is basically to hide codes - many companies want that and ask for it again and again. If you want to design a security feature, you need to offer a threat and risk analysis, not just the whining of customers. -- Peter Eisentraut http://developer.postgresql.org/~petere/ well, the complete analysis is easy - the solution is not. currently we have basically no option to reduce access to the system tables. this would be hard anyway as we need those tables for basically all kinds of operations. the problem here is that vendors of appliances don't want people to spider their codes. this is a fact - it is not the idea of open source to do so but bloody reality. in addition to that people are not willing to code everything in C just to hide. so, there has to be a concept to achieve this for stored procedures somehow. i am afraid the source level encryption is the easiest thing and most understandable thing to do. so, better ideas are welcome. hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] crypting prosrc in pg_proc
Wouldn't it be better just to hide prosrc from the users you want to hide it from? (Note: my SOC student is working on column level privs, although unfortunately his work doesn't currently work with SELECT privs). In fact, in many cases, depending on the app, you can simply remove SELECT priv from public on pg_proc. There are also some fairly impressive code obfuscators about, that your clients might find useful. Of course, debugging the obfuscated code is a nightmare, but that's their affair if they want to use it ;-) cheers andrew Hans-Juergen Schoenig wrote: the idea is basically to hide codes - many companies want that and ask for it again and again. i would suggest keys to reside in $PGDATA. we do this for SSL and so already. initdb could create such keys so that they are unique to every database instance. decrypting could be avoided as much as possible basically we should just decrypt on first all and when it changes. for pg_dump i would suggest two options: a.) pass the keys to dump in a decrypted way b.) dump in encrypted way. i would think that this is a quite valuable features. would be nice to have it. maybe we can agree on a nice mechanism here which will be implemented then. hans On Aug 9, 2007, at 3:57 PM, Andrew Dunstan wrote: Hans-Juergen Schoenig wrote: hello everybody, one of our customers wants to store the code of interpreted procedures (PL/pgSQL, PL/Perl) and so in an encrypted way. so the idea we had to add one more column to pg_proc telling us whether prosrc is encrypted or not. people could chose then whether to crypt codes there or not (speed of decryption can be an issue). should not be hard to implement ... what do people think about this feature? Perhaps you could give us a justification for it. Are you intending to have stored procs contain security sensitive information? Or is this an attempt to hide closed source code from prying eyes? Where would the encryption keys be stored? And how would it work with pg_dump? This doesn't sound very well thought out, frankly. cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at http://www.cybertec.at ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] crypting prosrc in pg_proc
Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig: the idea is basically to hide codes - many companies want that and ask for it again and again. If you want to design a security feature, you need to offer a threat and risk analysis, not just the whining of customers. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] crypting prosrc in pg_proc
Hans-Juergen Schoenig napsal(a): the idea is basically to hide codes - many companies want that and ask for it again and again. i would suggest keys to reside in $PGDATA. we do this for SSL and so already. initdb could create such keys so that they are unique to every database instance. decrypting could be avoided as much as possible basically we should just decrypt on first all and when it changes. But, Companies want to hide code also because they distribute their software. If you store key somewhere on server, user will be able to decrypt the original code. If I remember correctly Oracle wrap generates something like bytecode and each Oracle installation is able to understand them. But It is not possible decode it back to original form. My suggestion is to extend PL API and each PL language should offer wrap or encrypt function which generate encrypted code and this code will be store in the pg_proc. PL language will be responsible to detect if it raw or crypted code. PG_Dump will dump crypted procedure and author is responsible keep his uncrypted version in source repository. Zdenek ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, 9 Aug 2007, Andrew Dunstan wrote: There are also some fairly impressive code obfuscators about, that your clients might find useful. All they really need is to find a sufficiently clever PL/Perl programmer. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 04:34:48PM +0200, Peter Eisentraut wrote: Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig: the idea is basically to hide codes - many companies want that and ? ask for it again and again. If you want to design a security feature, you need to offer a threat and risk analysis, not just the whining of customers. This isn't as much a security feature as a means for people to keep their code private. To some degree you could just get away with storing nothing but a parsed representation of the code, though of course someone could always decompile that. There's a non-trivial amount of work involved in handling key management, etc, so I think we don't want to try and build that in. What would be interesting is providing hooks for en/decrypting function code and having the backend call those hooks as appropriate. That should allow someone to develop the ability to encrypt the code in the database outside of the backend. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpZN0BTyzThA.pgp Description: PGP signature
Re: [HACKERS] crypting prosrc in pg_proc
On Aug 9, 2007, at 5:03 PM, Greg Smith wrote: On Thu, 9 Aug 2007, Andrew Dunstan wrote: There are also some fairly impressive code obfuscators about, that your clients might find useful. All they really need is to find a sufficiently clever PL/Perl programmer. we should make this a PL/Brainfuck implementation then ;) hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] comunication protocol
Merlin Moncure wrote: On 8/9/07, Omar Bettin [EMAIL PROTECTED] wrote: Merlin Moncure wrote: AFAIK, the fastest possible way to get data off the server, skipping all data and text processing is to write a SPI routine, and stream the data out locally to the server. I am doing exactly this in a particular problem that requires high performance and I can tell you that SPI is fast. Interesting,do you thing that is possible to implement some send() in the interface? SPI is an interface which allows you to make sql calls from C code (PostgreSQL allows you to link C code compiled as a .so to the server and call -- see numerous examples in contrib). The routine you need to exploit is SPI_getbinval which gives you Datum (essentially a variant) pointing to the internal binary representation of your field. In theory you could collect the data into a buffer and send() it off although thats a lot of work IMO. Also, I would only advise this for fast dumps from a single table (no joins, etc). why not joins? I have tried ZeosLib and for me is unusable (too slow), I use a strong modified PostgresDAC. I'm suprised -- I know the ZeosLib internals and it's a very thin layer over libpq. Here is what I suggest: * turn on statement logging in the server (set log_min_duration_statement) * make some operations in the app which you suggest are slow -- they will show up in the log * 'explain analyze' the query from the psql console make note of the times and post back (maybe move this thread to the -performance list) merlin So, you aren't agree with the compression... I have sow the network statistics and in some cases, the network traffic is very big. Probably a better setup could increase the performance by a few points percent but I think a compressed protocol could increase the communication by 2/3 times. I think the biggest bottleneck in the whole system is just that. :..try to get a query from a remote server with a 56k modem! :) Omar ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] crypting prosrc in pg_proc
My suggestion is to extend PL API and each PL language should offer wrap or encrypt function which generate encrypted code and this code will be store in the pg_proc. PL language will be responsible to detect if it raw or crypted code. PG_Dump will dump crypted procedure and author is responsible keep his uncrypted version in source repository. That strategy assumes that there is no way to reproduce the source form from the encrypted form. That means that the PL compiler must be able to recognize two different languages (the unencrypted form and the encrypted form). That may work for PL/pgSQL but it won't work for any language where a third-party compiler is involved (unless the decryption function produces an obfuscated source form that represents legitimate code for the compiler in question). The basic problem is this: if you have to decrypt the code in order to give it to a compiler (PL/pgSQL, Java, Perl, ...) then there is a point in time where the source code is in plaintext form - it would be trivial to add an fprintf( stderr, %s, plainTextForm ) to the PL handler to steal the code. -- Korry ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] change name of redirect_stderr?
Before I wrap up the CSVlog stuff, we need to decide whether or not to change the name of the redirect_stderr setting, and if so to what. The reason is that with CSVlogs it will no longer apply just to stderr (we will require it to be on for CSVlogs, in fact). I suggest redirect_logs, although it's arguably too general as it doesn't apply to syslog/eventlog. But maybe that doesn't matter, as we can note it in the docs and the sample conf file. thoughts? cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] crypting prosrc in pg_proc
The basic problem is this: if you have to decrypt the code in order to give it to a compiler (PL/pgSQL, Java, Perl, ...) then there is a point in time where the source code is in plaintext form - it would be trivial to add an fprintf( stderr, %s, plainTextForm ) to the PL handler to steal the code. -- Korry if somebody is allowed to recompile on the box GRANT and REVOKE are all useful. you did not get the point, i think. we are shipping appliances - users can only use SQL; no compilers and no SSH logins allowed ... hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 04:50:53PM +0200, Zdenek Kotala wrote: Hans-Juergen Schoenig napsal(a): the idea is basically to hide codes - many companies want that and ask for it again and again. i would suggest keys to reside in $PGDATA. we do this for SSL and so already. initdb could create such keys so that they are unique to every database instance. decrypting could be avoided as much as possible basically we should just decrypt on first all and when it changes. But, Companies want to hide code also because they distribute their software. If you store key somewhere on server, user will be able to decrypt the original code. If folks keep that argument up, we'll never, ever get this useful feature. I don't care *what* security you implement (maybe short of quantum physics), given enough time and motivation, someone can break it. That's not the point here... the point is just to make it more difficult for someone to obtain the source code for procedures. Right now, anyone who can connect to the database can see *everything*. This is also related to the desire to be able to restrict access to the catalog tables. Doing so could potentially solve this problem; it also solves other issues (such as being able to see all the databases that exist on a server, something that hosting environments care about). -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpY53aG0Hgml.pgp Description: PGP signature
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote: the problem here is that vendors of appliances don't want people to spider their codes. this is a fact - it is not the idea of open source to do so but bloody reality. in addition to that people are not willing to code everything in C just to hide. Well, then, they're out of luck. so, there has to be a concept to achieve this for stored procedures somehow. Just because you want something does not mean that there must be a way to achieve it. i am afraid the source level encryption is the easiest thing and most understandable thing to do. It's easy and understandable, but it won't do anything for you, because the code has to be decrypted automatically in order to run. If it can be decrypted by the system, then it can obviously be decrypted by someone who has superuser access to the system too. I have seen systems that attempt this sort of protection anyway. (Indeed, one might argue that all of the hopeless and failed copy protection systems CDs and DVDs have come with are exactly this sort of trick.) The fundamental fact is that, if you want users to be able to use code that is interpreted, then you're never going to be able to guarantee that the users can't also see that code. Other equally useless, in my experience, techniques include code obfuscation. You could probably bind things up in enough levels of nested calls to various functions that mapping it out would be tricky and tedious. Of course, this also means that maintaining the code is a fairly significant pain. I'll also note that I am not actually convinced these contortions do anything to protect the system from people who want to figure out how it works. None of that, of course, does anything to relieve the pressure on you to deliver the requested feature. But perhaps you can reason with them. And they can always write it in C and deliver only compiled object code; although whether anyone would be foolish enough to run such object code without inspecting its safety is an excercise in speculation I don't care to pursue. A -- Andrew Sullivan | [EMAIL PROTECTED] The plural of anecdote is not data. --Roger Brinner ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 05:16:43PM +0200, Hans-Juergen Schoenig wrote: we are shipping appliances - users can only use SQL; no compilers and no SSH logins allowed ... If you have shipped a physical thing to someone else, then they can do what they want with it irrespective of what rules you think you have imposed. If you believe otherwise, then you are deluding yourself about the security of your approach. It was this sort of magical thinking that caused SIM cards, WEP keys, DVD copy protection, and other such guaranteed safe technologies to be revealed as encumbered with needless weight that may foil casual users, but that are trivially broken by anyone actually interested in doing the breaking. A -- Andrew Sullivan | [EMAIL PROTECTED] In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] crypting prosrc in pg_proc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Andrew Sullivan wrote: On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote: the problem here is that vendors of appliances don't want people to spider their codes. this is a fact - it is not the idea of open source to do so but bloody reality. in addition to that people are not willing to code everything in C just to hide. Well, then, they're out of luck. This whole thread can be summed up as: plpgsql is an interpretive language, not a compiled on. You can and do see the source, just like perl, ruby, python, bash etc... If you don't like it, don't use it and use C instead. :) Keep in mind that the obfuscation techniques that can be used by python and perl are all reversible. Sincerely, Joshua D. Drake - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGuzMUATb/zqfZUUQRAn0uAJ9evWtJkzS5hMiJDNSi+1jxMobdoACghm3D r7Tmmz801bKntXAipJBrvy4= =yRz/ -END PGP SIGNATURE- ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] crypting prosrc in pg_proc
Andrew Sullivan wrote: And they can always write it in C and deliver only compiled object code; although whether anyone would be foolish enough to run such object code without inspecting its safety is an excercise in speculation I don't care to pursue. The intended use for this is an appliance, not a general purpose server. They wouldn't have any option :-) cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] crypting prosrc in pg_proc
Decibel! wrote: This is also related to the desire to be able to restrict access to the catalog tables. Doing so could potentially solve this problem; it solves other issues (such as being able to see all the databases that exist on a server, something that hosting environments care about). You can hide the catalogs, albeit at the cost of some functionality. I did some experimentation a couple of years back with removing public access from the catalogs, removing information_schema and the public schema, etc, and it worked quite well. I set up a user who had access to a single schema, which only contained functions, and the user wasn't able (so far as I could determine) to see anything other than those functions - no tables, no catalogs, no databases, no users. The user was still able to function exactly as intended. The intended scenario was for a web app user, where the web server was subverted, the aim being to restrict the amount of information the intruder could steal. That doesn't help with information leaking in shared hosting setups, I agree. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX
Adrian Maier [EMAIL PROTECTED] writes: I have just tried to compile postgresql 7.4.17 on a HP-UX 11.11 box (PA-RISC) and the compliation fails in gist. I do not think anyone cares about making 7.4.x run on platforms it did not support before. Use a newer PG release. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT patch, missing things
Pavan Deolasee [EMAIL PROTECTED] writes: I started with this. ISTM to support partial indexes, we must check the old and new tuple against partiality match. ... For functional index, we should apply the function to the old and new tuple and compare the outcome. If the results are same, HOT update is feasible. I don't like either of these. They are going to be extremely expensive if the function or predicate is expensive (because you're going to be doing two evaluations that you might get no benefit from). Also, if the function is not as immutable as it's supposed to be, you will soon have an utterly corrupt index, with entries pointing at rows they in fact don't match and never did. We have so far managed to avoid any really strong dependencies on the requirement of index-function immutability --- your queries may not work very well if the function isn't immutable, but you are not at risk of system-level data corruption. With this, you will be. Since we are entirely dependent on users to mark immutable functions correctly (and have not always gotten it right ourselves :-(), I don't think this is an acceptable risk. If we can't do better than that (and offhand I don't see how to), then I agree with the current approach of disabling HOT when functional or partial indexes are present. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] crypting prosrc in pg_proc
On 8/9/07, Andrew Dunstan [EMAIL PROTECTED] wrote: Decibel! wrote: This is also related to the desire to be able to restrict access to the catalog tables. Doing so could potentially solve this problem; it solves other issues (such as being able to see all the databases that exist on a server, something that hosting environments care about). You can hide the catalogs, albeit at the cost of some functionality. I did some experimentation a couple of years back with removing public access from the catalogs, removing information_schema and the public schema, etc, and it worked quite well. I set up a user who had access to a single schema, which only contained functions, and the user wasn't able (so far as I could determine) to see anything other than those functions - no tables, no catalogs, no databases, no users. The user was still able to function exactly as intended. The intended scenario was for a web app user, where the web server was subverted, the aim being to restrict the amount of information the intruder could steal. This works very well to stop casual browsing of functions from psql, etc. That said, I am in the camp that securing system catalogs (including pg_proc) is a good and necessary feature. This debate came up a while back with all the usual arguments pro- and con-. IIRC the general conclusion was that if you want to truly encrypt the sources for your functions, the basic idea is to create a new stored procedure language that wraps pl/pgsql and handles encryption there. This would be relatively easy to support as an external module, I think. merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Problem with locks
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: We're seeing a problem where occasionally a process appears to be granted a lock but miss its semaphore signal. Kernel bug maybe? What's the platform? It does sound like it given the way my description went. I was worried it may be some code path not setting waitStatus properly or the compiler caching it incorrectly somehow. But now that I check I see it's a pretty old kernel version (Linux 2.6.5) For what it's worth we've reproduced the problem with 2.6.16.21 which is only about a year old. I want to rerun this with a shiny new 2.6.22 kernel but really 2.6.16 is recent enough that I don't know of any major bugs fixed in IPC handling since then (with the exception of hugetlb interaction which we're not using on this machine) . So now this is probably either an ongoing kernel bug affecting Postgres or it's elsewhere -- either in Postgres or GCC. I'm really concerned about this because while the behaviour with deadlock_timeout set quite high (we have it set to 60s on this machine) is bad enough -- the behaviour with it set to the default 1s is far more scary. On the default 1s timeout on a machine undergoing lock waits which are mostly under 1s you will probably never notice anything recognizably similar to this. You'll occasionally have some lock waits which last a second for no good reason but you'll never notice that. *But* if you should have a lock wait which lasts more than 1s before it's granted, then when it's granted the semaphore gets lost you're in serious doo doo. The deadlock timeout only fires once and then nothing's going to wake up that process ever again. IIRC we've actually gotten a couple reports of people claiming they've got a deadlock when there was no evidence of a deadlock in pg_locks. We always chalked it down to a single long-lived process holding the lock and blocking, but never did much analysis on those reports to see if that was really the case. It's quite possible we had users already observing this problem. If it's a real problem then we're in a bit of a bind. Even if we find and fix a Linux kernel problem we'll still have users on versions of the kernel prior to 2.6.23 or whatever has the bug fixed. We may be best off including an option to have the deadlock timer refire every deadlock_timeout interval instead of just firing once. Then we could print a message any time it occurs and include a HINT about upgrading to a kernel with the bug fixed. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] createlang/droplang -l outputs
Hi, The header in the list of already installed languages shown by createlang/droplang with the -l option is not printed from 8.2. PostgreSQL 8.2: $ createlang -l plpgsql | yes PostgreSQL 8.1: $ createlang -l Procedural Languages Name | Trusted? -+-- plpgsql | yes The structure member 'start_table' is added to the 'printTableOpt' in the 'src/bin/psql/print.h' from 8.2. But the value of the 'start_table' is not set to true. Regards, Index: src/bin/scripts/createlang.c === RCS file: /projects/cvsroot/pgsql/src/bin/scripts/createlang.c,v retrieving revision 1.24 diff -u -r1.24 createlang.c --- src/bin/scripts/createlang.c14 Jul 2006 14:52:27 - 1.24 +++ src/bin/scripts/createlang.c9 Aug 2007 14:46:31 - @@ -138,6 +138,7 @@ memset(popt, 0, sizeof(popt)); popt.topt.format = PRINT_ALIGNED; popt.topt.border = 1; + popt.topt.start_table = true; popt.topt.encoding = PQclientEncoding(conn); popt.title = _(Procedural Languages); printQuery(result, popt, stdout, NULL); Index: src/bin/scripts/droplang.c === RCS file: /projects/cvsroot/pgsql/src/bin/scripts/droplang.c,v retrieving revision 1.21 diff -u -r1.21 droplang.c --- src/bin/scripts/droplang.c 31 May 2006 11:02:42 - 1.21 +++ src/bin/scripts/droplang.c 9 Aug 2007 14:46:31 - @@ -149,6 +149,7 @@ memset(popt, 0, sizeof(popt)); popt.topt.format = PRINT_ALIGNED; popt.topt.border = 1; + popt.topt.start_table = true; popt.topt.encoding = PQclientEncoding(conn); popt.title = _(Procedural Languages); printQuery(result, popt, stdout, NULL); Tomoaki Sato [EMAIL PROTECTED] SRA OSS, Inc. Japan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] HOT patch, missing things
Tom Lane [EMAIL PROTECTED] writes: We have so far managed to avoid any really strong dependencies on the requirement of index-function immutability --- your queries may not work very well if the function isn't immutable, but you are not at risk of system-level data corruption. With this, you will be. Wait, how would this be any more vulnerable to system-level data corruption than a normal update? The worst case I can see is that you have a properly updated tuple but the new tuple version is indexed incorrectly just as would be the case if you have a functional index or expression index which had changed value since the update was performed. I agree about the costs for evaluating the expressions. But a COLD update is certainly going to have to evaluate both expressions once. The only additional cost here is that HOT is going to have to evaluate the *old* expression as well. So it's at worst twice as expensive as a normal COLD update. I think I'm leaning towards doing a binary comparison of the parameters to the expressions. That won't catch as many cases as comparing the results of the expressions -- and I can think of cases where that would be disappointing -- but it's in keeping with how it determines whether a tuple is eligible for a HOT update in the first place. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] HOT patch, missing things
Gregory Stark [EMAIL PROTECTED] writes: I agree about the costs for evaluating the expressions. But a COLD update is certainly going to have to evaluate both expressions once. The only additional cost here is that HOT is going to have to evaluate the *old* expression as well. So it's at worst twice as expensive as a normal COLD update. What's bothering me is the case where we evaluate the expression twice, find it doesn't match, and fall through to the COLD update logic which will do it a third time. I think I'm leaning towards doing a binary comparison of the parameters to the expressions. Yeah, we could simply insist on no change to any column that's used by any of the expressions. That would be cheap to test. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] change name of redirect_stderr?
Andrew Dunstan [EMAIL PROTECTED] writes: Before I wrap up the CSVlog stuff, we need to decide whether or not to change the name of the redirect_stderr setting, and if so to what. The reason is that with CSVlogs it will no longer apply just to stderr (we will require it to be on for CSVlogs, in fact). I suggest redirect_logs, although it's arguably too general as it doesn't apply to syslog/eventlog. Perhaps it should be named analogously to stats_start_collector, ie think of the syslogger process as a log collector. I don't much like log_start_collector though --- start_log_collector seems far less confusing as to where the verb is. No strong opinion here, just tossing out some ideas. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] crypting prosrc in pg_proc
Merlin Moncure [EMAIL PROTECTED] writes: ... IIRC the general conclusion was that if you want to truly encrypt the sources for your functions, the basic idea is to create a new stored procedure language that wraps pl/pgsql and handles encryption there. This would be relatively easy to support as an external module, I think. Yeah, I was about to make that same suggestion: it's not clear to me that you need any support whatsoever from the core project for this. Make a loadable PL plpgsqlsec or something like that that's just a thin wrapper around the plpgsql call handler, and all it does is decrypt the source text. It's not clear exactly where the *en*cryption should happen, but one possibility is to have the new PL have a validator function that does an immediate UPDATE on the pg_proc row if the text isn't already encrypted. The encrypted representation would need to work like MD5 passwords: it's ASCII so that dumps will work, and it's possible to tell whether a given string is encrypted or not. Given the very limited use-case for what's being suggested here, I don't feel a need to put it in core. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] default_text_search_config and expression indexes
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Thu, Aug 09, 2007 at 02:03:13PM +0400, Oleg Bartunov wrote: On Thu, 9 Aug 2007, [EMAIL PROTECTED] wrote: Maybe I'm missing something [...] I'm tired to repeat - index itself doesn't know about configuration ! It just index tsvector data type. tsvector in turn can be obtained using various ways: 1. manually 2. to_tsvector tsvector can be stored in a separate attribute or be fully virtual like in expressional index. Moreover, tsvector can be obtained using various configurations depending on your application. Yep. I think I got that. Now what will be the normal case for a new user? * build an index with a virtual tsvector (using a configuration). Then I would expect the index to know the whole function to calculate its entries -- that would include the config used. This would be more compatible with the approach stated elsewhere to always mention explicitly the config. * manually. Would a novice do that? Or is that advanced stuff? Regards -- and sorry for my stupid questions :) - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFGu1ciBcgs9XrR2kYRAnVqAJ9mSc463I44JxLLDPpUZ/JirUVH5ACeOhUr 2/4aXs0ukMnvP8YCq8pamwQ= =IgfC -END PGP SIGNATURE- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] crypting prosrc in pg_proc
Make a loadable PL plpgsqlsec or something like that that's just a thin wrapper around the plpgsql call handler, and all it does is decrypt the source text. perfect idea, simple perfect. i did not consider that yet. i was hoping for some enlightenment like that. would be a nice module for contrib or pgfoundry ... many thanks, hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] RIP: Buildfarm member Baiji ??
The extinction of a dolphin ... sign of things to come? I've got a dolphin too (somebody's doing this on purpose ;): http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=tucuxibr=HEAD A few hours ago I realized the vmware instance it's running in had been suspended in a snapshot for the last few days... 1 extinct, 1 in coma - not going too well for dolphins these days ;) Bye, Chris. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] 2D partitioning of VLDB - sane or not?
I am building up a schema for storing a bunch of data about proteins, which on a certain level can be modelled with quite simple tables. The problem is that the database I am building needs to house lots of it 10TB and growing, with one table in particular threatening to top 1TB. In the case of the table and in the case of the overall database, the size can be expected to grow quickly (and most of it can never be deleted). In the past, with smaller tables, I have had success partitioning on a 64-bit crc hash that takes a more or less uniform distribution of input data and pumps out a more-or-less uniform distribution of partitioned data with a very small probability of collision. The hash itself is implemented as a c add-on library, returns a BIGINT and serves as a candidate key for what for our purposes we can call a protein record. Now back to the big table, which relates two of these records (in a theoretically symmetric way). Assuming I set the the table up as something like: CREATE TABLE big_protein_relation_partition_dimA_dimB{ protein_id_a BIGINTEGER NOT NULL CHECK( bin_num(protein_id_a) = dimA ), --- key (hash) from some table protein_id_a BIGINTEGER NOT NULL CHECK( bin_num(protein_id_b) = dimB ), --- key (hash) from some table ... } and do a little c bit-twiddling and define some binning mechanism on the BIGINTEGERs. As near I can tell, binning out along the A and B dimensions into 256 bins, I shouldn't be in any danger of running out of OIDs or anything like that (despite having to deal with 2^16 tables). Theoretically, at least, I should be able to do UNIONS along each axis (to avoid causing the analyzer too much overhead) and use range exclusion to make my queries zip along with proper indexing. Aside from running into a known bug with too many triggers when creating gratuitous indices on these tables, I feel as it may be possible to do what I want without breaking everything. But then again, am I taking too many liberties with technology that maybe didn't have use cases like this one in mind? Jason -- Jason Nerothin Programmer/Analyst IV - Database Administration UCLA-DOE Institute for Genomics Proteomics Howard Hughes Medical Institute 611 C.E. Young Drive East | Tel: (310) 206-3907 105 Boyer Hall, Box 951570 | Fax: (310) 206-3914 Los Angeles, CA 90095. USA | Mail: [EMAIL PROTECTED] http://www.mbi.ucla.edu/~jason
Re: [HACKERS] crypting prosrc in pg_proc
On Thursday 09 August 2007 11:30, Joshua D. Drake wrote: Andrew Sullivan wrote: On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote: the problem here is that vendors of appliances don't want people to spider their codes. this is a fact - it is not the idea of open source to do so but bloody reality. in addition to that people are not willing to code everything in C just to hide. Well, then, they're out of luck. This whole thread can be summed up as: plpgsql is an interpretive language, not a compiled on. You can and do see the source, just like perl, ruby, python, bash etc... If you don't like it, don't use it and use C instead. :) Keep in mind that the obfuscation techniques that can be used by python and perl are all reversible. Hmm I wonder if you could wire plphp through something like Zend Gaurd? -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX
On Thu, Aug 09, 2007 at 11:46:47AM -0400, Tom Lane wrote: Adrian Maier [EMAIL PROTECTED] writes: I have just tried to compile postgresql 7.4.17 on a HP-UX 11.11 box (PA-RISC) and the compliation fails in gist. I do not think anyone cares about making 7.4.x run on platforms it did not support before. Use a newer PG release. Actually, this is PA-RISC, not the numerous emails we've gotten this week about HPUX on ia64 (what is it with the HPUX guys this week?) http://www.postgresql.org/docs/7.4/interactive/supported-platforms.html indicates that HPUX on PA-RISC should work... -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpGfcyDpEN3F.pgp Description: PGP signature
Re: [HACKERS] crypting prosrc in pg_proc
On Thu, Aug 09, 2007 at 11:41:02AM -0400, Andrew Dunstan wrote: Decibel! wrote: This is also related to the desire to be able to restrict access to the catalog tables. Doing so could potentially solve this problem; it solves other issues (such as being able to see all the databases that exist on a server, something that hosting environments care about). You can hide the catalogs, albeit at the cost of some functionality. I did some experimentation a couple of years back with removing public access from the catalogs, removing information_schema and the public schema, etc, and it worked quite well. I set up a user who had access to a single schema, which only contained functions, and the user wasn't able (so far as I could determine) to see anything other than those functions - no tables, no catalogs, no databases, no users. The user was still able to function exactly as intended. The intended scenario was for a web app user, where the web server was subverted, the aim being to restrict the amount of information the intruder could steal. That doesn't help with information leaking in shared hosting setups, I agree. No, but that combined with row-level security might. Actually, if we had a standard set of views that all the tools were expected to use instead of the raw catalogs, it wouldn't be hard at all to secure things in a hosted environment. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpIPQ0HPLZrD.pgp Description: PGP signature
Re: [HACKERS] crypting prosrc in pg_proc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hans-Juergen Schoenig wrote: On Aug 9, 2007, at 5:03 PM, Greg Smith wrote: On Thu, 9 Aug 2007, Andrew Dunstan wrote: There are also some fairly impressive code obfuscators about, that your clients might find useful. All they really need is to find a sufficiently clever PL/Perl programmer. we should make this a PL/Brainfuck implementation then ;) PL/Whitespace http://compsoc.dur.ac.uk/whitespace/ hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGu6dbATb/zqfZUUQRAm5lAKCvNjklINez5AfioKTJ2FT4BeZ1yQCfbvXz //mGwmoSYWsiAbdAtOR9/vU= =QaI0 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Wrote a connect-by feature
Bertram Scharpf wrote: Hi, Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf: Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf: http://www.bertram-scharpf.de/tmp/connectby.tar.gz Nobody's answering just a line. Is it such a bore? Isn't connect by the Oracle non-standard way of doing recursive queries? Please review the mailing list archives on this subject. Also, many people have their heads down trying to punch out a release, right now. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Wrote a connect-by feature
Hi, Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf: Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf: http://www.bertram-scharpf.de/tmp/connectby.tar.gz Nobody's answering just a line. Is it such a bore? Bertram -- Bertram Scharpf Stuttgart, Deutschland/Germany http://www.bertram-scharpf.de ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] createlang/droplang -l outputs
Tomoaki Sato [EMAIL PROTECTED] writes: The header in the list of already installed languages shown by createlang/droplang with the -l option is not printed from 8.2. Ooops. I seem to have missed these uses of printQuery() when we put in the FETCH_COUNT patch last summer :-(. + popt.topt.start_table = true; It looks like stop_table has to be set true as well to match the former formatting; compare http://developer.postgresql.org/cvsweb.cgi/pgsql/src/bin/psql/startup.c.diff?r1=1.136;r2=1.137 Patched in HEAD and 8.2 branch. Thanks for the report! regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Unexpected VACUUM FULL failure
Simon Riggs [EMAIL PROTECTED] writes: Good hunch. I plugged this hole earlier, but on further inspection I can see the plug wasn't wide enough. XLogAsyncCommitFlush() is good enough, but HeapTupleSatisfiesVacuum() still allowed the inexact bookkeeping to sometimes skip hint bit setting, when executed with concurrent transactions touching other tables. ISTM that if we call HeapTupleSatisfiesVacuum() with an additional boolean parameter, force, we can tell VF to always set the hint bits in every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT. Surely this approach is no good: won't it allow hint bits to reach disk in advance of their transaction? I think it'd be safer, and a lot less ugly, to recast the tests in VACUUM FULL. If we make the first pass clear any old MOVED_IN/MOVED_OUT bits then the last pass can key off those instead of assuming that XMIN_COMMITTED is set everywhere. Then we'd not need XLogAsyncCommitFlush, which is a kluge anyway. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Wrote a connect-by feature
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Andrew Dunstan wrote: Bertram Scharpf wrote: Hi, Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf: Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf: http://www.bertram-scharpf.de/tmp/connectby.tar.gz Nobody's answering just a line. Is it such a bore? I have it downloaded but have not had time to test it. Isn't connect by the Oracle non-standard way of doing recursive queries? Please review the mailing list archives on this subject. Also, many people have their heads down trying to punch out a release, right now. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGu7mdATb/zqfZUUQRAgFSAJ0U2bPpRfV1G1TXbyBmakwqC4WfhwCcD3B2 wNWKLIF7FqFVo9KMVKMXtt8= =LPls -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] pgcheck - data integrity check
Hi, I am working on a data integrity check tool (pgcheck). I would like to discuss the following issues: Right now I am working on a function, which should check the validity of pages in relation. The relation is passed to the function as its argument (its oid). For the integrity check of a page, I am using an AccessShare lock on a relation as you can see on http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgcheck/pokus/pgcheck_page/ . In near future, I would like to extend the functionality also with a recovery tool which would be able to repair broken pages. Should the function also repair the corrupted data on a page, the AccessShare lock on a relation would not be sufficient. But on the other hand AccessExclusive lock on the entire relation could significantly influence the performance of a database. So far I see these possibilities: 1- use AccessShare lock for the integrity check function and in case of faulty page, pass this page to a special function, which would lock the page using AccessExclusiveLock on the relation for correction. + it would not influence the performance so much - higher complexity 2- use one function which would lock the relation with AccessExclusive lock, check the integrity of data and in case of faulty pages, it would repair it at once. + easier to implement - could cause performance downturn because of relatively long Exclusive lock on some relations. 3- use the AccessShare lock for the integrity check and use special single-user mode for recovery of data + safest option for recovery of data - long down time of database Furhter, I would like to know your opinions on what should be checked next in order to check the integrity of data in database. I am thinking of checking: -in case of variable-length data, compare the formal and actual size of data -check whether constrains applied on items are fulfilled -compare data in indexes with indexed tables, whether they are correct Robert P.S. Any comments to the c-funtion I made so far a welcome ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Unexpected VACUUM FULL failure
Tom Lane [EMAIL PROTECTED] writes: Simon Riggs [EMAIL PROTECTED] writes: ISTM that if we call HeapTupleSatisfiesVacuum() with an additional boolean parameter, force, we can tell VF to always set the hint bits in every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT. Surely this approach is no good: won't it allow hint bits to reach disk in advance of their transaction? I don't think so since it sounds like he's saying to still sync the log and VACUUM FULL has an exclusive lock on the table. So any committed (or aborted) changes it sees in the table must have been committed or aborted before the log sync. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX
On 8/10/07, Tom Lane [EMAIL PROTECTED] wrote: Decibel! [EMAIL PROTECTED] writes: On Thu, Aug 09, 2007 at 11:46:47AM -0400, Tom Lane wrote: Adrian Maier [EMAIL PROTECTED] writes: I have just tried to compile postgresql 7.4.17 on a HP-UX 11.11 box (PA-RISC) and the compliation fails in gist. I do not think anyone cares about making 7.4.x run on platforms it did not support before. Use a newer PG release. Actually, this is PA-RISC, not the numerous emails we've gotten this week about HPUX on ia64 (what is it with the HPUX guys this week?) Oh, my mistake --- obviously hadn't consumed enough caffeine this morning. [ digs around a bit... ] However, I might have been right for the wrong reasons. I'm thinking Adrian is trying to build for 64-bit HPPA (a beast the 7.4 supported-platforms matrix doesn't know about), and is falling foul of this problem: http://archives.postgresql.org/pgsql-hackers/2004-03/msg01196.php Teodor fixed that in the 8.0 devel cycle, but couldn't back-patch it because it meant an on-disk layout change of gist indexes. What I suspect is that gcc 3.3.3 doesn't have the specific error checks I complained of in the above message, but simply generates bogus assembly code for the incorrect C code :-( So it looks like this is simply a known issue that hasn't been backpatched. I'll simply use 8.2 . Thanks for your answers, Adrian Maier ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX
Adrian Maier [EMAIL PROTECTED] writes: On 8/10/07, Tom Lane [EMAIL PROTECTED] wrote: [ digs around a bit... ] However, I might have been right for the wrong reasons. I'm thinking Adrian is trying to build for 64-bit HPPA (a beast the 7.4 supported-platforms matrix doesn't know about), and is falling foul of this problem: http://archives.postgresql.org/pgsql-hackers/2004-03/msg01196.php So it looks like this is simply a known issue that hasn't been backpatched. You didn't confirm in so many words: were you trying to build 64-bit? I'd have expected a 32-bit build to work. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly