Re: [GENERAL] Limit on number of users in postgresql?
The more I think about it, the more I think a proxy app is necessary. It seems like a lot of work just for security issues, but basically most web-based database apps use this model, with the web application acting as a proxy between the database and the client. This is how I've seen it done on almost every application I've worked on. If you have multiple apps hitting a single DB, usually each application has its own role, but user-level security is controlled at the application level. Although I don't think there's anything *wrong* with having a role per user (it could provide an extra layer of security), I think it's much more flexible to define security in the application/business-logic layer. That being said, we shouldn't get too wound up over this limitation of PostgreSQL until someone finds that there really is some real-world performance issue. AFAIK, everything in this thread is theoretical. Cheers, Jeremy Haile ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Stats collector frozen?
Will do - thanks Magnus! I'll test it for a while and post the results here. Jeremy Haile On Sat, 27 Jan 2007 17:21:23 +0100, Magnus Hagander [EMAIL PROTECTED] said: Jeremy Haile wrote: Using standard build (none of the things you mentioned) on 8.2.1 currently. I really appreciate it! Ok. I've built a binary for you at http://www.hagander.net/download/pgsql_8.2.1_win32select.zip Note that while I named the file 8.2.1 it really isn't - it's current head of the REL_8_2_STABLE branch. Which means it has a few other fixes as well. I think it *should* work to just replace the postgres.exe file and ignore the rest, so I didn't put them in the ZIP. But as always, make sure you take a backup first :-) //Magnus
Re: [GENERAL] Stats collector frozen?
So far, the patched binary seems to have fixed the statistics issue in my environment. pgstat.stat is now updating constantly, pg_stat_user_tables is updating correctly, and autovacuum is running! Thanks for your quick response to this issue! Let me know if there is anything else I can do to assist. Jeremy Haile On Sat, 27 Jan 2007 11:23:39 -0500, Jeremy Haile [EMAIL PROTECTED] said: Will do - thanks Magnus! I'll test it for a while and post the results here. Jeremy Haile On Sat, 27 Jan 2007 17:21:23 +0100, Magnus Hagander [EMAIL PROTECTED] said: Jeremy Haile wrote: Using standard build (none of the things you mentioned) on 8.2.1 currently. I really appreciate it! Ok. I've built a binary for you at http://www.hagander.net/download/pgsql_8.2.1_win32select.zip Note that while I named the file 8.2.1 it really isn't - it's current head of the REL_8_2_STABLE branch. Which means it has a few other fixes as well. I think it *should* work to just replace the postgres.exe file and ignore the rest, so I didn't put them in the ZIP. But as always, make sure you take a backup first :-) //Magnus
Re: [GENERAL] Stats collector frozen?
Apparently there is a bug lurking somewhere in pgwin32_select(). Because if I put a #undef select right before the select in pgstat.c, the regression tests pass. Maybe the problem is related to the fixed bug in pgwin32_waitforsinglesocket()? WaitForMultipleObjectsEx might sleep indefinitely while waiting for a socket to become writable, so maybe there is a symmetrical problem with read? Or is pgwin32_select() used for waiting on writes too? pgwin32_waitforsinglesocket() appears to work fine. And we only use FD_READ, so it's not affected by your patch from what I can tell. I've got it passing tests with select replaced with waitforsinglesocket - now I just need to implement timeout in that one :-) Attached patch seems to solve the problem on my machine at least. Uses pgwin32_waitforsinglesocket() instead of pgwin32_select(). Changes pgwin32_waitforsinglesocket() to accept the timeout as a parameter (this is why it touches files outside of the stats area). Magnus - thanks for your investigation and work! Any chance I could get a patched exe for win32 and test on my servers?
Re: [GENERAL] Stats collector frozen?
We have had lots of reports of issues with the stats collector on Windows. Some were definitely fixed by the patch by OT, but I don't think all. Here were a couple of other reports I found: http://archives.postgresql.org/pgsql-hackers/2006-09/msg00415.php http://archives.postgresql.org/pgsql-hackers/2006-04/msg00127.php The thing is, since it didn't give any error messages at all, most users wouldn't notice. Other than their tables getting bloated, in which case they would do a manual vacuum and conclude autovacuum wasn't good enough. Or something. This is indeed what I assumed at first. I started running vacuum analyze hourly and turned off autovacuum. Later, I decided to try the autovacuum route again and investigated why it wasn't working well. Magnus - could you send me a patched exe to try in my environment? Would it be compatible with 8.2.1?
Re: [GENERAL] [Fwd: [PORTS] M$ SQL server DTS package equivalent in
I've also used Pentaho Data Integration (previously known as Kettle) quite extensively, and can recommend it. It supports many different databases and has fairly good documentation (although thin in some areas). It has a GUI drag-and-drop tool that can be used to configure transformations and is very flexible. It also has an active community that responds when you have issues. I use it as part of a regular job that runs every 5 minutes and hourly to copy and transform data from a SQL Server DB to a PostgreSQL DB. I use COPY when I can simply select data into a CSV and load it into another DB - but as Tomi said, when you have to do primary key generation, row merging, data cleanup, and data transformations - I would use some sort of ETL tool over just SQL. My 2 cents, Jeremy Haile On Fri, 26 Jan 2007 15:14:22 +0000, Tomi N/A [EMAIL PROTECTED] said: Besides being easy to schedule and very flexible, manipulating data with queries is extremely powerful and fairly easy to maintain assuming you know a little SQL -- thanks to postgresql's huge array of built in string manipulation functions. Your skills learned here will pay off using the database as well for other things. Not only that, but this approach will be fast since it is declarative and handles entire tables at once as opposed to DTS-ish solutions which tend to do processing record by record. Not to mention they are overcomplicated and tend to suck. (DTS does have the ability to read from any ODBC source which is nice...but that does not apply here). Different strokes for different folks, it seems. I'd argue that COPY followed by a barrage of plpgsql statements can't be used for anything but the most trivial data migration cases (where it's invaluable) where you have line-organized data input for a handful of tables at most. 
In my experience (which is probably very different from anyone else's), most real world situations include data from a number of very different sources, ranging from the simplest (.csv and, arguably, .xml) to the relatively complex (a couple of proprietary databases, lots of tables, on-the fly row merging, splitting or generating primary keys, date format problems and general pseudo-structured, messed up information). Once you've got your data in your target database (say, pgsql), using SQL to manipulate the data makes sense, but it is only the _final_ step of an average, real world data transformation. Cheers, t.n.a.
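To make the per-row work Tomi describes concrete (key generation, row merging, date fix-ups before a bulk load), here is a minimal sketch in plain Python. The schema and column names are hypothetical, invented for illustration only; real ETL tools like Kettle do this with configurable steps rather than code.

```python
import csv
import io

def transform(rows):
    """Clean rows from a messy source feed and assign surrogate keys.

    Hypothetical input: (name, amount, date 'M/D/YYYY') tuples.
    Output: (id, name, amount, date 'YYYY-MM-DD'), ready for a bulk load.
    """
    seen = set()
    out = []
    next_id = 1                       # surrogate primary key generation
    for name, amount, date in rows:
        name = name.strip()           # data cleanup
        if name in seen:              # on-the-fly row merging: drop duplicates
            continue
        seen.add(name)
        m, d, y = date.split("/")     # date format problem: normalize to ISO
        out.append((next_id, name, float(amount), f"{y}-{m:0>2}-{d:0>2}"))
        next_id += 1
    return out

def to_copy_text(rows):
    """Render rows as tab-delimited text, the shape COPY ... FROM expects."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Once the rows are in this shape, the fast COPY path applies again; it is only the upstream cleanup that the pure-SQL approach struggles with.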
Re: [GENERAL] Can you specify the pg_xlog location from a config file?
This utility is useful for creating junctions in Windows: http://www.microsoft.com/technet/sysinternals/FileAndDisk/Junction.mspx I am using this to symlink my pg_xlog directory to another disk and it works great. Jeremy Haile On Fri, 26 Jan 2007 18:27:04 +0000, Roman Neuhauser [EMAIL PROTECTED] said: # [EMAIL PROTECTED] / 2007-01-26 09:21:27 -0800: Windows doesn't support symlinks. Is it possible instead for there to be a config file that lets one set where the pg_xlog directory will sit? Windows has junction points. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991
Re: [GENERAL] Stats collector frozen?
Applied to HEAD and 8.2 --- assuming the Windows buildfarm machines go green, we should probably consider back-porting this to 8.1 and 8.0. Not trying to be a nuisance, but I'd really like to try this out in my environment and see if my problems disappear. Is there anyone out there who could provide me with a patched exe for Win32? If not, I could try to get my system setup to build for Windows, but I'm not sure what all that would involve. Thanks, Jeremy Haile
Re: [GENERAL] Stats collector frozen?
Using standard build (none of the things you mentioned) on 8.2.1 currently. I really appreciate it! On Fri, 26 Jan 2007 21:24:09 +0100, Magnus Hagander [EMAIL PROTECTED] said: Jeremy Haile wrote: Applied to HEAD and 8.2 --- assuming the Windows buildfarm machines go green, we should probably consider back-porting this to 8.1 and 8.0. Not trying to be a nuisance, but I'd really like to try this out in my environment and see if my problems disappear. Is there anyone out there who could provide me with a patched exe for Win32? If not, I could try to get my system setup to build for Windows, but I'm not sure what all that would involve. I'll see if I can build you something tomorrow. You're on 8.2, right? Do you use any features like Kerberos, SSL or NLS? I don't think I have them set up properly in my mingw build env, so it'd be easier if I could build without them. Unless beaten by someone who has a complete env ;-) //Magnus
Re: [GENERAL] Stats collector frozen?
Tom, Did this information shed any light on what the problem might be? Any solution or workaround? Thanks! Jeremy Haile On Wed, 24 Jan 2007 14:19:05 -0500, Jeremy Haile [EMAIL PROTECTED] said: pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp. Coincidentally (I think not) - the last auto-analyze was performed at 2007-01-22 12:24:11.424-05. The logs for 1/22 are empty - so no errors or anything like that to give clues... Thanks! Jeremy Haile On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: The stats collector appears to still be running, since I can see a postgres.exe process with -forkcol. However, I never notice it using I/O or CPU usage. Also, querying the pg_stat_user_tables view shows no change in table stats even for tables that change very frequently. Is $PGDATA/global/pgstat.stat getting updated? (watch the file mod time at least) Do you see any pgstat.tmp file in there? regards, tom lane
Re: [GENERAL] Stats collector frozen?
Unfortunately I don't have any debugging tools installed that would work against postgres - although I'd be glad to do something if you could tell me the steps involved. I can reproduce the issue quite easily on two different Windows machines (one is XP, the other is 2003). Please let me know if there is anything else I can do to help debug this problem. Do you know of any workaround other than restarting the whole server? Can the collector be restarted individually? Thanks, Jeremy Haile On Thu, 25 Jan 2007 12:42:11 -0500, Tom Lane [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: Did this information shed any light on what the problem might be? It seems to buttress Magnus' theory that the intermittent (or not so intermittent) stats-test buildfarm failures we've been seeing have to do with the stats collector actually freezing up, rather than just not reacting fast enough as most of us (or me anyway) thought. But why that is happening remains anyone's guess. I don't suppose you have debugging tools that would let you get a stack trace from the collector process? regards, tom lane
Re: [GENERAL] Stats collector frozen?
Then just pick it up in Task Manager or Process Explorer or whatever and kill it off. Just make sure you pick the right process. I mentioned earlier that killing off the collector didn't work - however I was wrong. I just wasn't giving it enough time. If I kill the postgres.exe -forkcol process, it does get restarted, although sometimes it takes a minute. Since it only seems to update pgstat.stat once after restarting, I'd need to kill it once a minute to keep my statistics up to date =) So, unfortunately it's not a great workaround to my problem. Jeremy Haile
Re: [GENERAL] Stats collector frozen?
I'll try to put together a test case for hackers, although I'm not sure what exactly causes it. Basically, when I fire up PostgreSQL - after about a minute the stats collector runs once (pgstat.stat is updated, autovacuum fires up, etc.) - and then the collector seems to hang. If I watch its performance information, it does not read or write to disk again and pgstat.stat is never updated again. It never updates pgstat.stat more than once after restart. There are no errors in the log. I tried killing the collector a variety of ways on Windows, but the collector never seemed to come back afterward. I don't see a kill program for Windows that lets me specify the signal to use. So other than restarting PostgreSQL, I'm not sure how to work around this problem. If anyone else is experiencing similar problems, please post your situation. On Thu, 25 Jan 2007 12:51:31 -0500, Tom Lane [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: Unfortunately I don't have any debugging tools installed that would work against postgres - although I'd be glad to do something if you could tell me the steps involved. I can reproduce the issue quite easily on two different Windows machines (one is XP, the other is 2003). Sorry, I don't know anything about Windows debugging either. Can you put together a test case that would let one of the Windows-using hackers reproduce it? Do you know of any workaround other than restarting the whole server? Can the collector be restarted individually? On Unix you can just 'kill -TERM' the collector process and the postmaster will start a new one without engaging in a database panic cycle. Dunno what the equivalent is on Windows but it's probably possible. regards, tom lane
Re: [GENERAL] Stats collector frozen?
AFAIR (Magnus can surely confirm) there were some other tables that weren't showing stats as all zeros -- but there's no way to know whether those numbers were put there before the collector had frozen (if that's really what's happening). Yeah - I have numbers that updated before the stats collector started freezing. Do you know which version of PG this started with? I have upgraded 8.1.3, 8.1.4, 8.2, and 8.2.1 in the past months and I didn't have the collector enabled until 8.2.1 - so I'm not sure how long this has been a problem. I might try rolling back to a previous version - it's either that or set up a scheduled vacuum analyze until we figure out this problem. I'm having to manually run it every day now... =) I think this is a pretty critical problem since it cripples autovacuum on Windows. Are you guys in a position to debug the collector process and see where it is freezing (ala Tom's earlier comment)? Anything I can do to help debug this problem faster?
[GENERAL] Stats collector frozen?
I've noticed that my tables are not being auto-vacuumed or analyzed regularly, even though I have very aggressive autovacuum settings. The stats collector appears to still be running, since I can see a postgres.exe process with -forkcol. However, I never notice it using I/O or CPU. Also, querying the pg_stat_user_tables view shows no change in table stats even for tables that change very frequently. I see errors such as these in the log every now and then - not sure if they are related. These have been discussed at length in other posts and seem to have something to do with PG holding onto old file handles (Windows specific):

2007-01-24 06:24:16 ERROR: could not open relation 1663/16404/333779: Permission denied
2007-01-24 06:24:17 ERROR: could not open relation 1663/16404/333779: Permission denied
2007-01-24 06:24:18 ERROR: could not open relation 1663/16404/333779: Permission denied
2007-01-24 06:24:19 ERROR: could not open relation 1663/16404/333779: Permission denied

I'm running PG 8.2.1 on Windows. Here is some of the output from "show all":

autovacuum;on
autovacuum_analyze_scale_factor;0.02
autovacuum_analyze_threshold;250
autovacuum_freeze_max_age;2
autovacuum_naptime;1min
autovacuum_vacuum_cost_delay;-1
autovacuum_vacuum_cost_limit;-1
autovacuum_vacuum_scale_factor;0.08
autovacuum_vacuum_threshold;500
stats_block_level;on
stats_command_string;on
stats_reset_on_server_start;off
stats_row_level;on
stats_start_collector;on
vacuum_cost_delay;20ms
vacuum_cost_limit;200
vacuum_cost_page_dirty;20
vacuum_cost_page_hit;1
vacuum_cost_page_miss;10
vacuum_freeze_min_age;1
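For reference, whether settings like those above should be firing autovacuum is easy to compute by hand. The documented rule is that a table is vacuumed once its dead-tuple count exceeds autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor times the table's row count; a quick sketch with the values from the "show all" output:

```python
def autovacuum_trigger(reltuples, dead_tuples,
                       vacuum_threshold=500, vacuum_scale_factor=0.08):
    """Return True when autovacuum would vacuum the table.

    Implements the documented trigger rule: vacuum once dead tuples exceed
    vacuum_threshold + vacuum_scale_factor * reltuples. Defaults here match
    the settings shown in the 'show all' output above, not stock defaults.
    """
    return dead_tuples > vacuum_threshold + vacuum_scale_factor * reltuples

# With these settings, a 1,000,000-row table qualifies for vacuum after
# roughly 80,500 dead tuples (500 + 0.08 * 1,000,000).
```

So with any real write traffic these tables should be vacuumed constantly, which points the finger at the frozen collector rather than the settings.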
Re: [GENERAL] Stats collector frozen?
pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp. Coincidentally (I think not) - the last auto-analyze was performed at 2007-01-22 12:24:11.424-05. The logs for 1/22 are empty - so no errors or anything like that to give clues... Thanks! Jeremy Haile On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: The stats collector appears to still be running, since I can see a postgres.exe process with -forkcol. However, I never notice it using I/O or CPU usage. Also, querying the pg_stat_user_tables view shows no change in table stats even for tables that change very frequently. Is $PGDATA/global/pgstat.stat getting updated? (watch the file mod time at least) Do you see any pgstat.tmp file in there? regards, tom lane
Re: [GENERAL] Stats collector frozen?
Searching the archives, I found a couple of 2006 posts that seem somewhat related to my problem (although I don't see any solutions listed...): http://archives.postgresql.org/pgsql-hackers/2006-09/msg00415.php http://archives.postgresql.org/pgsql-hackers/2006-04/msg00127.php Tom, since you were involved in these - did you ever figure out how to resolve the issues of the stats collector getting stuck in Windows? Thanks, Jeremy Haile On Wed, 24 Jan 2007 14:19:05 -0500, Jeremy Haile [EMAIL PROTECTED] said: pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp. Coincidentally (I think not) - the last auto-analyze was performed at 2007-01-22 12:24:11.424-05. The logs for 1/22 are empty - so no errors or anything like that to give clues... Thanks! Jeremy Haile On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: The stats collector appears to still be running, since I can see a postgres.exe process with -forkcol. However, I never notice it using I/O or CPU usage. Also, querying the pg_stat_user_tables view shows no change in table stats even for tables that change very frequently. Is $PGDATA/global/pgstat.stat getting updated? (watch the file mod time at least) Do you see any pgstat.tmp file in there? regards, tom lane
Re: [GENERAL] Postgresql.conf
But there are ways that we could optimize count(*) queries for specific circumstances right? Obviously this isn't trivial, but I think it would be nice if we could maintain a number-of-rows count that could be used when performing a count(*) on the whole table (no where clause). I don't know if the overhead of keeping track of that number is worth the benefits - but I know that querying for the number of rows in a table is a common need and other RDBMSs do optimize for that special case. On Tue, 23 Jan 2007 12:53:43 -0600, Bruno Wolff III [EMAIL PROTECTED] said: On Tue, Jan 23, 2007 at 10:12:13 -0500, Brandon Aiken [EMAIL PROTECTED] wrote: Out of curiosity, has the COUNT(*) with no WHERE clause slowness been fixed in 8.x? Or is it still an issue of there's no solution that won't harm aggregates with WHERE clauses? Probably not in the sense that you mean. The underlying problem is that in MVCC there is no single global answer to the question and the pain of maintaining the multiple answers outweighs the cost of doing so in normal usage. People that need to run count(*) queries a lot may want to make a different trade off and some ways of maintaining counts are covered in the archives.
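The trade-off Bruno describes - paying a little on every write so that count(*) is free - can be sketched abstractly. This is plain Python standing in for what would, in PostgreSQL, be a one-row counter table maintained by insert/delete triggers (one of the approaches covered in the archives); it is an illustration of the bookkeeping, not a PostgreSQL feature:

```python
class CountedTable:
    """Toy model of a table with a trigger-maintained row counter.

    Every insert and delete pays a small bookkeeping cost so that
    count() is O(1) instead of a full scan - the same trade-off a
    trigger-maintained counter table makes.
    """
    def __init__(self):
        self._rows = []
        self._count = 0      # the maintained counter

    def insert(self, row):
        self._rows.append(row)
        self._count += 1     # "ON INSERT" trigger: bump the counter

    def delete(self, pred):
        kept = [r for r in self._rows if not pred(r)]
        self._count -= len(self._rows) - len(kept)   # "ON DELETE" trigger
        self._rows = kept

    def count(self):
        return self._count   # no table scan needed
```

The MVCC wrinkle Bruno points at is that the single `_count` here has no notion of which transactions should see which rows, which is exactly why a global counter is not free in PostgreSQL.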
Re: [GENERAL] Index bloat of 4x
Is it feasible to add a reindex concurrently that doesn't lock the table for the rebuild, then locks the table when doing a second pass to pick up rows that were changed after the first pass? Or something like that. On Fri, 19 Jan 2007 12:45:03 -0500, Tom Lane [EMAIL PROTECTED] said: Ed L. [EMAIL PROTECTED] writes: Online index creation definitely helps us toward 24x7. But wouldn't we still have to drop the old index, thus blocking production queries? Yes, but only for a very short period. regards, tom lane
Re: [GENERAL] Multiple column index usage question
That's interesting. So if you have a composite index on two columns, is there much of a reason (usually) to create single indexes on each of the two columns? I guess the single indexes might be slightly faster depending on the number of different values/combinations, so probably it depends eh? On Fri, 19 Jan 2007 16:57:42 -0600, Ron Johnson [EMAIL PROTECTED] said: On 01/19/07 15:53, Jan Muszynski wrote: Rather simple question, of which I'm not sure of the answer. If I have a multiple column index, say: Index index1 on tableA (foo,bar) and I then: Select * from tableA where foo = some value Will index1 be used, or am I looking at a seqscan in all circumstances? Yes, it will use the index. However, in earlier versions, the lvalue and rvalue needed to match in type to use the index.
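The reason a (foo, bar) index serves a foo-only predicate is that the index is ordered by foo first, then bar, so all entries for a given foo are contiguous. A sketch with Python's bisect module over a sorted list of tuples shows the leading-column range scan (integer keys assumed so the `foo + 1` upper bound works; this is an illustration of the ordering, not of PostgreSQL's btree code):

```python
import bisect

# A composite index on (foo, bar) is ordered by foo first, then bar.
index = sorted([(1, "x"), (1, "y"), (2, "a"), (3, "b"), (3, "c")])

def lookup_foo(foo):
    """Range-scan the composite index using only the leading column."""
    lo = bisect.bisect_left(index, (foo,))       # first entry with this foo
    hi = bisect.bisect_left(index, (foo + 1,))   # first entry past this foo
    return index[lo:hi]
```

Note the symmetry fails the other way: entries for a given bar value are scattered throughout the structure, which is why the composite index is of little use for a bar-only predicate and a separate index on bar may still be worthwhile.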
Re: [GENERAL] GUID/UUID Support
I second the desire for a UUID type in PostgreSQL! I'm aware of the pguuid project, but it's not the same as having it in core and isn't very well maintained. This is such a common database paradigm that it seems reasonable to promote it to first-class citizen status in PostgreSQL. I currently use varchars for UUIDs, but the table size, index size, integrity (validation), and performance would be better if PostgreSQL supported UUIDs directly. On Thu, 18 Jan 2007 10:31:26 -0700, Patrick Earl [EMAIL PROTECTED] said: One issue is that UUIDs are only 16 bytes of data. To store them as text in canonical form requires 36 bytes. As there are alternate frequently used representations, you also run into potential issues with input. The GUID type (proposed by Gevik) handles those standard input variations. Though I haven't tried it, I would imagine there would be performance implications when using 36 character keys everywhere to do indexing, joins, etc. Another issue is that higher level languages (such as Delphi and .NET) have GUID field types built in. If the field is just a string field, it won't map nicely to those higher level types. Patrick On 1/17/07, Chad Wagner [EMAIL PROTECTED] wrote: On 1/17/07, Patrick Earl [EMAIL PROTECTED] wrote: Certainly support for the GUID field type itself is most important. As for the generators, though they are non-essential, they are very useful. Other platforms and libraries have standardized on uuid generators, so I don't see why PostgreSQL can't. Maybe I am oblivious to the reason, but why is there a need for a special data type for GUID/UUIDs? Wouldn't you always be doing an equality anyways? Wouldn't a varchar suffice? 
-- Chad http://www.postgresqlforums.com/
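Patrick's two points (size and input variations) are easy to check with Python's standard uuid module: a UUID is 16 raw bytes, its canonical text form is 36 characters, and several distinct strings denote the same value, which a plain varchar column would store as different keys:

```python
import uuid

u = uuid.uuid4()
assert len(u.bytes) == 16    # native binary representation: 16 bytes
assert len(str(u)) == 36     # canonical text: 8-4-4-4-12 hex plus hyphens

# The "alternate representations" problem: all three strings below denote
# the same UUID, but a varchar column would treat them as distinct values.
s = str(u)
variants = {s, s.upper(), "{%s}" % s}
assert len({uuid.UUID(v) for v in variants}) == 1
```

So a text-typed key more than doubles the width of every key and index entry, before even considering the normalization and validation a dedicated type would provide.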
Re: [GENERAL] GUID/UUID Support
Yeah, but it's not going to be added to core until there's some agreement about *what* needs to be added. The point of the external project is that once it has achieved a level of support *then* it can be incorporated. That's fair. In truth, I only found that pguuid existed fairly recently and haven't had a chance to try it out. I've been scared away a bit from using it due to threads I've read about problems and limitations with it - also there are bug reports listed on the site which haven't been updated in over a year. Still - I'd like to give it a try myself at some point. Has anyone here had a great experience using pguuid?
[GENERAL] Diagnosing deadlock / connection hang
I have a process that appears to hang every night. I ran the following query and results, and it looks like an autoanalyze and query are waiting on a lock that's being exclusively held by a transaction that is <IDLE>. Any ideas? Any additional queries I should run to shed light on the issue?

SELECT pg_database.datname AS database, pg_class.relname AS table,
       transaction, pid, mode, granted, usename, current_query, backend_start
FROM pg_locks, pg_class, pg_database, pg_stat_activity
WHERE pg_locks.relation = pg_class.oid
  AND pg_locks.database = pg_database.oid
  AND pg_locks.pid = pg_stat_activity.procpid
ORDER BY pg_database.datname, pg_class.relname, granted

dashboard;activity_log_transition;18291984;1588;ShareUpdateExclusiveLock;f;postgres;ANALYZE public.activity_log_transition;2007-01-17 00:06:03.281-05
dashboard;activity_log_transition;18291982;2872;AccessShareLock;f;dashboard;select distinct strDescription from activity_log_transition;2007-01-17 00:05:03.281-05
dashboard;activity_log_transition;18291979;1472;ShareLock;t;dashboard;<IDLE> in transaction;2007-01-17 00:05:00.968-05
dashboard;activity_log_transition;18291979;1472;AccessExclusiveLock;t;dashboard;<IDLE> in transaction;2007-01-17 00:05:00.968-05
dashboard;activity_log_transition_pkey;18291979;1472;AccessExclusiveLock;t;dashboard;<IDLE> in transaction;2007-01-17 00:05:00.968-05
Re: [GENERAL] insert only unique values in to a table, ignore rest?
Note that things will go faster if you do your initial bulk data load using copy from stdin. Individual inserts in postgresql are quite costly compared to mysql. It's the transactional overhead. By grouping them together you can make things much faster. Copy from stdin does all the inserts in one big transaction. You could do copy from file as well, right? (No performance difference compared to copy from stdin.) I do this all the time. Also - maybe I misunderstand something, but why does PostgreSQL's implementation prohibit it from ignoring insert errors during a copy? If you added a unique constraint to the table before copying, PostgreSQL would generate errors due to the unique constraint violation - so I don't think any additional locking would be required for it to simply say "if there is an error while copying in, ignore it and continue inserting other rows". PostgreSQL's copy command doesn't currently support this, so the temp table followed by a distinct select is the way to go. But I didn't follow all of the talk about it requiring locking the table and being inherently impossible for PostgreSQL to support. I've wanted a similar feature. I select rows into a table on a regular basis. I'd like to be able to overlap old values and have PostgreSQL ignore failed inserts. SQL Server offers a flag that allows you to ignore inserts whose primary key already exists in the table. The only solution in PostgreSQL is to run a query to manually delete the duplicate rows from a temp table before inserting - which takes much more time.
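The temp-table-plus-DISTINCT workaround amounts to filtering out rows whose key already exists (and duplicates within the batch itself) before the bulk load. A minimal sketch of that filtering step in Python, with a hypothetical key-extraction function defaulting to the first column:

```python
def rows_to_insert(new_rows, existing_keys, key=lambda r: r[0]):
    """Emulate the temp-table workaround: keep only rows whose primary
    key is not already in the target table, and de-duplicate the incoming
    batch itself, keeping the first occurrence of each key.
    """
    seen = set(existing_keys)
    out = []
    for r in new_rows:
        k = key(r)
        if k in seen:       # would violate the unique constraint: skip
            continue
        seen.add(k)
        out.append(r)
    return out
```

In SQL this is the extra round trip (load into a temp table, then INSERT ... SELECT DISTINCT the non-conflicting rows) that an "ignore errors" flag on COPY would make unnecessary.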
Re: [GENERAL] Database versus filesystem for storing images
I wonder why this HTTP cache headers argument didn't surface in this heated debate. I mentioned this earlier as well. Although you could do it in the app layer - it would be easier to just let the web server handle it.
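For what it's worth, "handling it" means emitting the standard caching headers on image responses, whether the web server or the app does it. A sketch of the two relevant headers using only the Python standard library (the max-age value is just an example):

```python
import time
from wsgiref.handlers import format_date_time

def image_cache_headers(max_age=86400):
    """Build HTTP response headers that let browsers and proxies cache
    an image for max_age seconds without re-requesting it."""
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        "Expires": format_date_time(time.time() + max_age),
    }
```

When images live on the filesystem, the web server emits these (plus Last-Modified from the file's mtime) with no code at all; an app serving images out of the database has to set them itself, which is part of the caching argument for the filesystem approach.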
Re: [GENERAL] Database versus filesystem for storing images
It's almost always better to store the images on the file system and just store the filename or relative path in the database. This is more efficient, doesn't bloat the database by storing files in it, and is easier to get proper browser caching behavior (depending on how your app is set up). I try to avoid BLOBs whenever possible. Cheers, Jeremy Haile On Fri, 5 Jan 2007 17:18:10 -0200, Clodoaldo [EMAIL PROTECTED] said: 5 Jan 2007 06:59:18 -0800, imageguy [EMAIL PROTECTED]: I think I know the answer, If you know the answer please tell it as I have read some discussions on the web and although I have decided on a solution I'm still not sure about the best answer, if there is a best answer after all. but if you don't have an application server - ie a webserver, etc, Yes I have an application server, the Apache server. and many of the workstations/clients that need access to the images but may not have access to a network share, network share? I don't understand. The images will be loaded by html pages with the img tag like in <img src="http://domain.com/images/xxx.jpg"> isn't the database the only choice? No. It is one of the choices. The other is to store the images in the file system, in a directory readable by Apache. - or is there a postgresql function/utility that will serve the file from the file system based on the reference/link embedded in the database? I think some procedure languages can read files. In this case what would be the gain in introducing a middle man, the db server? Regards, -- Clodoaldo Pinto Neto
Re: [GENERAL] Database versus filesystem for storing images
How does it make it easier to control access and security? If your web app makes a decision about allowing access to the database, it can just as easily make a decision about allowing access to the filesystem. Storing the images on the file system doesn't mean that there isn't a piece of code that determines whether or not users can access a particular file. I see security and access as a non-issue in making this decision. Either way, it's got to be controlled at the application level (if it is even necessary). On Fri, 05 Jan 2007 15:26:45 -0500, James Neff [EMAIL PROTECTED] said: ... and Moses said unto them, 'The eleventh commandment : thou shalt store images in a database!'... What if you had another database where you stored just the images and did not back it up if you don't want to? As an application developer, I like the idea of storing files and images in the database because it makes it much easier to control access and security from an application standpoint. I think Microsoft SQL Server stores blobs in a separate file, and only retains pointers in the actual database field for that blob. So when you SELECT on that blob MS SQL reads the external file for you as if it lived in the database. I don't know if Postgres does the same thing, but if it did, you wouldn't have to worry about bloating database files. Sounds like this is for an Apache web application. Think about how web sites like Flickr and Webshots store their images in a database. You could write a cool Apache mod so that the url: http://mycompany.com/images/01234.jpg; would go through this module, pull the appropriate image from the database and send it back; all the while the client is none the wiser. Just a thought. I think it's one of those things where there's no right or wrong answer. Instead you just have to do the minimum of what your application requires. If you don't need application-level control over the files, then by all means store them on the file system. 
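To make the "security is a non-issue" point concrete, here is a sketch of an application-level access check in front of file-system storage. The permission table, user names, and directory layout are all made up for illustration - the point is only that the same check a web app would make before handing out a database BLOB works just as well in front of a file.

```python
import os

# Hypothetical access rule: a user may only fetch images they own.
# IMAGE_ROOT would sit outside the web server's document root, so the
# only way to a file is through this check.
IMAGE_ROOT = "/var/app/images"
permissions = {"alice": {"1.jpg"}, "bob": {"2.jpg"}}

def serve_image(user, filename):
    # Deny unless this user is explicitly allowed to read this file.
    if filename not in permissions.get(user, set()):
        raise PermissionError("%s may not read %s" % (user, filename))
    # In a real app this path would be opened and streamed back to the client.
    return os.path.join(IMAGE_ROOT, filename)
```

Whether the bytes come from a `SELECT` or an `open()` is invisible to the caller; the gatekeeping code is identical.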
But if you need to control security then you have to prevent physical access to the file (which means no file system storage) and pull the image from the database through the application. My two cents, James John McCawley wrote: Don't store your images in the database. Store them on the filesystem and store their path in the database. Anyone that tells you otherwise is a stark raving madman :) My system is very heavily used, and our pg_dump is only a few gigs. Meanwhile our images/documents storage is well over a hundred gigs. I'd hate to think that I'd have to dump and restore 100 gigs every time I wanted to dump the newest data to the development database. As far as how they actually get to the client machine, typically these days people use web servers for this sort of thing. Clodoaldo wrote: 5 Jan 2007 06:59:18 -0800, imageguy [EMAIL PROTECTED]: I think I know the answer, If you know the answer please tell it, as I have read some discussions on the web and although I have decided on a solution I'm still not sure about the best answer, if there is a best answer after all. but if you don't have an application server - ie a webserver, etc, Yes I have an application server, the Apache server. and many of the workstations/clients that need access to the images may not have access to a network share, network share? I don't understand. The images will be loaded by html pages with the img tag like in <img src="http://domain.com/images/xxx.jpg"> isn't the database the only choice ? No. It is one of the choices. The other is to store the images in the file system, in a directory readable by Apache. - or is there a postgresql function/utility that will serve the file from the file system based on the reference/link embedded in the database ?? I think some procedural languages can read files. In this case what would be the gain in introducing a middle man, the db server? Regards, 
Re: [GENERAL] vacuum v. vacuumdb
Nope - the other way around. The vacuumdb tool simply executes the VACUUM command through postmaster. On Fri, 05 Jan 2007 15:05:44 -0600, Erik Jones [EMAIL PROTECTED] said: Quick question, when running a VACUUM query through the postmaster, does it use the external vacuumdb tool? -- erik jones [EMAIL PROTECTED] software development emma(r)
Re: [GENERAL] Database versus filesystem for storing images
Yeah - it can make it easier to implement transactional semantics by storing them in the database, although for simple operations it wouldn't be hard to replicate this manually. And you are going to incur a performance penalty by storing them in the database. Another thing to consider is that storing them in the file system makes it much easier to browse the images using third-party tools, update them, archive them (by gzipping or whatever). This is much more difficult if they are stored in the database. On Fri, 5 Jan 2007 15:51:59 -0600, Bruno Wolff III [EMAIL PROTECTED] said: On Fri, Jan 05, 2007 at 15:26:45 -0500, James Neff [EMAIL PROTECTED] wrote: ... and Moses said unto them, 'The eleventh commandment : thou shalt store images in a database!'... What if you had another database where you stored just the images and did not back it up if you don't want to? I think the main reason to keep images in the database is if you need transactional semantics. If you are updating images, and transactions that started before the update need to see the old version, you are going to want them in the database. I suspect this need isn't very common though.
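One common way to replicate this manually for simple operations, as suggested above: write the new image to a temporary file, flush it, then atomically rename it over the old version. Readers that already hold the old file open keep seeing the old bytes; new opens see the new version. The directory names here are illustrative.

```python
import os
import tempfile

# Scratch directory standing in for the image store.
image_dir = tempfile.mkdtemp()
path = os.path.join(image_dir, "photo.jpg")

def update_image(path, new_bytes):
    # Write the replacement to a temp file in the same directory,
    # so the final rename stays on one file system.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    try:
        os.write(fd, new_bytes)
        os.fsync(fd)  # make sure the bytes are on disk before the swap
    finally:
        os.close(fd)
    # Atomic replace: no reader ever sees a half-written file.
    os.replace(tmp, path)

update_image(path, b"v1")
update_image(path, b"v2")
```

This gives you crash-safe, all-or-nothing updates of a single file, but not the full isolation a database transaction provides - which is exactly Bruno's point about long-running readers needing the old version.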
Re: [GENERAL] Database versus filesystem for storing images
Referential integrity would be one positive for storing the files in the database. I wasn't responding to that. I'm simply saying that browsing them with third-party tools, updating, compressing/archiving, etc. is easier if they are not in the database. Those are all actions that I've found useful on other projects when storing user-uploaded images. Depending upon the number of images on disk, it might not be hard to verify that all the images referenced from the DB are there. You could have a small program that steps through each record and verifies its existence on disk. That could easily be run after a restore or as a batch process. If you're talking about trillions of images - sure, that'd be tough. If these images are extremely important - maybe that solution is unacceptable. But this is just a case of too little information to make a decision. There are pros and cons to both approaches - in every project I've worked on that faced this decision, I felt the pros of storing them in the file system outweighed the pros of storing them in the DB. But there is no right or wrong answer to the question (unless of course you had special circumstances that made one option clearly superior - but I don't think we know enough details to make that call). My two cents, Jeremy Haile On Fri, 05 Jan 2007 20:24:05 -0200, Jorge Godoy [EMAIL PROTECTED] said: Jeremy Haile [EMAIL PROTECTED] writes: Another thing to consider is that storing them in the file system makes it much easier to browse the images using third-party tools, update them, archive them (by gzipping or whatever). This is much more difficult if they are stored in the database. If you touch the files with third-party tools, how are you going to ensure that they aren't missing when the database says they are there? If you're referencing them somehow, you have to guarantee that they are there... Or am I the only one that is thinking about referential integrity with those files? 
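The small verification program described above can be sketched in a few lines. The `images` table, its columns, and sqlite3 (as a stand-in for PostgreSQL) are all assumptions for the sake of a self-contained example; the idea is just to step through every row and check that the referenced file exists.

```python
import os
import sqlite3
import tempfile

# Illustrative setup: one row whose file exists, one whose file is gone.
image_root = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, rel_path TEXT NOT NULL)")
db.execute("INSERT INTO images VALUES (1, 'present.jpg'), (2, 'missing.jpg')")
open(os.path.join(image_root, "present.jpg"), "wb").close()

def find_missing(db, root):
    """Return ids of rows whose file is absent - run after a restore
    or as a periodic batch process."""
    return [row_id
            for row_id, rel_path in db.execute("SELECT id, rel_path FROM images")
            if not os.path.exists(os.path.join(root, rel_path))]
```

Running `find_missing(db, image_root)` here reports id 2, the dangling reference Jorge is worried about; what to do with the report (alert, re-fetch, delete the row) is an application decision.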
-- Jorge Godoy [EMAIL PROTECTED]