Re: [GENERAL] Limit on number of users in postgresql?

2007-01-29 Thread Jeremy Haile
 The more I think about it, the more I think a proxy app is necessary.  
 It seems like a lot of work just for security issues, but basically most 
 web based database apps use this model, with the web application acting 
 as a proxy between the database and the client.

This is how I've seen it done on almost every application I've worked
on.  If you have multiple apps hitting a single DB, usually each
application has its own role, but user-level security is controlled at
the application level.  Although I don't think there's anything *wrong*
with having a role-per-user (it could provide an extra layer of
security), I think it's much more flexible to define security in the
application/business logic layer.  

That being said, we shouldn't get too wound up over this limitation of
PostgreSQL until someone finds that there really is some real-world
performance issue.  AFAIK, everything in this thread is theoretical.
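As a concrete sketch of the role-per-application model described above (the role and table names here are hypothetical, not from any particular app):

```sql
-- One database role per application; end-user security lives in the app layer.
CREATE ROLE dashboard_app LOGIN PASSWORD 'secret';
GRANT SELECT, INSERT, UPDATE ON activity_log TO dashboard_app;
-- No per-end-user roles: the application authenticates its own users and
-- always connects to the database as dashboard_app.
```

The trade-off is exactly the one discussed: the database can no longer distinguish end users, so auditing and fine-grained permissions must be implemented in the application.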

Cheers,
Jeremy Haile

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Stats collector frozen?

2007-01-27 Thread Jeremy Haile
Will do - thanks Magnus!  I'll test it for a while and post the results
here.  

Jeremy Haile

On Sat, 27 Jan 2007 17:21:23 +0100, Magnus Hagander
[EMAIL PROTECTED] said:
 Jeremy Haile wrote:
  Using standard build (none of the things you mentioned) on 8.2.1
  currently.
  
  I really appreciate it!
 
 Ok. I've built a binary for you at
 http://www.hagander.net/download/pgsql_8.2.1_win32select.zip
 
 Note that while I named the file 8.2.1 it really isn't - it's current
 head of the REL_8_2_STABLE branch. Which means it has a few other fixes
 as well.
 
 I think it *should* work to just replace the postgres.exe file and
 ignore the rest, so I didn't put them in the ZIP. But as always, make
 sure you take a backup first :-)
 
 //Magnus



Re: [GENERAL] Stats collector frozen?

2007-01-27 Thread Jeremy Haile
So far, the patched binary seems to have fixed the statistics issue in
my environment.  pgstat.stat is now updating constantly,
pg_stat_user_tables is updating correctly, and autovacuum is running!

Thanks for your quick response to this issue!  Let me know if there is
anything else I can do to assist.

Jeremy Haile


On Sat, 27 Jan 2007 11:23:39 -0500, Jeremy Haile [EMAIL PROTECTED]
said:
 Will do - thanks Magnus!  I'll test it for a while and post the results
 here.  
 
 Jeremy Haile
 
 On Sat, 27 Jan 2007 17:21:23 +0100, Magnus Hagander
 [EMAIL PROTECTED] said:
  Jeremy Haile wrote:
   Using standard build (none of the things you mentioned) on 8.2.1
   currently.
   
   I really appreciate it!
  
  Ok. I've built a binary for you at
  http://www.hagander.net/download/pgsql_8.2.1_win32select.zip
  
  Note that while I named the file 8.2.1 it really isn't - it's current
  head of the REL_8_2_STABLE branch. Which means it has a few other fixes
  as well.
  
  I think it *should* work to just replace the postgres.exe file and
  ignore the rest, so I didn't put them in the ZIP. But as always, make
  sure you take a backup first :-)
  
  //Magnus
 



Re: [GENERAL] Stats collector frozen?

2007-01-26 Thread Jeremy Haile
   Apparently there is a bug lurking somewhere in pgwin32_select(). Because
   if I put a #undef select right before the select in pgstat.c, the
   regression tests pass. 
   
   Maybe the problem is related to the fixed bug in pgwin32_waitforsinglesocket()?
   WaitForMultipleObjectsEx might sleep indefinitely while waiting for a socket 
   to become writable, so maybe there is a symmetrical problem with reads? Or is 
   pgwin32_select() used for waiting on writes too?
  
  
  pgwin32_waitforsinglesocket() appears to work fine. And we only use
  FD_READ, so it's not affected by your patch from what I can tell.
  
  I've got it passing tests with select replaced with waitforsinglesocket
  - now I just need to implement timeout in that one :-)
 
 Attached patch seems to solve the problem on my machine at least. Uses
 pgwin32_waitforsinglesocket() instead of pgwin32_select(). Changes
 pgwin32_waitforsinglesocket() to accept the timeout as a parameter (this
 is why it touches files outside of the stats area).

Magnus - thanks for your investigation and work!  Any chance I could get
a patched exe for win32 and test on my servers?



Re: [GENERAL] Stats collector frozen?

2007-01-26 Thread Jeremy Haile
 We have had lots of reports of issues with the stats collector on
 Windows. Some were definitely fixed by the patch by OT, but I don't
 think all.

Here were a couple of other reports I found:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00415.php
http://archives.postgresql.org/pgsql-hackers/2006-04/msg00127.php


 The thing is, since it didn't give any error messages at all, most users
 wouldn't notice. Other than their tables getting bloated, in which case
 they would do a manual vacuum and conclude autovacuum wasn't good
 enough. Or something.

This is indeed what I assumed at first.  I started running vacuum
analyze hourly and turned off autovacuum.  Later, I decided to try the
autovacuum route again and investigated why it wasn't working well.

Magnus - could you send me a patched exe to try in my environment? 
Would it be compatible with 8.2.1?



Re: [GENERAL] [Fwd: [PORTS] M$ SQL server DTS package equivalent in

2007-01-26 Thread Jeremy Haile
I've also used Pentaho Data Integration (previously known as Kettle)
quite extensively, and can recommend it.  It supports many different
databases and has fairly good documentation (although thin in some
areas).  It has a GUI drag-and-drop tool that can be used to configure
transformations and is very flexible.  It also has an active community
that responds when you have issues.

I use it as part of regular jobs that run every 5 minutes and hourly
to copy and transform data from a SQL Server DB to a PostgreSQL DB.  I
use COPY when I can simply select data into a CSV and load it into
another DB - but as Tomi said, when you have to do primary key
generation, row merging, data cleanup, and data transformations - I
would use some sort of ETL tool over just SQL.
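The COPY-based path for the simple cases can be sketched like this (file paths are hypothetical; note that server-side COPY reads and writes files on the *server* and requires superuser, while psql's \copy does the same through the client connection):

```sql
-- Extract a query result to CSV, then load it into another table.
-- COPY (SELECT ...) TO is available as of 8.2.
COPY (SELECT id, name FROM source_table) TO '/tmp/extract.csv' WITH CSV HEADER;
COPY target_table FROM '/tmp/extract.csv' WITH CSV HEADER;
```

Anything beyond a straight column-for-column copy (key generation, merging, cleanup) is where an ETL tool starts to pay for itself.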

My 2 cents,
Jeremy Haile


On Fri, 26 Jan 2007 15:14:22 +, Tomi N/A [EMAIL PROTECTED] said:
  Besides being easy to schedule and very flexible, manipulating data
  with queries is extremely powerful and fairly easy to maintain
  assuming you know a little SQL -- thanks to postgresql's huge array of
  built in string manipulation functions.  Your skills learned here will
  pay off using the database as well for other things.
 
  Not only that, but this approach will be fast since it is declarative
  and handles entire tables at once as opposed to DTS-ish solutions
  which tend to do processing record by record.  Not to mention they are
  overcomplicated and tend to suck. (DTS does have the ability to read
  from any ODBC source which is nice...but that does not apply here).
 
 Different strokes for different folks, it seems.
 I'd argue that COPY followed by a barrage of plpgsql statements can't
 be used for anything but the most trivial data migration cases (where
 it's invaluable) where you have line-organized data input for a
 handful of tables at most.
 In my experience (which is probably very different from anyone
 else's), most real world situations include data from a number of very
 different sources, ranging from the simplest (.csv and, arguably,
 .xml) to the relatively complex (a couple of proprietary databases,
 lots of tables, on-the-fly row merging, splitting or generating
 primary keys, date format problems and general pseudo-structured,
 messed up information).
 Once you've got your data in your target database (say, pgsql), using
 SQL to manipulate the data makes sense, but it is only the _final_
 step of an average, real world data transformation.
 
 Cheers,
 t.n.a.
 



Re: [GENERAL] Can you specify the pg_xlog location from a config file?

2007-01-26 Thread Jeremy Haile
This utility is useful for creating junctions in Windows:
http://www.microsoft.com/technet/sysinternals/FileAndDisk/Junction.mspx

I am using this to symlink my pg_xlog directory to another disk and it
works great.

Jeremy Haile


On Fri, 26 Jan 2007 18:27:04 +, Roman Neuhauser
[EMAIL PROTECTED] said:
 # [EMAIL PROTECTED] / 2007-01-26 09:21:27 -0800:
  Windows doesn't support symlinks.  Is it possible instead for there to
  be a config file that lets one set where the pg_xlog directory will sit?
 
 Windows has junction points.
 
 -- 
 How many Vietnam vets does it take to screw in a light bulb?
 You don't know, man.  You don't KNOW.
 Cause you weren't THERE. http://bash.org/?255991
 



Re: [GENERAL] Stats collector frozen?

2007-01-26 Thread Jeremy Haile
 Applied to HEAD and 8.2 --- assuming the Windows buildfarm machines go
 green, we should probably consider back-porting this to 8.1 and 8.0.

Not trying to be a nuisance, but I'd really like to try this out in my
environment and see if my problems disappear.  Is there anyone out there
who could provide me with a patched exe for Win32?  If not, I could try
to get my system setup to build for Windows, but I'm not sure what all
that would involve.

Thanks,
Jeremy Haile



Re: [GENERAL] Stats collector frozen?

2007-01-26 Thread Jeremy Haile
Using standard build (none of the things you mentioned) on 8.2.1
currently.

I really appreciate it!


On Fri, 26 Jan 2007 21:24:09 +0100, Magnus Hagander
[EMAIL PROTECTED] said:
 Jeremy Haile wrote:
  Applied to HEAD and 8.2 --- assuming the Windows buildfarm machines go
  green, we should probably consider back-porting this to 8.1 and 8.0.
  
  Not trying to be a nuisance, but I'd really like to try this out in my
  environment and see if my problems disappear.  Is there anyone out there
  who could provide me with a patched exe for Win32?  If not, I could try
  to get my system setup to build for Windows, but I'm not sure what all
  that would involve.
 
 I'll see if I can build you something tomorrow. You're on 8.2, right? Do
 you use any features like Kerberos, SSL or NLS? I don't think I have
 them set up properly in my mingw build env, so it'd be easier if I could
 build without them.
 
 Unless beaten by someone who has a complete env ;-)
 
 //Magnus



Re: [GENERAL] Stats collector frozen?

2007-01-25 Thread Jeremy Haile
Tom,

Did this information shed any light on what the problem might be?  Any
solution or workaround?

Thanks!
Jeremy Haile

On Wed, 24 Jan 2007 14:19:05 -0500, Jeremy Haile [EMAIL PROTECTED]
said:
 pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp.  
 
 Coincidentally (I think not) - the last auto-analyze was performed at
 2007-01-22 12:24:11.424-05.
 
 The logs for 1/22 are empty - so no errors or anything like that to give
 clues...
 
 Thanks!
 Jeremy Haile
 
 
 On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said:
  Jeremy Haile [EMAIL PROTECTED] writes:
   The stats collector appears to still be running, since I can see a
   postgres.exe process with -forkcol.  However, I never notice it using
   I/O or CPU usage.  Also, querying the pg_stat_user_tables view shows no
   change in table stats even for tables that change very frequently.
  
  Is $PGDATA/global/pgstat.stat getting updated?  (watch the file mod time
  at least)  Do you see any pgstat.tmp file in there?
  
  regards, tom lane
 



Re: [GENERAL] Stats collector frozen?

2007-01-25 Thread Jeremy Haile
Unfortunately I don't have any debugging tools installed that would work
against postgres - although I'd be glad to do something if you could
tell me the steps involved.  I can reproduce the issue quite easily on
two different Windows machines (one is XP, the other is 2003).

Please let me know if there is anything else I can do to help debug this
problem.  

Do you know of any workaround other than restarting the whole server? 
Can the collector be restarted individually? 

Thanks,
Jeremy Haile


On Thu, 25 Jan 2007 12:42:11 -0500, Tom Lane [EMAIL PROTECTED] said:
 Jeremy Haile [EMAIL PROTECTED] writes:
  Did this information shed any light on what the problem might be?
 
 It seems to buttress Magnus' theory that the intermittent (or not so
 intermittent) stats-test buildfarm failures we've been seeing have to
 do with the stats collector actually freezing up, rather than just
 not reacting fast enough as most of us (or me anyway) thought.  But
 why that is happening remains anyone's guess.  I don't suppose you
 have debugging tools that would let you get a stack trace from the
 collector process?
 
   regards, tom lane



Re: [GENERAL] Stats collector frozen?

2007-01-25 Thread Jeremy Haile
 Then just pick it up in Task Manager or Process Explorer or whatever and
 kill it off. Just make sure you pick the right process.

I mentioned earlier that killing off the collector didn't work - however
I was wrong.  I just wasn't giving it enough time.  If I kill the
postgres.exe -forkcol process, it does get restarted, although
sometimes it takes a minute.

Since it only seems to update pgstat.stat once after restarting, I'd
need to kill it once a minute to keep my statistics up to date =)  So,
unfortunately it's not a great workaround for my problem.


Jeremy Haile



Re: [GENERAL] Stats collector frozen?

2007-01-25 Thread Jeremy Haile
I'll try to put together a test case for hackers, although I'm not sure
what exactly causes it.  

Basically, when I fire up PostgreSQL - after about a minute the stats
collector runs once (pgstat.stat is updated, autovacuum fires up, etc.)
- and then the collector seems to hang.  If I watch its performance
information, it does not read or write to disk again and pgstat.stat is
never updated again.  It never updates pgstat.stat more than once after
a restart.  There are no errors in the log.

I tried killing the collector a variety of ways on Windows, but it never
seems to terminate.  I don't see a kill program for Windows that lets me
specify the signal to use.  So other than restarting PostgreSQL, I'm not
sure how to work around this problem.
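One way to confirm the freeze without watching the file system is to poll the stats views themselves (columns shown exist as of 8.2; this is a diagnostic sketch, not a fix):

```sql
-- If these counters and the last_auto* timestamps never change for a table
-- you know is busy, the collector is no longer forwarding statistics.
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;
```

Running this a few minutes apart on a write-heavy table makes the "updates once, then stops" behavior easy to demonstrate.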

If anyone else is experiencing similar problems, please post your
situation.

On Thu, 25 Jan 2007 12:51:31 -0500, Tom Lane [EMAIL PROTECTED] said:
 Jeremy Haile [EMAIL PROTECTED] writes:
  Unfortunately I don't have any debugging tools installed that would work
  against postgres - although I'd be glad to do something if you could
  tell me the steps involved.  I can reproduce the issue quite easily on
  two different Windows machines (one is XP, the other is 2003).
 
 Sorry, I don't know anything about Windows debugging either.  Can you
 put together a test case that would let one of the Windows-using hackers
 reproduce it?
 
  Do you know of any workaround other than restarting the whole server? 
  Can the collector be restarted individually? 
 
 On Unix you can just 'kill -TERM' the collector process and the
 postmaster
 will start a new one without engaging in a database panic cycle.  Dunno
 what the equivalent is on Windows but it's probably possible.
 
   regards, tom lane



Re: [GENERAL] Stats collector frozen?

2007-01-25 Thread Jeremy Haile
 AFAIR (Magnus can surely confirm) there were some other tables that
 weren't showing stats as all zeros -- but there's no way to know whether
 those numbers were put there before the collector had frozen (if
 that's really what's happening).

Yeah - I have numbers that updated before the stats collector started
freezing.  Do you know which version of PG this started with?  I have
upgraded through 8.1.3, 8.1.4, 8.2, and 8.2.1 in the past months, and I
didn't have the collector enabled until 8.2.1 - so I'm not sure how long
this has been a problem.

I might try rolling back to a previous version - it's either that or
set up a scheduled vacuum analyze until we figure out this problem.  I'm
having to manually run it every day now... =)  I think this is a pretty
critical problem since it cripples autovacuum on Windows.

Are you guys in a position to debug the collector process and see where
it is freezing (a la Tom's earlier comment)?  Anything I can do to help
debug this problem faster?



[GENERAL] Stats collector frozen?

2007-01-24 Thread Jeremy Haile
I've noticed that my tables are not being auto-vacuumed or analyzed
regularly, even though I have very aggressive autovacuum settings.

The stats collector appears to still be running, since I can see a
postgres.exe process with -forkcol.  However, I never notice it using
I/O or CPU usage.  Also, querying the pg_stat_user_tables view shows no
change in table stats even for tables that change very frequently.

I see errors such as these in the log every now and then - not sure if
they are related.  These have been discussed at length in other posts
and seem to have something to do with PG holding onto old file handles
(Windows specific):
2007-01-24 06:24:16 ERROR:  could not open relation 1663/16404/333779:
Permission denied
2007-01-24 06:24:17 ERROR:  could not open relation 1663/16404/333779:
Permission denied
2007-01-24 06:24:18 ERROR:  could not open relation 1663/16404/333779:
Permission denied
2007-01-24 06:24:19 ERROR:  could not open relation 1663/16404/333779:
Permission denied

I'm running PG 8.2.1 on Windows.  Here is some of the output from show
all:
autovacuum;on
autovacuum_analyze_scale_factor;0.02
autovacuum_analyze_threshold;250
autovacuum_freeze_max_age;2
autovacuum_naptime;1min
autovacuum_vacuum_cost_delay;-1
autovacuum_vacuum_cost_limit;-1
autovacuum_vacuum_scale_factor;0.08
autovacuum_vacuum_threshold;500
stats_block_level;on
stats_command_string;on
stats_reset_on_server_start;off
stats_row_level;on
stats_start_collector;on
vacuum_cost_delay;20ms
vacuum_cost_limit;200
vacuum_cost_page_dirty;20
vacuum_cost_page_hit;1
vacuum_cost_page_miss;10
vacuum_freeze_min_age;1



Re: [GENERAL] Stats collector frozen?

2007-01-24 Thread Jeremy Haile
pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp.  

Coincidentally (I think not) - the last auto-analyze was performed at
2007-01-22 12:24:11.424-05.

The logs for 1/22 are empty - so no errors or anything like that to give
clues...

Thanks!
Jeremy Haile


On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said:
 Jeremy Haile [EMAIL PROTECTED] writes:
  The stats collector appears to still be running, since I can see a
  postgres.exe process with -forkcol.  However, I never notice it using
  I/O or CPU usage.  Also, querying the pg_stat_user_tables view shows no
  change in table stats even for tables that change very frequently.
 
 Is $PGDATA/global/pgstat.stat getting updated?  (watch the file mod time
 at least)  Do you see any pgstat.tmp file in there?
 
   regards, tom lane



Re: [GENERAL] Stats collector frozen?

2007-01-24 Thread Jeremy Haile
Searching the archives, I found a couple of 2006 posts that seem
somewhat related to my problem (although I don't see any solutions
listed...):
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00415.php
http://archives.postgresql.org/pgsql-hackers/2006-04/msg00127.php

Tom, since you were involved in these - did you ever figure out how to
resolve the issues of the stats collector getting stuck in Windows?

Thanks, Jeremy Haile

On Wed, 24 Jan 2007 14:19:05 -0500, Jeremy Haile [EMAIL PROTECTED]
said:
 pgstat.stat was last updated 1/22 12:25pm - there is no pgstat.tmp.  
 
 Coincidentally (I think not) - the last auto-analyze was performed at
 2007-01-22 12:24:11.424-05.
 
 The logs for 1/22 are empty - so no errors or anything like that to give
 clues...
 
 Thanks!
 Jeremy Haile
 
 
 On Wed, 24 Jan 2007 14:00:52 -0500, Tom Lane [EMAIL PROTECTED] said:
  Jeremy Haile [EMAIL PROTECTED] writes:
   The stats collector appears to still be running, since I can see a
   postgres.exe process with -forkcol.  However, I never notice it using
   I/O or CPU usage.  Also, querying the pg_stat_user_tables view shows no
   change in table stats even for tables that change very frequently.
  
  Is $PGDATA/global/pgstat.stat getting updated?  (watch the file mod time
  at least)  Do you see any pgstat.tmp file in there?
  
  regards, tom lane
 



Re: [GENERAL] Postgresql.conf

2007-01-23 Thread Jeremy Haile
But there are ways that we could optimize count(*) queries for specific
circumstances, right?  Obviously this isn't trivial, but I think it would
be nice if we could maintain a row count that could be used when
performing a count(*) on the whole table (no where clause).

I don't know if the overhead of keeping track of that number is worth
the benefits - but I know that querying for the number of rows in a
table is a common need and other RDBMSs do optimize for that special
case.
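The "maintain the count yourself" workaround mentioned in the archives can be sketched with a trigger (table, function, and trigger names here are hypothetical; the counter row becomes a write hotspot under heavy concurrency, which is part of why this isn't done globally):

```sql
-- A counter table kept in sync by an AFTER trigger.
CREATE TABLE row_counts (tbl name PRIMARY KEY, n bigint NOT NULL DEFAULT 0);

CREATE OR REPLACE FUNCTION track_count() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE row_counts SET n = n + 1 WHERE tbl = TG_RELNAME;
  ELSE  -- DELETE
    UPDATE row_counts SET n = n - 1 WHERE tbl = TG_RELNAME;
  END IF;
  RETURN NULL;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER count_rows AFTER INSERT OR DELETE ON mytable
  FOR EACH ROW EXECUTE PROCEDURE track_count();

-- Then read the counter instead of scanning:
-- SELECT n FROM row_counts WHERE tbl = 'mytable';
```

Seed row_counts once with the current count(*) before relying on it; after that, every insert and delete keeps it current.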

On Tue, 23 Jan 2007 12:53:43 -0600, Bruno Wolff III [EMAIL PROTECTED]
said:
 On Tue, Jan 23, 2007 at 10:12:13 -0500,
   Brandon Aiken [EMAIL PROTECTED] wrote:
  Out of curiosity, has the COUNT(*) with no WHERE clause slowness been
  fixed in 8.x?  Or is it still an issue of there's no solution that
  won't harm aggregates with WHERE clauses?
 
 Probably not in the sense that you mean.
 
 The underlying problem is that in MVCC there is no single global answer
 to the question and the pain of maintaining the multiple answers
 outweighs the cost of doing so in normal usage.
 
 People that need to run count(*) queries a lot may want to make a
 different
 trade off and some ways of maintaining counts are covered in the
 archives.
 



Re: [GENERAL] Index bloat of 4x

2007-01-19 Thread Jeremy Haile
Is it feasible to add a "reindex concurrently" that doesn't lock the
table for the rebuild, then locks the table when doing a second pass to
pick up rows that were changed after the first pass?  Or something like
that.

On Fri, 19 Jan 2007 12:45:03 -0500, Tom Lane [EMAIL PROTECTED] said:
 Ed L. [EMAIL PROTECTED] writes:
  Online index creation definitely helps us toward 24x7.  But 
  wouldn't we still have to drop the old index, thus blocking 
  production queries?
 
 Yes, but only for a very short period.
 
   regards, tom lane
 



Re: [GENERAL] Multiple column index usage question

2007-01-19 Thread Jeremy Haile
That's interesting.  So if you have a composite index on two columns, is
there much of a reason (usually) to create single indexes on each of the
two columns?  I guess the single indexes might be slightly faster
depending on the number of different values/combinations - so probably
it depends, eh?
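The leading-column rule can be summarized in a short sketch (index and column names are the hypothetical ones from the question):

```sql
-- A composite index is usable when the query constrains a leading
-- prefix of its columns.
CREATE INDEX index1 ON tableA (foo, bar);
-- can use index1:   WHERE foo = 1
-- can use index1:   WHERE foo = 1 AND bar = 2
-- generally cannot: WHERE bar = 2   -- bar alone wants its own index
```

So a separate index on the second column is only worth it if you query that column without the first one.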


On Fri, 19 Jan 2007 16:57:42 -0600, Ron Johnson
[EMAIL PROTECTED] said:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On 01/19/07 15:53, Jan Muszynski wrote:
  Rather simple question, of which I'm not sure of the answer.
  
  If I have a multiple column index, say:
  Index index1 on tableA (foo,bar)
  
  and I then:
  Select * from tableA where foo = some value
  
  Will index1 be used, or am I looking at a seqscan in all circumstances?
 
 Yes, it will use the index.
 
 However, in earlier versions, the lvalue & rvalue needed to match in
 type to use the index.
 
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.6 (GNU/Linux)
 
 iD8DBQFFsUzmS9HxQb37XmcRArY8AKDqzS5FeY1HwkSGeOlhQsjsdpAV5gCghiWj
 R4e7rBWaAAGF25ZFhy1Elgc=
 =Wkp8
 -END PGP SIGNATURE-
 



Re: [GENERAL] GUID/UUID Support

2007-01-18 Thread Jeremy Haile
I second the desire for a UUID type in PostgreSQL!  I'm aware of the
pguuid project, but it's not the same as having it in core and isn't
very well maintained.  

This is such a common database paradigm that it seems reasonable to
promote it to first-class citizen status in PostgreSQL.

I currently use varchars for UUIDs, but the table size, index size,
integrity (validation), and performance would be better if PostgreSQL
supported UUIDs directly.
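As a stopgap while there is no native type, the validation part at least can be had with a text domain (the domain and table names here are hypothetical):

```sql
-- Enforce the canonical 36-character lowercase form on stored UUIDs.
CREATE DOMAIN uuid_text AS varchar(36)
  CHECK (VALUE ~ '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$');

CREATE TABLE widgets (id uuid_text PRIMARY KEY);
```

This still pays the 36-bytes-versus-16 storage and index cost that a real binary UUID type would avoid.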


On Thu, 18 Jan 2007 10:31:26 -0700, Patrick Earl [EMAIL PROTECTED]
said:
 One issue is that UUIDs are only 16 bytes of data.  To store them as
 text in canonical form requires 36 bytes.  As there are alternate
 frequently used representations, you also run into potential issues
 with input.  The GUID type (proposed by Gevik) handles those standard
 input variations.
 
 Though I haven't tried it, I would imagine there would be performance
 implications when using 36 character keys everywhere to do indexing,
 joins, etc.
 
 Another issue is that higher level languages (such as Delphi and .NET)
 have GUID field types built in.  If the field is just a string field,
 it won't map nicely to those higher level types.
 
  Patrick
 
 On 1/17/07, Chad Wagner [EMAIL PROTECTED] wrote:
  On 1/17/07, Patrick Earl [EMAIL PROTECTED] wrote:
   Certainly support for the GUID field type itself is most important.
   As for the generators, though they are non-essential, they are very
   useful.  Other platforms and libraries have standardized on uuid
   generators, so I don't see why PostgreSQL can't.
 
  Maybe I am oblivious to the reason, but why is there a need for a special
  data type for GUID/UUIDs?  Wouldn't you always be doing an equality
  anyways?  Wouldn't a varchar suffice?
 
  --
  Chad
  http://www.postgresqlforums.com/
 



Re: [GENERAL] GUID/UUID Support

2007-01-18 Thread Jeremy Haile
 Yeah, but it's not going to be added to core until there's some
 agreement about *what* needs to be added. The point of the external
 project is that once it has achieved a level of support *then* it can
 be incorporated.

That's fair.  In truth, I only found that pguuid existed fairly recently
and haven't had a chance to try it out.  I've been scared away a bit
from using it due to threads I've read about problems and limitations
with it - also there are bug reports listed on the site which haven't
been updated in over a year.  Still - I'd like to give it a try myself
at some point.

Has anyone here had a great experience using pguuid?



[GENERAL] Diagnosing deadlock / connection hang

2007-01-17 Thread Jeremy Haile
I have a process that appears to hang every night.  I ran the following
query and results, and it looks like an autoanalyze and query are
waiting on a lock that's being exclusively held by a transaction that is
IDLE.  Any ideas?  Any additional queries I should run to shed light on
the issue?


SELECT pg_database.datname AS database, pg_class.relname AS "table",
transaction, pid, mode, granted, usename, current_query, backend_start
FROM pg_locks, pg_class, pg_database, pg_stat_activity
WHERE pg_locks.relation = pg_class.oid
AND pg_locks.database = pg_database.oid
and pg_locks.pid = pg_stat_activity.procpid
order by pg_database.datname, pg_class.relname, granted

dashboard;activity_log_transition;18291984;1588;ShareUpdateExclusiveLock;f;postgres;ANALYZE
public.activity_log_transition;2007-01-17 00:06:03.281-05
dashboard;activity_log_transition;18291982;2872;AccessShareLock;f;dashboard;select
distinct strDescription from activity_log_transition;2007-01-17
00:05:03.281-05
dashboard;activity_log_transition;18291979;1472;ShareLock;t;dashboard;&lt;IDLE&gt;
in transaction;2007-01-17 00:05:00.968-05
dashboard;activity_log_transition;18291979;1472;AccessExclusiveLock;t;dashboard;&lt;IDLE&gt;
in transaction;2007-01-17 00:05:00.968-05
dashboard;activity_log_transition_pkey;18291979;1472;AccessExclusiveLock;t;dashboard;&lt;IDLE&gt;
in transaction;2007-01-17 00:05:00.968-05
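A companion query that isolates just the idle-in-transaction sessions holding locks can help trace the problem back to a client (8.2 catalogs; in 8.2 the idle status shows up in current_query):

```sql
-- List idle-in-transaction backends, oldest first, so the pid can be
-- matched against the application or connection pool that opened it.
SELECT procpid, usename, backend_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY backend_start;
```

An idle-in-transaction session holding AccessExclusiveLock usually means a client began a transaction (often implicitly, via a driver's autocommit=off) and never committed or rolled back.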



Re: [GENERAL] insert only unique values in to a table, ignore rest?

2007-01-08 Thread Jeremy Haile
 Note that things will go faster if you do your initial data load using
 copy from stdin for the initial bulk data load.  individual inserts in
 postgresql are quite costly compared to mysql.  It's the transactional
 overhead.  By grouping them together you can make things much faster. 
 copy from stdin does all the inserts in one big transaction.  

You could do copy from file as well right?  (no performance difference
compared to copy from stdin)  I do this all the time.

Also - maybe I misunderstand something, but why does PostgreSQL's
implementation prohibit it from ignoring insert errors during a copy?
If you added a unique constraint to the table before copying, PostgreSQL
would generate errors due to the unique constraint violation - so I
don't think any additional locking would be required for it to simply
say "If there is an error while copying in, ignore it and continue
inserting other rows."

PostgreSQL's copy command doesn't currently support this, so the temp
table followed by a distinct select is the way to go.  But I didn't
follow all of the talk about it requiring locking the table and being
inherently impossible for PostgreSQL to support.

I've wanted a similar feature.  I select rows into a table on a regular
basis.  I'd like to be able to reload data that overlaps old values and
have PostgreSQL ignore the failed inserts.  SQL Server offers a flag
that allows you to ignore inserts whose primary key already exists in
the table.  The only solution in PostgreSQL is to run a query to
manually delete the duplicate rows from a temp table before inserting -
which takes much more time.
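
The temp-table workaround can be sketched roughly like this (the table
names and the `id` key column are hypothetical, and this assumes the
staging data itself contains no duplicate keys):

```sql
-- Load everything into a staging table first
CREATE TEMP TABLE staging (LIKE items);
COPY staging FROM '/tmp/items.csv' WITH CSV;

-- Then insert only the rows whose key is not already in the target
INSERT INTO items
SELECT s.* FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM items i WHERE i.id = s.id);
```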

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Database versus filesystem for storing images

2007-01-06 Thread Jeremy Haile
 I wonder why this HTTP cache headers argument didn't surface in this
 heated debate. 

I mentioned this earlier as well.  Although you could do it in the app
layer - it would be easier to just let the web server handle it.
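
As a sketch of the web-server route (the directory path and lifetime
below are made up, and this assumes Apache with mod_expires loaded):

```apache
# Let Apache attach cache headers to anything under the image directory
<Directory "/var/www/images">
    ExpiresActive On
    ExpiresDefault "access plus 7 days"
</Directory>
```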

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Database versus filesystem for storing images

2007-01-05 Thread Jeremy Haile
It's almost always better to store the images on the file system and
just store the filename or relative path in the database.  

This is more efficient, doesn't bloat the database by storing files in
it, and is easier to get proper browser caching behavior (depending on
how your app is set up).   I try to avoid BLOBs whenever possible.

Cheers,
Jeremy Haile
 

On Fri, 5 Jan 2007 17:18:10 -0200, Clodoaldo
[EMAIL PROTECTED] said:
 5 Jan 2007 06:59:18 -0800, imageguy [EMAIL PROTECTED]:
 
  I think I know the answer,
 
 If you know the answer please tell it as I have read some discussions
 on the web and although I have decided on a solution I'm still not
 sure about the best answer, if there is a best answer after all.
 
  but if you don't have an application
  server - ie a webserver, etc,
 
 Yes I have an application server, the Apache server.
 
  and many of the workstations/clients
  that need access to the images but may not have access to a network
  share,
 
 network share? I don't understand. The images will be loaded by html
 pages with the img tag like in <img
 src="http://domain.com/images/xxx.jpg">
 
  isn't the database the only choice ?
 
 No. It is one of the choices. The other is to store the images in the
 file system, in a directory readable by Apache.
 
   - or is there a postgresql function/utility that will serve the
 file from the file system based on the reference/link embedded in the
 database ??
 
 I think some procedure languages can read files. In this case what
 would be the gain in introducing a middle man, the db server?
 
 Regards,
 -- 
 Clodoaldo Pinto Neto
 
---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Database versus filesystem for storing images

2007-01-05 Thread Jeremy Haile
How does it make it easier to control access and security?  If your web
app makes a decision about allowing access to the database, it can just
as easily make a decision about allowing access to the filesystem. 
Storing the images on the file system doesn't mean that there isn't a
piece of code that determines whether or not users can access a
particular file.  

I see security and access as a non-issue in making this decision. 
Either way, it's got to be controlled at the application level (if it is
even necessary).



On Fri, 05 Jan 2007 15:26:45 -0500, James Neff
[EMAIL PROTECTED] said:
 ... and Moses said unto them, 'The eleventh commandment :  thou shalt 
 store images in a database!'...
 
 What if you had another database where you stored just the images, and 
 you could skip backing it up if you don't want to?
 
 As an application developer, I like the idea of storing files and images 
 in the database because it makes it much easier to control access and 
 security from an application standpoint.
 
 I think Microsoft SQL Server stores blobs in a separate file, and only 
 retains pointers in the actual database field for that blob.  So when 
 you SELECT on that blob MS SQL reads the external file for you as if it 
 lived in the database.  I don't know if Postgres does the same thing, 
 but if it did, you wouldn't have to worry about bloating database
 files.
 
 Sounds like this is for an Apache web application.  Think about how web 
 sites like Flickr and Webshots store their images in a database.  You 
 could write a cool Apache mod so that the url 
 "http://mycompany.com/images/01234.jpg" would go through this module, 
 pull the appropriate image from the database and send it back; all the 
 while the client is none-the-wiser.  Just a thought.
 
 I think it's one of those things where there's no right or wrong 
 answer.  Instead you just have to do the minimum of what your 
 application requires.  If you don't need application-level control over 
 the files, then by all means store them on the file system.  But if you 
 need to control security then you have to prevent physical access to the 
 file (which means no file system storage) and pull the image from the 
 database through the application.
 
 My two cents,
 James
 
 
 
 John McCawley wrote:
  Don't store your images in the database.  Store them on the filesystem 
  and store their path in the database.  Anyone that tells you otherwise 
  is a stark raving madman :)
 
  My system is very heavily used, and our pg_dump is only a few gigs.  
  Meanwhile our images/documents storage is well over a hundred gigs.  
  I'd hate to think that I'd have to dump and restore 100 gigs every 
  time I wanted to dump the newest data to the development database.
 
 
  As far as how they actually get to the client machine, typically these 
  days people use web servers for this sort of thing.
  Clodoaldo wrote:
 
  5 Jan 2007 06:59:18 -0800, imageguy [EMAIL PROTECTED]:
 
 
  I think I know the answer,
 
 
  If you know the answer please tell it as I have read some discussions
  on the web and although I have decided on a solution I'm still not
  sure about the best answer, if there is a best answer after all.
 
  but if you don't have an application
  server - ie a webserver, etc,
 
 
  Yes I have an application server, the Apache server.
 
  and many of the workstations/clients
  that need access to the images but may not have access to a network
  share,
 
 
   network share? I don't understand. The images will be loaded by html
   pages with the img tag like in <img
   src="http://domain.com/images/xxx.jpg">
 
  isn't the database the only choice ?
 
 
  No. It is one of the choices. The other is to store the images in the
  file system, in a directory readable by Apache.
 
    - or is there a postgresql function/utility that will serve the
   file from the file system based on the reference/link embedded in the
   database ??
 
 
  I think some procedure languages can read files. In this case what
  would be the gain in introducing a middle man, the db server?
 
  Regards,
 
 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] vacuum v. vacuumdb

2007-01-05 Thread Jeremy Haile
Nope - the other way around.  The vacuumdb tool simply executes the
VACUUM command through postmaster.
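
In other words, these two are equivalent ways of vacuuming a database
(`mydb` here is just a placeholder name):

```shell
vacuumdb --analyze mydb          # wrapper: connects and issues the SQL below
psql -d mydb -c "VACUUM ANALYZE" # what actually runs on the server
```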


On Fri, 05 Jan 2007 15:05:44 -0600, Erik Jones [EMAIL PROTECTED] said:
 Quick question,  when running a VACUUM query through the postmaster, 
 does it use the external vacuumdb tool?
 
 -- 
 erik jones [EMAIL PROTECTED]
 software development
 emma(r)
 
 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Database versus filesystem for storing images

2007-01-05 Thread Jeremy Haile
Yeah - it can make it easier to implement transactional semantics by
storing them in the database, although for simple operations it wouldn't
be hard to replicate this manually.  And you are going to incur a
performance penalty by storing them in the database.  

Another thing to consider is that storing them in the file system makes
it much easier to browse the images using third-party tools, update
them, archive them (by gzipping or whatever).  This is much more
difficult if they are stored in the database.


On Fri, 5 Jan 2007 15:51:59 -0600, Bruno Wolff III [EMAIL PROTECTED]
said:
 On Fri, Jan 05, 2007 at 15:26:45 -0500,
   James Neff [EMAIL PROTECTED] wrote:
  ... and Moses said unto them, 'The eleventh commandment :  thou shalt 
  store images in a database!'...
  
  What if you had another database where you stored just the images and 
  not back it up if you don't want to?
 
 I think the main reason to keep images in the database is if you need
 transactional semantics. If you are updating images and transactions that
 started before the update, need to see the old version you are going to
 want them in the database. I suspect this need isn't very common though.
 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Database versus filesystem for storing images

2007-01-05 Thread Jeremy Haile
Referential integrity would be one positive for storing the files in the
database.  I wasn't responding to that.  I'm simply saying that browsing
them with third-party tools, updating, compressing/archiving, etc. is
easier if they are not in the database.  Those are all actions that I've
found useful on other projects when storing user-uploaded images.

Depending upon the number of images on disk, it might not be hard to
verify that all the images referenced from the DB are there.  You could
have a small program that steps through each record and verifies its
existence on disk.  That could easily be run after a restore or as a
batch-process.  If you're talking about trillions of images - sure
that'd be tough.  If these images are extremely important - maybe that
solution is unacceptable.  But this is just a case of too little
information to make a decision.  
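
A verification pass like that can be a few lines of script.  This is a
minimal sketch; the list of paths would really come from something like
SELECT file_path FROM images against the actual (hypothetical) schema:

```python
import os

def find_missing_images(paths, root):
    """Return the referenced paths that do not exist under 'root'.

    'paths' stands in for the result of a query such as
    SELECT file_path FROM images;  (hypothetical schema)
    """
    return [p for p in paths if not os.path.isfile(os.path.join(root, p))]
```

Anything it reports after a restore is a dangling database reference
that needs to be fixed or re-fetched from backup.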

There are pros and cons to both approaches - in every project I've
worked on that faced this decision, I felt the pros of storing it in the
file system outweighed the pros of storing it in the DB.  But there is
no right or wrong answer to the question (unless of course you had
special circumstances that made one option clearly superior - but I
don't think we know enough details to make that call).

My two cents,
Jeremy Haile

On Fri, 05 Jan 2007 20:24:05 -0200, Jorge Godoy [EMAIL PROTECTED]
said:
 Jeremy Haile [EMAIL PROTECTED] writes:
 
  Another thing to consider is that storing them in the file system makes
  it much easier to browse the images using third-party tools, update
  them, archive them (by gzipping or whatever).  This is much more
  difficult if they are stored in the database.
 
 If you touch the files with third-party tools, how are you going to
 prevent them from going missing while the database says they are there?
 If you're referencing them somehow, you have to guarantee that they are
 there...  Or am I the only one thinking about referential integrity
 with those files? 
 
 -- 
 Jorge Godoy  [EMAIL PROTECTED]

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly