Re: [HACKERS] Idea about better configuration options for sort memory

2004-05-30 Thread scott.marlowe
On 12 Feb 2004, Greg Stark wrote:

 Tom Lane [EMAIL PROTECTED] writes:
 
   Hmmm ... maybe query_work_mem and maintenance_work_mem, or something 
   similar?
  
  I'll go with these unless someone has another proposal ...
 
 dml_sort_mem and ddl_sort_mem ?

I like those.  Are they an accurate representation of what's going on?  If 
so, I'd go with these, as they are more easily recognizable by folks 
who've worked with dbs for a while.  On the other hand, they're probably 
less recognizable to the newbies.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] PITR Dead horse?

2004-05-13 Thread scott.marlowe
On Thu, 5 Feb 2004, Rod Taylor wrote:

   Don't know. But apparently different users will have 
   different demands from a database.
  
  Of course, but I would argue that my claim that PostgreSQL is reliable
  is backed up by the lack of people posting messages like 'we had a
  powercut and now my DB is hosed'.
 
 One thing we could use (and I have no idea how to do it) is a This
 hardware is not appropriate for a database test kit.
 
 Something to detect lying disks, battery backed write cache that isn't
 so battery backed, etc.

But I'm not sure you can test that without power-off tests...  so it 
would have to be a test that kinda started up, then told you to pull the 
plug on the box.  Even a kernel panic wouldn't detect it, because the drive 
would still be powered up.

Or, you could have a test that checked what kind of drive it was (IDE 
versus SCSI) and maybe had a table of drives that are known to lie, 
possibly even by version, should drives of the same model stop lying 
halfway through production due to fixes in their firmware.

I'd guess the table would still have to be built the old-fashioned way, 
by doing power-off tests.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] ALTER TABLE TODO items

2004-05-06 Thread scott.marlowe
On Thu, 6 May 2004, Richard Huxton wrote:

 Bruce Momjian wrote:
  Tom Lane wrote:
  
 Richard Huxton [EMAIL PROTECTED] writes:
 
 Does that mean I'll want to disable triggers while I do this?
 
 Hrm.  Right now the code does not fire triggers at all, but that seems
 wrong.  However, I doubt that very many triggers could cope with update
 events in which the old and new rows have different rowtypes :-(.
 Any thoughts what to do about that?
  
  
  If triggers exist, I think we should just throw a warning that triggers
  will not be fired.
 
 Tom's point about triggers probably not working after the upgrade is a 
 good one though. Is it reasonable to just refuse to act on a table until 
   all triggers are dropped? I'd rather be forced to go through and 
 drop/restore triggers in a script than be surprised later on.

How about a "cascade drop triggers" option so you can still do it in 
one line should you want to?
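
For anyone scripting it by hand in the meantime, a rough sketch (trigger, 
table, and function names here are made up):

BEGIN;
DROP TRIGGER audit_trg ON mytable;
-- ... do the ALTER TABLE work here ...
CREATE TRIGGER audit_trg AFTER INSERT OR UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE audit_fn();
COMMIT;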


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-06 Thread scott.marlowe
On Thu, 6 May 2004, Tom Lane wrote:

 sdv mailer [EMAIL PROTECTED] writes:
  The point is pre-forking can *potentially* speed up
  connections by 5x as shown in this simplistic
  non-conclusive benchmark.
 
 I think this benchmark proves no such thing.
 
 The thing that pgpool is doing is not preforking connections at all, but
 re-using prior connections.  The important difference is that you are
 using a hot backend that has already loaded a full working set of
 relcache and syscache entries --- and not just any old entries, but
 exactly those needed to process your query.  (The fact that the pgbench
 test uses only a very limited set of queries probably causes this test
 to overstate the effect compared to more realistic workloads.)
 
 The profiling that I've done of backend startup shows that cache
 initialization accounts for the bulk of the startup delay.  And IIRC,
 I was just measuring the time needed to be ready to accept the first
 query, not the additional effort to fetch query-specific cache entries.
 So having a hot backend would make a significant difference, but merely
 avoiding the fork wouldn't necessarily.

Wouldn't the db selection / authentication be as or more expensive than 
buffer creation?  Even in "trust" mode the backend still has to 
authenticate; it just doesn't have to do as much as it would with 
passwords.  I'd expect that to be a big chunk of time too.

It appears the best place to fix this problem (not a problem with 
postgresql, but an engineering problem in an abstract sense) is with 
pooling, and once the flushing etc. in Tatsuo's code is fixed up to be 
zippy, pgpool would be THE answer for such issues.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread scott.marlowe
On Wed, 5 May 2004, sdv mailer wrote:

 Forking is quite fast on Linux but creating a new
 process is still 10x more expensive than creating a
 thread and is even worse on Win32 platform. CPU load
 goes up because the OS needs to allocate/deallocate
 memory making it difficult to get a steady state
 resource consumption.

Just a nit to pick here.  In Linux, the difference between forking and 
spawning a new thread is almost nothing.  Definitely less than a factor of 
2, and most assuredly less than the quoted factor of 10 here.

The fact that windows has a heavy process / lightweight thread design 
means little to me, since I'll likely never deploy a production postgresql 
server on it that needs to handle any serious load.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] PostgreSQL pre-fork speedup

2004-05-05 Thread scott.marlowe
On Wed, 5 May 2004, Rod Taylor wrote:

  And, of course, most development environments (perl, php, java etc)
  have their own language specific connection pooling solutions.
 
 Yes, the one for php is what I was thinking of when I made my statement.
 They work on a per backend basis as Apache does not allow for the type
 of communication between processes that would otherwise be required. A
 connection created by Apache backend A cannot be used by Apache backend
 B.
 
 Java is an example where it is done well, but the language decision was
 made long before I joined the firm.
 
 I cannot tell if mod_pg_pool works across Apache forked backends or is
 still bound to a single process. They state it is intended for sharing
 connections across modules, so it is probably still backend specific.

Have you looked at sqlrelay.sourceforge.net?  It looks like it might do 
what you need.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] The features I'm waiting for.

2004-05-04 Thread scott.marlowe
For me, the only feature I'm likely to use in the upcoming releases is 
nested transactions.  While PITR is a great selling point, and the Windows 
port is something I do look forward to, having to do half my job 
programming Windows boxes, nested transactions are a feature I can 
genuinely use in my daily (maybe weekly??? :-) life.

While a focus on things that make postgresql more market acceptable is 
important, the things that make it more feature complete for me as a user 
are the things I'd gladly wait an extra month or two for.  

But I'm not programming any of the code, so I'm just sayin'...


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] The features I'm waiting for.

2004-05-04 Thread scott.marlowe
On Tue, 4 May 2004, David Garamond wrote:

 scott.marlowe wrote:
  For me, the only feature I'm likely to use in the upcoming releases is 
  nested transactions.  While PITR is a great selling point, and the Windows 
  port is something I do look forward to, having to do half my job 
  programming Windows boxes, nested transactions are a feature I can 
  genuinely use in my daily (maybe weekly??? :-) life.
  
  While a focus on things that make postgresql more market acceptable is 
  important, the things that make it more feature complete for me as a user 
  are the things I'd gladly wait an extra month or two for.  
  
  But I'm not programming any of the code, so I'm just sayin'...
 
 I'm sure everybody has their own favorite feature. But I can say quite 
 confidently that the upcoming release contains the largest number of highly 
 anticipated features ever. Nested transactions, 2-phase commit, Windows 
 port... I mean these are all major stuff. They are paving the way for 
 deployments of Postgres in new areas and applications. Plus don't forget 
 all the other sweet goodies like autovacuum and PITR.
 
 But the next release could also be the buggiest version ever, due to the 
 number of these new features. :-)

Agreed.  My only real point was that userland features are the reason I 
upgrade.  Simple stuff like the inclusion of regex in the substring 
function in version 7.3 or 7.4 was awesome.  It was small and simple, and 
made my life MUCH easier.  

And while I might not be hacking the code, I'm quite willing to be a beta 
tester.  :-0  

So, thanks to all the hackers, for doing such a great job.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: How to Welcome Windows Users (was Re: [HACKERS] 7.5 features)

2004-04-28 Thread scott.marlowe
On Tue, 27 Apr 2004, Richard Huxton wrote:

 On Tuesday 27 April 2004 14:27, Bruce Momjian wrote:
  Here are features that are being worked on, hopefully for 7.5:
 
  o  tablespaces (Gavin)
  o  nested transactions (Alvaro)
  o  two-phase commit (Heikki Linnakangas)
  o  integrated pg_autovacuum (O'Connor)
  o  PITR (Riggs)
  o  Win32 (Claudio, Magnus)
 
  If we get the majority of them, and I think we will, this will be a
  great release.
 
 Sounds like the biggest release since 7.0 to me, and all good stuff. I do have 
 a nagging concern with the Windows support though. I'm guessing most people 
 running Windows servers will either be running php on top, or have windows 
 clients. AFAIK this means .NET or ODBC, and for older Access-based systems 
 upgrading definitely ODBC.
 
 Dave Page has bravely stepped into the breach to maintain the ODBC driver, but 
 the niggles in it will generate a flood of support messages as Windows users 
 test it out. Basically, I'm asking what would need to be done technically for 
 the ODBC driver, and is there anything a non-hacker can do to help?

I would say the OLE-DB driver would be nice to have ready to go.  There 
are apparently a few projects on SourceForge to build one that are making 
good progress, and it would be nice to have a fairly workable solution 
about the time 7.5 rolls out.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgsql-advocacy] What can we learn from MySQL?

2004-04-28 Thread scott.marlowe
On Tue, 27 Apr 2004, Andrew Payne wrote:

 
 Scott Marlowe wrote:
 
  While Apache is and has been wildly popular for bulk hosting and domain
  parking, for serious commercial use, Netscape's enterprise server, now Sun
  One, has long been a leader in commercial web sites.
 
 Netscrape/SunONE may have been a leader in some sub-market, but this misses
 the point.

Not A submarket, THE submarket, enterprise class application server, i.e. 
web commerce and such.  Just because apache hosts hundreds of thousands of 
personal web sites with all static content does not make it a market 
leader.  When it came to commercial usage, apache still had to fight its 
way to the top.

 Apache + NCSA never had less than 50% market share, overall.
 
   http://news.netcraft.com/archives/web_server_survey.html

Again, if 98% of those sites are personal web sites with static content, 
(they certainly were until a few years ago) and you remove those from the 
counting, then you find out that in enterprise class web servers, apache 
had sound competition it is only now starting to consume.

 Postgres is in a completely different situation:  95+?% of the world's
 databases don't run on Postgres, and it's been this way for a long time.

And some large percentage of the world's app servers were running on 
something other than apache for quite some time too.

If postgresql was ubiquitous as the database of choice for simple access 
type applications, it would still have to earn its stripes in the 
enterprise one at a time.

 My point:  Apache was successful in a situation that may not apply here.

I agree that the situations aren't exactly the same, but they're more 
similar than most people realize.  Apache was never a market leader in the 
enterprise realm until fairly late in the 1.3.x series releases.

 Does anyone know of an open source project that *has* successfully displaced
 a market of mature, established products WITHOUT a commercial entity
 providing marketing, support & direction?

gcc?


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Bringing PostgreSQL torwards the standard regarding

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Josh Berkus wrote:

 Shachar,
 
  I think the consensus was that the runtime part was approx. four lines 
  where the case folding currently takes place. Obviously, you would have 
  to get a var, and propagate that var to that place, but not actually 
  change program flow.
 
 That's only if you ignore the system catalogs entirely, which maybe you're 
 prepared to do.  If you want to change case folding for the system catalogs, 
 though, you'll need to update code in thousands of places, because the 
 back-end code is expecting lower-case identifiers.

As someone who has discussed this with Tom in the past, I seem to remember 
that there were major issues with handling the system catalogs, because 
internally, the backends treat the identifiers as if they have already 
been quoted.

I think the answer to all of this would require a lot of code being 
touched to either make it case fold, costing performance, or the 
replacement of the default lower cased catalog with upper cased catalog.

I.e. no simple switch setting, but an initdb option that would be set, like 
locale currently is, for the life of the cluster.

A more comprehensive solution, one which allowed switching from upper 
folding to lower folding to no folding, to case insensitive, or some 
subset of those possibilities results in 

a:  slower backend performance, due to folding case for system catalogs
b:  touching a helluva lot of backend code to make it possible to fold up 
or down.

I'm not 100% sure on this all, but that seems to be the point Tom and I 
came to in our discussion, and neither of the two solutions seemed very 
good at the time.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Usability, MySQL, Postgresql.org, gborg, contrib, etc.

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Josh Berkus wrote:

  I think that a talented manager could make the case for certain features.
 
 So?   So could any community member with a good grasp of database engineering 
 and an ability to write persuasive e-mails.

I'd like to inject here that I was the one who started the whole argument 
over date style enforcement about a year ago.  It was a bit of a slog at 
first, but I felt my input was good, and the cause was just :-).

Because of that discussion, PostgreSQL accepting dates like 2004-31-03 
and turning them into 2004-03-31 on the fly stopped.  As did postgresql 
accepting a date like 03/31/2004 and turning it into 31/03/2004 
when it was set to accept Euro style dates.

I am not a PostgreSQL hacker, just a user.  My input made the difference.  
No title needed.  What PostgreSQL needs more of is caring users who can 
spot bugs / misbehaviours, make a decent argument about it, and maybe even 
a patch or two.  We're way more short on indians than chiefs in my 
opinion.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Usability, MySQL, Postgresql.org, gborg, contrib, etc.

2004-04-27 Thread scott.marlowe
On Tue, 27 Apr 2004, Jochem van Dieten wrote:

 [EMAIL PROTECTED] wrote:
  (5) Programming languages. We need to make a programming language standard
  in PostgreSQL. plpgsql is good, but isn't someone working on a Java
  language. That would be pretty slick.
 
 IMHO SQL/PSM would be the obvious choice for the standard 
 procedural language. Not only because it is part of the SQL 
 standard (ISO/IEC 9075-4:2003), but also because it is reasonably 
 intuitive and it is the same as what is implemented in DB2 (and 
 in MySQL 5 due sometime not so soon).
 The only problem is that there is no PostgreSQL implementation.

While I can see heading in that direction, the lack of an implementation 
makes this suggestion impractical.  It will take time not just to 
implement it, but to test it, debug it, and let it reach maturity.

PL/pgSQL is mature and well tested, and has a great deal of code already 
written for it.  I'd say including it by default represents little or no 
security risk, and increases the value, out of the box, of postgresql for 
most folks while costing very little in terms of wasted disk space etc.
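
For anyone who hasn't tried it, a trivial sketch of what you get once the 
language is installed (e.g. via createlang plpgsql mydb; the function here 
is made up):

CREATE FUNCTION add_one(integer) RETURNS integer AS '
BEGIN
    RETURN $1 + 1;
END;
' LANGUAGE 'plpgsql';

SELECT add_one(41);  -- returns 42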


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [pgsql-advocacy] What can we learn from MySQL?

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Andrew Payne wrote:

 
 Bruce asked an excellent question:
 
  My question is, What can we learn from MySQL?  I don't know there is
  anything, but I think it makes sense to ask the question.
 
 After watching the traffic on this, the biggest MySQL lesson has gone
 largely unmentioned:  that a well-funded, well-marketed, focused commercial
 entity clearly associated with the project can do wonders to overcome
 feature and technical shortcomings.
 
 At some point (probably there now), I think the lack of a Postgres, Inc.
 is going to hinder adoption.  Companies want to 'buy' from vendors that look
 like real, viable companies, and provide them products with support,
 training, features, and direction.  With MySQL, you get one stop shopping.
 With Postgres, you've got to find and assemble the parts yourself.  Most
 CIOs stop there, and start waiting for MySQL to get better before switching
 from Oracle.

I'm gonna disagree here.  I think that not having a PostgreSQL Inc. to go 
to means that by the time postgresql becomes ubiquitous, it will be like 
apache: no company behind it, every company using it.  I.e. we'll earn 
our stripes one at a time by proving we're the better database for 95% of 
all purposes, and anyone not using postgresql will be behind the power 
curve and doing themselves no favor.  Like CIOs who call Open Source 
"Shareware" and believe that .NET provides for a more efficient 
programming environment, people who poo-poo postgresql will find 
themselves behind the 8 ball in the long run.  No need for a PostgreSQL 
Inc. to do that, just time, good code, and knowledgeable DBAs choosing it 
more and more often.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Is there any method to keep table in memory at startup

2004-04-27 Thread scott.marlowe
On Wed, 21 Apr 2004 [EMAIL PROTECTED] wrote:

 Hi
 I am working on a project in postgres in which I designed a customized data type
 and operations on it. It requires a look up table.
 I have three options regarding this table:
 1. Every time a query is executed it creates the table, assigns values, and after
 execution destroys it... which is overhead.
 
 2. Store the table on disk in the database and access it whenever required, but it
 degrades the performance.
 
 3. Whenever psql starts it can load the table in memory from the database, which is
 the efficient way to do it.

PostgreSQL has no facility to put tables into memory.

Assuming this lookup table will be hit quite often, it WILL be in memory 
for selects.  Updates / deletes / inserts will have to get flushed out to 
disk of course.

The Linux and BSD kernels are both quite good at keeping commonly used 
data in memory.  I think you are mistaken in assuming that an on-disk 
table will be significantly slower than if it was fixed in memory, due to 
the very efficient caching of the most common unix kernels.
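
An easy way to see this for yourself (hypothetical table name; turn on 
\timing in psql first):

SELECT count(*) FROM lookup;  -- first run may have to read from disk
SELECT count(*) FROM lookup;  -- repeats are normally served from cache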


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [pgsql-advocacy] What can we learn from MySQL?

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Andrew Payne wrote:

 For those that look to Apache:  Apache never had a well-established
 incumbent (Oracle), an a well-funded upstart competitor (MySQL).  Rob
 McCool's NCSA httpd (and later, Apache) were good enough and developed
 rapidly enough that they prevented any other HTTP server projects from
 getting critical mass.

This is a followup to my previous message where I mentioned apache, but 
did not really followup on it.

While Apache is and has been wildly popular for bulk hosting and domain 
parking, for serious commercial use, Netscape's enterprise server, now Sun 
One, has long been a leader in commercial web sites.  That has now changed 
too.  While Netscape's server was pretty good, it is simply harder to 
configure, not as versatile as apache, and not as reliable or as fast 
nowadays.  This was not always the case.  There was a time when its 
performance was considered to be much better than apache's (I'm thinking 
of apache 1.3.4 or so) and apache configuration was a black art few 
understood.  With modern GUI tools for configuring apache, and the 
incredible performance gains the late model 1.3 versions and now 2.0.x 
versions have, it is quickly displacing the more expensive Netscape.

Apache did not start out in first place when it comes to enterprise class 
web servers, no matter how many small personal web sites ran on it.  Most 
commercial companies didn't use it at first.  It too had to earn its 
stripes over time by proving it was better.  Now I know people who 
think Open Source is just so much pie in the sky hand waving philosophical 
candy who think apache and jboss are the bomb.  They'll come around on 
PostgreSQL too, once someone with some foresight points out the advantages 
it has.  And one of its advantages is that it doesn't have a large 
monolithic organization driving development.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] TPC H data

2004-04-27 Thread scott.marlowe
On Wed, 21 Apr 2004, Shalu Gupta wrote:

 Hello,
 
 We are trying to import the TPC-H data into postgresql using the COPY
 command and for the larger files we get an error due to insufficient
 memory space.
 
 We are using a linux system with Postgresql-7.3.4
 
 Is it that Postgresql cannot handle such large files or is there some
 other possible reason.

What method(s) are you using to load the data?
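
For reference, dbgen's flat files are '|'-delimited, so the load would 
normally look something like this (the path is just an example, and note 
that dbgen ends each line with a trailing '|', which may need stripping 
first):

COPY lineitem FROM '/tmp/lineitem.tbl' WITH DELIMITER '|';

If a single huge COPY is what's exhausting memory, splitting the file and 
loading it in several smaller chunks is the usual workaround.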


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [pgsql-advocacy] What can we learn from MySQL?

2004-04-23 Thread scott.marlowe
On Fri, 23 Apr 2004, Bruce Momjian wrote:

 Here is a blog about a recent MySQL conference with title, Why MySQL
 Grew So Fast:
 
   http://www.oreillynet.com/pub/wlg/4715
 
 and a a Slashdot discussion about it:
 
   
  http://developers.slashdot.org/article.pl?sid=04/04/20/2229212&mode=nested&tid=137&tid=185&tid=187&tid=198
 
 My question is, What can we learn from MySQL?  I don't know there is
 anything, but I think it makes sense to ask the question.

My immediate rhetorical response is What could the Tortoise learn from 
the Hare?

I think we all know which is which in my question.

 Questions I have are:
 
   o  Are we marketing ourselves properly?

I'm never sure about this.  I think the best marketing is experienced 
users selling pg to their bosses one at a time.  While our MSSQL servers 
at work have died under load innumerable times, our small collection of 
postgresql servers (one's so old and embedded it's running 6.4) have been 
very reliable.  So, slowly but surely, PostgreSQL is proving itself here.

   o  Are we focused enough on ease-of-use issues?

Enough for me, but I don't think databases should necessarily be all that 
easy to use by people who don't understand basic relational theory.  So 
for me, ease of use means things like transactional DDL and well indexed, 
well written documentation.  I suspect ease of use for my boss is 
something entirely different, and may have to do with why he bought the EMS 
postgresql manager packages he did.

   o  How do we position ourselves against a database that some
  say is good enough (MySQL), and another one that some
  say is too much  (Oracle)

Hey, we're like the porridge in Goldilocks, just right... :-)

DBA folks don't pick MySQL, because it's so limited and basically has 
so many bugs ("it's a feature that we don't bounds check data!")  And it's 
pretty easy to get an Oracle guy to play with postgresql when you show him 
things like transactional DDL.
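
For instance, this little demo usually gets their attention, since in 
Oracle a DDL statement implicitly commits, while here it rolls back 
cleanly:

BEGIN;
CREATE TABLE demo (id integer);
-- changed our mind; the table vanishes with the rollback
ROLLBACK;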

   o  Are our priorities too technically driven?

I don't think so.  But I'm a database / coder geek.  :-)


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] contrib vs. gborg/pgfoundry for replication solutions

2004-04-21 Thread scott.marlowe
I almost agree, but I think things that are being actively developed to 
eventually move into the backend, like autovacuum or Slony-I, should be in 
contrib.  Things that aren't destined for backend integration should be 
removed though, like pgbench or dblink or whatnot.

On Wed, 21 Apr 2004, Joshua D. Drake wrote:

 Hello,
 
 My personal opinion is that contrib should be removed entirely. Just 
 have a contrib.txt that says all contrib modules are at pgfoundry or 
 whatever.
 
 Sincerely,
 
 Joshua D. Drake
 
 
 Jan Wieck wrote:
 
  Taking into account that quite a few people have repeatedly stated that 
  the components in contrib are considered more supported/recommended than 
  similar solutions found on gborg or any other external site, I suggest 
  we move the projects dbmirror and dblink to gborg. The rserv contrib 
  module seems to me to be an early Perl prototype of erserver, nobody is 
  working on any more. I suggest we drop that entirely.
  
  Comments/alternatives?
  
  
  Jan
  
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
 


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] pg_encoding not needed anymore

2004-04-20 Thread scott.marlowe
On Tue, 20 Apr 2004, Bruce Momjian wrote:

 Andrew Dunstan wrote:
  past.  I think createuser is much worse.  :-)
  

  
  Agreed. Actually, the big problem with the name initdb is that the 
  name is misleading, and newbies often get confused by it. You are 
  preparing a data store for many databases, not a single database.  But I 
  think it is far too sanctified by history to change now, just as Ken 
  Thompson now wishes he had put an 'e' on the end of 'creat' but can't go 
  back and fix it. Maybe we should think about a symlink/hardlink to use a 
  better name.
 
 Yea, initcluster would have been better, but "cluster" confuses with
 CLUSTER, just like "database schema" confuses with CREATE SCHEMA.

Maybe initpg or pg_init or something like that?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Remove MySQL Tools from Source?

2004-04-16 Thread scott.marlowe
On Fri, 16 Apr 2004, Christopher Kings-Lynne wrote:

  I always ran one of the 2 scripts (can't remember which one) and after that 
  started checking the dump file, because there were things that didn't get 
  changed correctly[1].
  
  [1]: I always remember the first conversion I did. I found out that MySQL 
  accepted dates like 30/2/2000 or 0000-00-00.
  Very odd.
 
 Yes, MySQL has always accepted those as perfectly valid dates.  It's 
 quite broken.

The sad thing is that while MySQL implemented a --ansi switch that 
supposedly turns on ANSI compliance, it only fixes about half of all the 
non-compliance issues.  Yet another half-implemented feature... :)


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] make == as = ?

2004-04-08 Thread scott.marlowe
On Thu, 8 Apr 2004, Bruce Momjian wrote:

 Fabien COELHO wrote:
  
This would help me, at least, write correct and portable SQL. :)
  
   Added to TODO:
  
 * Add a session mode to warn about non-standard SQL usage
  
  So it seems that having C-like operators would hurt a lot;-)
  
  So you want to generate warnings for SERIAL, TEXT and a bunch of other
  types, WITH[OUT] OIDS, RULE, ~ regular expressions, arrays, any use of
  pg_catalog instead of information_schema (I may be wrong in the list, I
  don't have the standard at hand, but I think I'm right in the spirit)...
  
  This is going to be noisy;-)
 
 Yep, it sure is going to be noisy.

Could we consider a three (or more) way setting for what to do?  
Something like:

sql_noncompliance_mode = error;
sql_noncompliance_mode = warn / notice;
sql_noncompliance_mode = ignore;

Just wondering...


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_dump and INCREMENT BY

2004-04-07 Thread scott.marlowe
On Wed, 7 Apr 2004, Dennis Bjorklund wrote:

 On Wed, 7 Apr 2004, Dennis Bjorklund wrote:
 
 Replying to myself here :-)
 
  wants to import it into a 7.3 database. Use the 7.3 dump you might say, 
  but since BY does not do anything why not remove it from the dump output?
 
 I just realized there is yet another new construct in 7.4. The sequences
 are created with NO MAXVALUE (and NO MINVALUE). Couldn't these just be
 removed in the pg_dump output in this case? They're the default anyway,
 right?
 
 I just imported a fairly big dump from a 7.4 database into a 7.3 database 
 and the sequences were the only thing I had to change. Of course the 
 database itself was designed on 7.3 so it didn't use any 7.4 features.
 
 It's not a big issue, it's just that it seems simple to dump SQL code (for
 sequences) that works the same on both 7.4 and older, not just 7.4.

Maybe a switch to set the destination version would be a good compromise?


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] what do postgresql with view ?

2004-04-06 Thread scott.marlowe
On 31 Mar 2004, elrik wrote:

 [EMAIL PROTECTED] (elrik) wrote in message news:[EMAIL PROTECTED]...
  In information I have:
  
  1. when creating a view: PostgreSQL parses the query and stores the query tree.
  2. when using it: PostgreSQL uses this tree like a subselect.
  
  my question: Does PostgreSQL make an analysis of the resulting tree?
 
 
 I can answer by myself: YES!
 
 In fact the problem is when I have more than 11 tables in the query...

Sounds like you might be wrestling with the genetic query optimizer.  Try 
turning up the geqo threshold and see if things work better.
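
The relevant knob can be set per session or in postgresql.conf (11 is the 
default in this era):

SHOW geqo_threshold;
SET geqo_threshold = 15;  -- exhaustive planner for queries of up to 14 FROM items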


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] pg_dump end comment

2004-03-31 Thread scott.marlowe
On Wed, 31 Mar 2004, Philip Warner wrote:

 At 12:13 AM 31/03/2004, Bruce Momjian wrote:
 Yes, they have to check for a proper exit from pg_dump, but there is
 still a file sitting around after the dump, with no way to tell if it is
 accurate.
 
 Why don't we write a hash into the header or footer. Then use something like:
 
  pg_restore --verify dump-file
 
 if file integrity is the objective.

I like this idea.  Nice to be able to check the md5 sig to make sure a 
backup is not corrupted OR short.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing security in a shared environment ...

2004-03-29 Thread scott.marlowe
On Mon, 29 Mar 2004, Marc G. Fournier wrote:

 On Mon, 29 Mar 2004, Dave Page wrote:
 
  It's rumoured that Euler Taveira de Oliveira once said:
   Hi Christopher,
  
The \l command should only list databases that the current user is
authorized for, the \du command should only list users authorized
for the current database (and perhaps only superusers should get
even that much information), etc.  Perhaps it is possible to set PG
to do this, but that should probably be the default.
   
    Seems reasonable. Why not prevent normal users from digging in the pg_catalog?
    What is the impact of it?
 
  Because they can't use tools like pgAdmin or phpPgAdmin unless they can at
  least read all the catalogs.
 
 k, but what I'm suggesting shouldn't prevent that, should it?  They should
 only be able to see those resources that they have permissions to see, not
 all of them ... no?

I think an auto-filtering system for \l and other backslash commands as 
needed makes a lot more sense than trying to deny access to the catalogs.  
Obscuring them for security reasons is no win, really.  Obscuring them so 
user number 1,000,000 in his own database doesn't have to look at user 
numbers 1 through 999,999 to see his database go by, however, is.

While I'm not sure I'd build a 1,000,000 user database, somewhere between 
the 80 we currently have at work and a few thousand you'd go nuts if you 
saw a bunch of data that didn't belong to you every time you hit \l.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Log rotation

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, Andrew Dunstan wrote:

 . Peter Eisentraut's program
   pro: portable, better featured, no license issues
   con: code state uncertain, less well tested

Where is Peter's code available?  I'd like to try it out.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread scott.marlowe
On Thu, 11 Mar 2004, Josh Berkus wrote:

 Scott,
 
  I like it.  Would a multiplier be acceptable?  
  default_stats_index_multiplier = 10
 
 Yeah, I thought about that, but a multiplier would be harder to manage for 
 most people.I mean, what if your default_stats are at 25 and you want 
 your index_stats at 40?   PITA.   Also, if you wanted to increase the default 
 stats but *forgot* that the index_stats were a multiplier ...
 
 I think a straight number is less confusing.

But possibly more error prone.  If you crank up the default statistics to 
50, but the index default is still 25...  OTOH, you could always have the 
setting used for the index default be whichever is greater... hmmm.
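
Worth remembering that the per-column knob already exists, so a DBA can do 
this by hand today (table and column names made up):

ALTER TABLE mytable ALTER COLUMN indexed_col SET STATISTICS 40;
ANALYZE mytable;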


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] About hierarchical_query of Oracle

2004-03-10 Thread scott.marlowe
On Wed, 10 Mar 2004, Li Yuexin wrote:

 Who can tell me how to complete oracle's hierarchical_query through 
 postgresql?

Look in the contrib/tablefunc directory for the connectby() function.
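
A sketch of what a call looks like once tablefunc.sql has been loaded into 
the database (table and column names are made up; check the README in 
contrib/tablefunc for the exact signature):

SELECT *
  FROM connectby('tree', 'id', 'parent_id', '2', 0)
    AS t(id text, parent_id text, level int);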


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] raising the default default_statistics_target

2004-03-09 Thread scott.marlowe
On Sun, 7 Mar 2004, Tom Lane wrote:

 Neil Conway [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  This is something we need to consider, but we'll need more evidence
  before making a choice.  One thing that we have very little data about
  is how much difference it makes in the quality of planner choices.
 
  Right, but is there a practical way to actually get this data?
 
 I haven't thought of one yet, but perhaps someone will have an idea.

Hi Tom.  I ran some very simple tests on analyze times and query plan 
times on a very simple table, with data randomly distributed.  The index 
was on a date field, since that's what I was testing last.

This was all done on my 512Meg memory 1.1GHz celeron workstation with an 
IDE drive.  I'd love more input on better testing methodologies here...

with 100k or 1M rows that look kinda like this:  (I'll test 10M rows 
later, which means the dataset won't fit in memory, so there'll be lots of 
access going on.  Right now the 1M row table is 80 meg)

 select * from test2 limit 5;
                     info                      |         dt          |   id
-----------------------------------------------+---------------------+---------
 Francize perfectible swirling fluctuates      | 2004-05-20 20:12:04 | 2721995
 Fields chauffeur attentionality grandmother   | 2004-04-07 14:36:02 | 2721996
 Belgium bilked explosively defendant          | 2004-09-16 16:27:22 | 2721997
 perspectives Buenos Pollux discriminates      | 2004-11-11 12:28:31 | 2721998
 Victorianize Savonarola blackmails sufficed   | 2004-02-27 21:17:20 | 2721999
(5 rows)

Here's what I get with different statistics targets (analyze times for the 
100k and 1M row tables, plus plan times where I measured them):

target   100k analyze (ms)   1M analyze (ms)   1M plan (ms)
    10                 250                875             2
    20                 350               1250
    30                 430               1500
    40                 520               1725
    50                 580               1900
    60                 690               2100
    70                 775               2175
    80                 850               2300
    90                 950               2400
   100                1000               2600           2.5
   200                1806               3700
   300                2600               4800
   400                2600               5900
   500                2600               7200
   700                2600               9500
  1000                2600              13000             5

Since this data is randomly distributed, I didn't bother doing a lot of 
testing to see how accurate each target setting was.  If that would be 
useful to know I'd gladly test it, but I was only setting out to test the 
time to analyze and the time to plan.

Note that I only tested 3 targets for planning time, as it didn't seem to 
make a very big difference.  The query was:

select * from test2 where dt between 'march 11, 2004' and 'march 13, 2004';

I also ran some quick tests on smaller tables (1000 and 10k rows) and 
there, the plateau that we see in the 100k analyze shows up much quicker, 
at something like 50 or so.  I.e. the analyze time flattened out quickly 
and higher numbers cost very little if anything.

Since this query was quite an easy plan, I'd expect to need a much more 
complex one to test the increase in planning time, say something that has 
to look at a lot of statistics.  Any particular join type or something 
that's likely to do that?
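
For anyone wanting to reproduce this, the test boils down to something 
like the following, run in psql with \timing turned on:

SET default_statistics_target = 100;
ANALYZE test2;
EXPLAIN ANALYZE select * from test2
  where dt between 'march 11, 2004' and 'march 13, 2004';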



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-hackers-win32] [HACKERS] Tablespaces

2004-03-05 Thread scott.marlowe
On Fri, 5 Mar 2004, Thomas Swan wrote:

 [EMAIL PROTECTED] wrote:
 
 [EMAIL PROTECTED] writes:
 
 [EMAIL PROTECTED] wrote:
 
 Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 
 My feeling is that we need not support tablespaces on OS's without
 symlinks.
 
 To create symlinked directories on Win2k NTFS see:
   http://www.sysinternals.com/ntw2k/source/misc.shtml#junction
 I think Win2000 or XP would be a reasonable restriction for Win32 PG
 installations that want tablespaces.
 
 Oh, good --- symlinks for directories are all that we need for this
 design.  I think that settles it then.
 
 What archival tools are there that would restore this back to the
 filesystem: tar? zip?  What would happen if a symlink were removed or
 pointed to an invalid location while the postmaster was running?
 
 Well, for backup, just run tar or find on /data with a flag to
 follow symlinks, and you are done.  Can't get much easier than
 that.
 
 I'm referring to NTFS and the win32 platforms.  How does tar handle
 these symlinks on the NTFS filesystem?  What about if someone finds
 that FAT32 is significantly better for the database?
 
 tar doesn't know anything about PostgreSQL system catalogs.  If we use
 symlinks for tablespaces then it would be possible to backup downed
 databases with a simple tar command on every platform *I* care about
 (and probably Windows too).  Using system catalogs for this stuff
 would simply guarantee that I would have to read the system catalogs
 and then back up each tablespace manually.  In short, your idea would
 trade off (maybe) having to backup tablespaces manually on a few
 platforms for the certainty of having to backup tablespaces manually
 on all platforms.
 
 How is that a win?
 
 Apparently, I have failed tremendously in addressing a concern.  The
 question is does PostgreSQL need to rely on symlinks and will that
 dependency introduce problems?
 
 There is an active win32 port underway (see this mailing list).  One
 proposal was to try to use an OS specific filesystem feature to perform
 a symlink on NTFS.  Can the special symlink that NTFS allegedly supports
 be archived the same way symlinks are archived on Unix?  If so, is there
 a utility that can do this (zip, tar, etc)?  The backup operator would
 still need to know what directories needed to be archived in addition to
 the pgdata directory.  Is this symlink structure a normal/special file
 that can be archived by normal means (tar, zip, etc)?

According to this page:

http://www.linuxinfor.com/en/man1/ln.1.html

from the Linux man pages:

"On existing implementations, if it is at all possible to make a hard link 
to a directory, this may be done by the superuser only. POSIX forbids the 
system call link(2) and the utility ln to make hard links to directories 
(but does not forbid hard links to cross filesystem boundaries)."

It also states that the command ln is POSIX 1003.2, but that "POSIX 1003.2 
(1996) does not discuss soft links. Soft links were introduced by BSD, and 
do not occur in System V release 3 (and older) systems."

I fear the more useful of the two would be soft links, but if soft links 
are not a part of the POSIX standard, then postgresql probably shouldn't 
base key features on them unless said features would be onerous to 
implement without soft links.

Oddly enough though, Microsoft has now released their Unix services 
package for free, and it comes with the commands to create a symbolic 
link; it runs on Windows NT 4.0, Windows 2000, Windows XP, and Windows 
Server 2003.

So, soft links would appear not to be a real non-starter, as long as the 
ability to make soft links on those systems won't rely on having some 
strange package installed (like MS's Unix services package).

I imagine there's a standard OS call in modern MS OSes that will let you 
create a symbolic link with no special libs installed, and if that's the 
case, then the argument that maybe FAT would be a better file system comes 
under the same heading as running your database on NFS.  Neither is a good 
idea, and PGSQL can't guarantee normal, reliable operation.




 
 Example:
 
 PGDATA is C:\pgdata
 I have a tablespace in Z:\1\ and Z:\2\
 There exists an alleged symlink in
 C:\pgdata\data\base\tablespaces\schmoo -> Z:\1
 
 Can I archive [ C:\pgdata, Z:\1, Z:\2 ], restore them, and have
 postgresql working just as before?

Have you tried building said structure and backing up and restoring it to 
see?

 We aren't talking about a feature that work[s] on Linux but not on
 FreeBSD.  We are talking about a feature that works on every OS that
 supports symlinks (which includes even operating systems like Windows
 that PostgreSQL doesn't currently support).
 
 Hello?  What was this response from Tom Lane?  "My feeling is that we
 need not support tablespaces on OS's without symlinks."  That seems to
 be indicative of a feature set restriction 

Re: [HACKERS] BTrees with record numbers

2004-03-01 Thread scott.marlowe
On 27 Feb 2004, Chad wrote:

 Is it possible for Postgres Btrees to support access by logical row number?
 If not available, is it a huge job to support for somebody willing to have a go?

Are you talking about logical row numbers as maintained by your own code 
outside the database, or having postgresql suddenly start maintaining 
logical row numbers?  I doubt postgresql will ever have built-in logical 
row numbers, because maintaining them in a materialized way is quite 
expensive.

On the other hand, if you just need logical row numbers for a one off 
thing, you can do this:

create temp sequence judy;
select *, nextval('judy') from mytable;
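
If the numbers need to follow a particular order, ordering in a subselect 
usually does the trick in practice, though SQL makes no hard guarantee 
about the outer evaluation order (the id column here is made up):

create temp sequence judy2;
select *, nextval('judy2') as rownum
  from (select * from mytable order by id) as sub;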


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Tablespaces

2004-02-27 Thread scott.marlowe
On Fri, 27 Feb 2004, Zeugswetter Andreas SB SD wrote:

 
   Ahh. I forgot to detail my ideas on this. It seems to me that we cannot
   drop a table space until the directory is empty.
 
 Agreed.
 
  
  How would it get to be empty?  Are you thinking of some sort of connect
  database to tablespace and disconnect database from tablespace
  commands that would respectively create and delete the per-database
  subdirectory?  That seems moderately reasonable to me.  We could then
 
 I would only allow the drop if the directory only contains empty db oid 
 directories.

Wouldn't this be better tracked in the dependency tracking that's already 
built into postgresql?  Checking to see if the directory is empty is open 
to race conditions, but locking the dependency tracking while dropping a 
tablespace isn't.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Tablespaces

2004-02-27 Thread scott.marlowe
On Fri, 27 Feb 2004, Gavin Sherry wrote:

 On Thu, 26 Feb 2004, Alex J. Avriette wrote:
 
  On Thu, Feb 26, 2004 at 11:22:28PM +1100, Gavin Sherry wrote:
 
   Certainly, table spaces are used in many ways in oracle, db2, etc. You can
   mirror data across them, have different buffer sizes for example.
   In some implementations, they can be raw disk partitions (no file system).
   I don't intend going this far, however.
 
  Perhaps now would be a good time to bring up my directio on Solaris question
  from a year or so back? Is there any interest in the ability to use raw
  disk?
 
 I do not intend to undertake raw disk tablespaces for 7.5. I'd be
 interested if anyone could provide some real world benchmarking of file
 system vs. raw disk. Postgres benefits a lot from kernel file system cache
 at the moment. Also, I believe that database designers have traditionally
 made bad file system designers. Raw database partitions often lack the
 tools essential to a scalable environment. For example, the ability to
 resize partitions.

Is it possible / reasonable / smart and-or dumb to look at implementing 
tablespaces as riding atop the initlocation-handled stuff?  I.e. 
postgresql would only create tablespaces in areas that are created by 
initlocation, thus keeping it in its box, so to speak.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Tablespaces

2004-02-27 Thread scott.marlowe
On Fri, 27 Feb 2004, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  Is possible / reasonable / smart and or dumb to look at implementing the 
  tablespaces as riding atop the initlocation handled stuff.
 
 In my mind, one of the main benefits of this work will be that we'll be
 able to get *rid* of the initlocation stuff.  It's a crock.

OK, that's fine, but I keep thinking that a superuser should have to 
create the tablespace itself, and then tables can be assigned by users 
based on the rights assigned by the dba / superuser.  Is that how we're 
looking at doing it?  Or will any user be able to create a tablespace 
anywhere postgresql has write permission, or will only dbas be able to 
create AND use tablespaces?  I'm just not sure how that's gonna be 
handled, and haven't seen it addressed.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Sparc optimizations

2004-02-24 Thread scott.marlowe
On Tue, 24 Feb 2004, Shridhar Daithankar wrote:

 http://www.osnews.com/printer.php?news_id=6136

That page gets a "please don't link to printer ready pages" error and 
redirects here:

http://www.osnews.com/story.php?news_id=6136




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] MS SQL features for new version

2004-02-17 Thread scott.marlowe
On Tue, 10 Feb 2004, Rodrigo wrote:

 Shridhar Daithankar wrote:
  Just stumbled upon this. just an FYI,
  
  http://www.microsoft.com/sql/yukon/productinfo/top30features.asp
  
   Shridhar
 
  From the page:
 
   A new Snapshot Isolation (SI) level will be provided at the
   database level. With SI, users will be able to access the
   last committed row using a transitionally consistent view
   of the database. This capability will provide greater
   scalability for very large database (VLDB) implementations.
 
 Is Snapshot Isolation == MVCC ?

I think it goes that MVCC is a kind of snapshotting, but snapshotting could 
be provided in more ways than just MVCC.

But I'm not 100% certain of that.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] How can I have 2 completely seperated databases in

2004-02-12 Thread scott.marlowe
On Wed, 11 Feb 2004 [EMAIL PROTECTED] wrote:

 Thank you very much for your reply.
 
 Yes, that's true. But it seems not a good idea if I have many databases
 and I want them totally separated from each other.
 
 What's your opinion? Thanks.

OK, here's the issue.  Postgresql uses certain resources in a shared 
manner, and other resources are completely separate.  For instance, the 
shared memory buffers are shared within a single instance or cluster by 
all the databases.  One data directory, one collation, and one set of 
logs are also shared by one instance.

The individual databases within a cluster share a set of global users.  
I.e. if I create a user in one database, he can then be granted access to 
the other databases (or denied access) with a simple change to 
pg_hba.conf.  So, it's very easy to add / remove people's access to 
individual databases.
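
For example, the per-database access control in a 7.3/7.4-era pg_hba.conf 
looks something like this (database, user, and network are made up):

# TYPE  DATABASE  USER   IP-ADDRESS   IP-MASK        METHOD
host    sales     alice  192.168.1.0  255.255.255.0  md5
host    sales     all    0.0.0.0      0.0.0.0        reject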

If you separate out each database into its own instance, you now have two 
(or more) postgresql instances, each with a different data directory, 
shared memory buffers, user list and passwords.  I.e. now nothing passes 
between them, at all.

If you would have allocated 128 megs of shared buffer memory for a single 
cluster which contained 4 databases, and you split those out into 
individual instances, then you'd need to give each of the four clusters / 
instances 32 megs of shared buffers, or you'd risk using up more memory than 
before.

With the single cluster, if one database has a lot of data to sling 
around, and the others are handling a few k at a time, it has 128 Megs to 
work in.  With four clusters, no matter how little the other three are 
working, you'd only have 32 meg to play in.

Taking this to the logical extreme of having n databases, where n is 
fairly good size, say 20, 40, or 100, then you have issues that if you set 
up each database with enough shared memory to do its job when it needed 
to, you risk starting a swap storm should a couple dozen of those 
databases have a few large result sets open, thus using up all the shared 
memory they'd have been allotted.

Lowering the shared memory for each database low enough to prevent this 
would result in individual databases that each had very small amounts of 
shared memory.

Also, maintenance gets harder.  You now have to vacuum multiple separate 
clusters, and you need to schedule it so that you don't have two or three 
running at once and swamping your storage subsystem.

For certain setups, multiple clusters are a great thing.  I've used them 
as hot backups where I put a secondary instance online, placed its 
storage on a NAS, backed up to it, and then shut it down to have a cold 
live spare, or used it for testing administrative procedures one shouldn't 
practice on a live database.

But for separating out applications from each other, there's really 
nothing to be gained by putting each separate database application into 
its own cluster.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] How can I have 2 completely seperated databases in

2004-02-12 Thread scott.marlowe
On Thu, 12 Feb 2004, Rod Taylor wrote:

  But for separating out applications from each other, there's really 
  nothing to be gained by putting each separate database application into 
  its own cluster.
 
 I believe the initial email requested individual logs, and presumably
 the ability to grant superuser access without risking a user crossing
 into another clients space.

Well, I got the feeling it was only the logs he wanted separate, and if 
that's all he wanted, then separating will get him that, but there's a 
cost, and I wanted to make sure he understood that as well.

I'm not sure he was intimating separate superusers, but I could 
definitely see situations where that would be useful.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] RFC: Query Planner making a distinction between Cross

2004-02-12 Thread scott.marlowe
On Thu, 12 Feb 2004, Stef wrote:

  U.  Postgresql doesn't natively support cross database queries...
  
 
 I know, but it does schemas, and currently, the same
 notation is used to specify schemas as 'cross database'.
 
 So the planner often reports 'cross-database not allowed'
 in areas where it should at least report 'cross-schema
 support is unavailable for this'.
 
 Case in point, the example trigger. I would have expected
 a deliberate schemaname.table during an insert to work, but
 instead the parser complains about cross-database.

I would think just changing the error message to "no schema by the name of 
suchandsuch found" would make it pretty clear.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] RFC: Query Planner making a distinction between Cross

2004-02-12 Thread scott.marlowe
On Thu, 12 Feb 2004, Stef wrote:

   Case in point, the example trigger. I would have expected
   a deliberate schemaname.table during an insert to work, but
   instead the parser complains about cross-database.
  
  I would think just changing the error message to "no schema by the name of 
  suchandsuch found" would make it pretty clear.
  
 
 Indeed, the only problem being that this is a 
 -deliberately- named schema and it does exist,
 just that it's not in the search_path. Surely
 pl/pgsql or the parser should be able to search the
 schema names if I give it a -deliberate- name and,
 if it isn't there, say 'not found'?
 
 So, either that's an error (improper parsing/expr
 forming) in the pl/pgsql trigger code, or, well,
 I don't know. 
 
 Thoughts? Comments? Barking mad?

Hmmm.  I would think the first step would be to simply change the "cross-db 
queries aren't supported" message to one of "schema either does not exist 
or is not in the search path".
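
Outside of the pl/pgsql corner case Stef hit, the two normal spellings 
(schema and table names made up) are an explicit qualification, or putting 
the schema on the path first:

INSERT INTO otherschema.mytable (id) VALUES (1);

SET search_path TO otherschema, public;
INSERT INTO mytable (id) VALUES (1);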


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] How can I have 2 completely seperated databases in

2004-02-11 Thread scott.marlowe
On Wed, 11 Feb 2004 [EMAIL PROTECTED] wrote:

 Hi, all
 
 What should I do if I want to have 2 completely separated databases in
 PostgreSQL? I want each database to have its own data, log and
 everything needed to access that database. I don't want them to share
 anything. Has anyone done this before? Or,
 could anyone give me some clue of how to do this?

You're telling us what you want, but not the why.  The why may help us to 
figure out both how to do it, and whether or not it's a good idea.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PATCHES] Current-stream read for psql's \copy

2004-02-10 Thread scott.marlowe
On Tue, 10 Feb 2004, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  No it doesn't.  EOF will do fine.  The source program doesn't
  necessarily have to know anything about COPY, as long as its output is
  in a format COPY can cope with (eg, tab-delimited).
 
  The current behavior seems quite strange and counter-intuitive.  We
  might break the code for 1-2 people, but we will make it more
  predicable for everyone using it.
 
 I don't think it's acceptable to simply remove the functionality.
 If you wanted to argue about swapping the meanings of STDIN and -
 (as defined by the patch) then we could debate about which way is
 more consistent and whether it's worth breaking backwards compatibility
 to improve consistency.  I could probably be talked into supporting
 that; as you say, we've done that before.  But taking out a useful
 behavior that has been there a long time, simply because you've decided
 it's unintuitive, is not okay.

Why not make a -i switch (for input file) that does it the way Bruce wants 
and leave -f alone so people who depend on it behaving the way it does 
won't get a surprise in their next upgrade?


---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] bug in substring???

2004-02-06 Thread scott.marlowe
I'm using substring.  Since I'm a coder more than a database guy, I 
expected this:

select substring('abcdefgh',0,4);

would give me 

abcd

but it gives me a left aligned 'abc'

select substring('abcdefgh',1,4);

works fine.

select substring('abcdefgh',-4,4);

gives me nothing.  Shouldn't a negative offset, or even 0 offset result in 
an error or something here?  Or is there a special meaning to a negative 
offset I'm not getting?

Just wondering.  




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] bug in substring???

2004-02-06 Thread scott.marlowe
On Fri, 6 Feb 2004, Joe Conway wrote:

 scott.marlowe wrote:
  gives me nothing.  Shouldn't a negative offset, or even 0 offset result in 
  an error or something here?  Or is there a special meaning to a negative 
  offset I'm not getting?
 
 In varlena.c there is this comment:
 
   * text_substr()
   * Return a substring starting at the specified position.
   * - thomas 1997-12-31
   *
   * Input:
   *   - string
   *   - starting position (is one-based)
   *   - string length
   *
   * If the starting position is zero or less, then return from the start
   *  of the string adjusting the length to be consistent with the
   *  negative start per SQL92. If the length is less than zero, return
   *  the remaining string.

thanks.  I just got done looking up the SQL explanation, and I think my 
head exploded.  Thanks for the heads up.
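
A quick sketch of how that SQL92 adjustment plays out (same string as in 
my original post):

select substring('abcdefgh', 0, 4);
-- the start is shifted up to 1 and the length shrinks, giving 'abc'
select substring('abcdefgh', -4, 4);
-- the requested range ends before position 1, so the result is empty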


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Question on database backup

2004-02-04 Thread scott.marlowe
On Wed, 4 Feb 2004, Michael Brusser wrote:

 We have customers who prefer to use their backup facilities
 instead of what we provide in the app (we use pg_dump)
 I hear speed is at least one consideration.
 
 The questions I need to answer are these:
 
 1) Is this absolutely safe to do file copy (cpio, or 
 smth. else, whatever the robust backup app. would use)
 on the Postgres db, when it's completely shut down.

Yes, it is.

 2) Same question, but the database is up and running in
 read-only mode. We're making sure that no updates are 
 taking place.

Most likely, it is.  No guarantees if the database is up and running, even 
if you're certain there are no updates happening.

Also, you can use a snapshotting file system to make a backup image and 
then back up the image, while the database is up and being accessed, both 
by readers and writers, assuming you get REAL snapshots.

 If it matters - this is on Solaris, HP, Linux.
 We're using v.7.3.2 running on UDS and v.7.3.4 on TCP/IP
 We provide no explicit settings for wal, fsync and the like.
 And (yes, I know) they often install it on NFS.

OK, here are a couple more issues to chew on as well.  Suppose you have a 
backup, and the database server was compiled with UNKNOWN switches.  The 
machine burns to the ground.  Now you get to try and figure out how to 
compile the database on the new server so it can read the old dataset.  
This may or may not be a complete friggin' nightmare for you.  Dumps can 
move between versions / hardware configs / differently compiled versions 
of postgresql with some amount of reliability.  Binary copies may or may 
not move so easily.

Scenario II: the HP burns to the ground, and your boss just buys a big old 
intel box.  How do you get your data up and running with a binary backup?  
You don't.

Scenario III: subtle corruption gets into your dataset due to a bad block 
or whatnot.  No one notices for a while.  Suddenly, someone notices.  
With only file system backups, with no error messages in them, how do you 
determine when the corruption occurred and get the uncorrupted data out 
while leaving the corrupted data behind?

Plain and simple: PostgreSQL is designed to be backed up by pg_dump.  
Using anything else isn't supported, so to speak, and may cause you 
untold grief in the future.  That said, sometimes file backups are the 
perfect solution; just go into it with your eyes open to the possible 
problems, and I'd make a pg_dump every so often just in case.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Write cache

2004-01-28 Thread scott.marlowe
On Wed, 28 Jan 2004 [EMAIL PROTECTED] wrote:

 I agree I MAY have an hardware problem. What happens is more a system
 freeze than a system crash (there's no panic, no nothing, just freezes, no
 disk activity, not network)

I would suspect either bad hardware, a flaky SCSI driver, or a possible 
kernel bug.  If your system is freezing hard, it is NOT postgresql's 
fault.  It simply doesn't have access to the kind of system resources 
needed to freeze a machine.

Is there a different SCSI driver / card you can try in there?  We've (and 
many others have too) had good luck with the LSI/MegaRAID cards and both 
the older 1.18 series and newer 2.x series drivers.  No freezes, no crashes, 
no hangs on the boxes with those in them.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread scott.marlowe
On Wed, 21 Jan 2004, Tom Lane wrote:

 The GEQO planner module contains six different recombination algorithms,
 only one of which is actually used --- the others are ifdef'd out, and
 have been ever since we got the code.  Does anyone see a reason not to
 prune the deadwood?

Considering the recent discussion about REALLY slow query planning by the 
GEQO module, it might be worth testing each one to see which works best 
before lopping them off.

But I don't do anything that needs GEQO, so for me, it doesn't really 
matter.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Bunch o' dead code in GEQO

2004-01-22 Thread scott.marlowe
On Thu, 22 Jan 2004, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  On Wed, 21 Jan 2004, Tom Lane wrote:
  The GEQO planner module contains six different recombination algorithms,
 
  considering the recent discussion about REALLY slow query planning by the 
  GEQO module, it might be worth testing each one to see which works best 
  before lopping them off.
 
 I'm assuming that the original author of the GEQO code already did that
 testing ...

Hmmm.  I was figuring he wasn't sure, so he left them in for other people 
to test.  Is this a part of the code that eats up much time, or something 
simple and fast that isn't part of the "GEQO takes 8 seconds to plan" 
problem?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] cache control?

2004-01-16 Thread scott.marlowe
On Fri, 16 Jan 2004, Michael Brusser wrote:

 Is there a way to force database to load
 a frequently-accessed table into cache and keep it there?

Nope.  But there is a new cache buffer handler that may make it into 7.5 
that would make that happen automagically.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] What's planned for 7.5?

2004-01-12 Thread scott.marlowe
On Mon, 12 Jan 2004, Martin Marques wrote:

 Mensaje citado por Marc G. Fournier [EMAIL PROTECTED]:
 
  
  Native Win32 is planned for it (whether it makes it or not is another
  question, but it is the goal) ...
 
 Replication wasn't another BIG one?

Actually, I think it was PITR (Point in Time Recovery).  Of course, since 
that works by replaying log files into the database, it's a pretty good 
way of setting up replication, which would likely follow a release or so 
after PITR was implemented.

OTOH, there are some replication solutions already available, just not as 
part of the core.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Project status pages

2003-12-22 Thread scott.marlowe
On Fri, 19 Dec 2003, Bruce Momjian wrote:

 Robert Treat wrote:
  Wasn't there a patch posted many months ago for PITR. IIRC it wasn't
  complete, but would be a good starting point for those interested in
  helping out. If it's in the archives it would be nice to add a link to
  it on the project page... which brings up the question on whats the
  process for updating these pages?  Perhaps they should be moved into the
  wiki framework up on techdocs?
 
 They are just web pages.  If someone want to set up wiki and pull the
 content, I can remove my versions.

http://techdocs.postgresql.org/guides is an open wiki instance; you could 
just put it there.  I've edited a couple of pages there.  Nothing fancy, 
but it is pretty easy to edit.  You don't (or at least didn't until 
recently, if that's changed) have to register to use it.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Resurrecting pg_upgrade

2003-12-16 Thread scott.marlowe
On Tue, 16 Dec 2003, Jon Jensen wrote:

 On Tue, 16 Dec 2003, Jan Wieck wrote:
 
  If you want to prevent accidential access, start postmaster on a 
  non-standard port.
 
 That seems like an unfriendly thing to do. You'd have to check to see what 
 port is standard for this particular installation, and pick something 
 else. You may choose an unused port, but perhaps it needs to be used in a 
 few minutes by some other process, but then will be occupied. The 
 administrator may also not be happy to have an open port facing the world, 
 or even just other possibly untrusted users on the same machine, assuming 
 you bind to localhost.

But aren't ports above a certain number fair game?

Yep, just answered my own question, quoting from /etc/services:

# The latest IANA port assignments can be gotten from
#   http://www.iana.org/assignments/port-numbers
# The Well Known Ports are those from 0 through 1023.
# The Registered Ports are those from 1024 through 49151
# The Dynamic and/or Private Ports are those from 49152 through 65535

So as long as we use 49152 and above we're cool.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Performance features the 4th

2003-11-10 Thread scott.marlowe
On Sun, 9 Nov 2003, Jan Wieck wrote:

 scott.marlowe wrote:
 
  On Fri, 7 Nov 2003, Matthew T. O'Connor wrote:
  
  - Original Message - 
  From: Jan Wieck [EMAIL PROTECTED]
   Tom Lane wrote:
Gaetano and a couple of other people did experiments that seemed to show
it was useful.  I think we'd want to change the shape of the knob per
later suggestions (sleep 10 ms every N blocks, instead of N ms every
block) but it did seem that there was useful bang for little buck there.
  
   I thought it was sleep N ms every M blocks.
  
   Have we seen any numbers? Anything at all? Something that gives us a
   clue by what factor one has to multiply the total time a VACUUM
   ANALYZE takes, to get what effect in return?
  
  I have some time on sunday to do some testing.  Is there a patch that I can
  apply that implements either of the two options? (sleep 10ms every M blocks
  or sleep N ms every M blocks).
  
  I know Tom posted the original patch that slept N ms every 1 block (where N
  is >= 10 due to OS limitations).  Jan can you post a patch that has just the
  sleep code in it? Or should it be easy enough for me to cull out of the
  larger patch you posted?
  
  The reason for the change is that the minimum sleep period on many systems 
  is 10 ms, which meant that vacuum was running 20X slower than normal.  
  While it might be necessary in certain very I/O starved situations to make 
  it this slow, it would probably be better to be able to get a vacuum that 
  ran at about 1/2 to 1/5 speed for most folks.  So, since the delta can't 
  be less than 10 ms on most systems, it's better to just leave it at a fixed 
  amount and change the number of pages vacuumed per sleep.
 
 I disagree with that. If you limit yourself to the number of pages being 
 the only knob you have and set the napping time fixed, you can only 
 lower the number of sequentially read pages to slow it down. Making read 
 ahead absurd in an IO starved situation ...
 
 I'll post a patch doing
 
  every N pages nap for M milliseconds
 
 using two GUC variables and based on a select(2) call later.

I didn't mean fixed in the code, I meant fixed in your setup.  I.e., find a 
delay (10 ms, 50, 100, etc.), then vary the number of pages processed at a 
time until you start to notice the load, then back it off.

The point is not being forced by the code to have one and only one delay 
value, but setting it yourself.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Performance features the 4th

2003-11-07 Thread scott.marlowe
On Fri, 7 Nov 2003, Matthew T. O'Connor wrote:

 - Original Message - 
 From: Jan Wieck [EMAIL PROTECTED]
  Tom Lane wrote:
   Gaetano and a couple of other people did experiments that seemed to show
   it was useful.  I think we'd want to change the shape of the knob per
   later suggestions (sleep 10 ms every N blocks, instead of N ms every
   block) but it did seem that there was useful bang for little buck there.
 
  I thought it was sleep N ms every M blocks.
 
  Have we seen any numbers? Anything at all? Something that gives us a
  clue by what factor one has to multiply the total time a VACUUM
  ANALYZE takes, to get what effect in return?
 
 I have some time on sunday to do some testing.  Is there a patch that I can
 apply that implements either of the two options? (sleep 10ms every M blocks
 or sleep N ms every M blocks).
 
 I know Tom posted the original patch that slept N ms every 1 block (where N
 is >= 10 due to OS limitations).  Jan can you post a patch that has just the
 sleep code in it? Or should it be easy enough for me to cull out of the
 larger patch you posted?

The reason for the change is that the minimum sleep period on many systems 
is 10 ms, which meant that vacuum was running 20X slower than normal.  
While it might be necessary in certain very I/O starved situations to make 
it this slow, it would probably be better to be able to get a vacuum that 
ran at about 1/2 to 1/5 speed for most folks.  So, since the delta can't 
be less than 10 ms on most systems, it's better to just leave it at a fixed 
amount and change the number of pages vacuumed per sleep.

I'm certainly gonna test the patch out too.  We aren't really I/O bound, 
but it would be nice to have a database that only slowed down ~1% or so 
during vacuuming.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Hacking PostgreSQL to work in Mac OS X 10.3 (Panther

2003-11-05 Thread scott.marlowe
Is this a bug we should fix for 7.3.5 when it eventually comes out?

On Tue, 4 Nov 2003, Andrew Rawnsley wrote:

 
 Just build RC1 today on Panther, no problems.
 
 
 On Nov 4, 2003, at 5:06 PM, Jeff Hoffmann wrote:
 
  Tom Lane wrote:
  [EMAIL PROTECTED] writes:
  After spending a few hours of trying to get Postgresql7.3.4 to build
  from source (tar.gz) on a Panther (release, not beta) system,
  Try 7.4RC1 instead.  Apple made some incompatible changes in their
  compiler in Panther.
 
  I was going to recommend the same thing.  I compiled a 7.4 beta out of 
  the box without a hitch, so I'd assume the RC would be fine as well.
 
  -- 
 
  Jeff Hoffmann
  PropertyKey.com
 
 
  ---(end of 
  broadcast)---
  TIP 8: explain analyze is your friend
 
 
 
 Andrew Rawnsley
 President
 The Ravensfield Digital Resource Group, Ltd.
 (740) 587-0114
 www.ravensfield.com
 
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faqs/FAQ.html
 
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 7.4RC1 tag'd, branched and bundled ...

2003-11-05 Thread scott.marlowe
On Tue, 4 Nov 2003, Gaetano Mendola wrote:

 I agree in general with you for these general arguments, but here we 
 are talking about introducing a sleep ( removable by guc ) or not! What 
 about the hash refactoring introduced with 7.4? Are you going to
 discourage people from using the hash?

That's not fair.  Everyone's had the chance to test 7.4 betas 1 through 4 
on their own hardware to see if it is safe and stable.  If I'd load 
tested the betas and found them reliable, and then suddenly found that 7.4 
RC1 or the release was less reliable than the betas (I'm not talking about 
THIS patch in particular, I'm talking about ANY patch...), I would be 
upset that the change happened at the end instead of the beginning of the 
release cycle.

I think this patch is a great idea, but it's probably got a lot of 
refinement coming down the line.  I'd be glad to test it on the 7.4 branch 
as a patch, but it's just too late to put performance enhancements into 
the main branch.  Now is bug fixing time, not performance tweaking time.  
That was pre-beta1 really.

To review: The hashagg has had testing, and lots of it, by the people who 
need it.  This patch has had little testing.  While I'm sure it's a fairly 
safe change, and likely to be a good thing overall, it's just too late in 
the cycle for it to get in.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread scott.marlowe
On Wed, 5 Nov 2003, Rod Taylor wrote:

 Since this is a large query, attachments for the explains / query.
 
 Configuration:
 dev_iqdb=# select version();
 version
 
  PostgreSQL 7.4beta1 on i386-portbld-freebsd4.7, compiled by GCC
 2.95.4
 (1 row)
 
 SET default_statistics_target = 1000;
 ANALYZE;
 set from_collapse_limit = 100;
 set join_collapse_limit = 20;

I'm not sure if that will actually change the default_statistics_target of 
the tables you're analyzing; I think it will only apply to newly created 
tables.

I believe you have to "alter table ... alter column ... set statistics 1000" for 
each column you want a statistics target of 1000 on.  You might wanna try 
starting with 50 or 100 and see if that works first.
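
For instance, something along these lines (a sketch using columns from the 
query above; the actual targets to use are guesses):

alter table acs_object_types alter column object_type set statistics 100;
alter table acs_object_type_attributes alter column ancestor_type set statistics 100;
analyze acs_object_types;
analyze acs_object_type_attributes;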


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread scott.marlowe
On Wed, 5 Nov 2003, Tom Lane wrote:

 Rod Taylor [EMAIL PROTECTED] writes:
  I'm not sure if that will actually change the default_statistics_target
 
  Hmm.. I was under the impression that it would work for any tables that
  haven't otherwise been overridden.
 
 It will.  I think Scott is recalling the original circa-7.2
 implementation, where it wouldn't.  If you're unsure that you
 affected it, check out the actual sizes of the array values in pg_stats.

Hey, can anyone guess what version I'm running in production :-)  Thanks 
for the catch.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-04 Thread scott.marlowe
On Tue, 4 Nov 2003, Tom Lane wrote:

 Jan Wieck [EMAIL PROTECTED] writes:
  What still needs to be addressed is the IO storm cause by checkpoints. I 
  see it much relaxed when stretching out the BufferSync() over most of 
  the time until the next one should occur. But the kernel sync at it's 
  end still pushes the system hard against the wall.
 
 I have never been happy with the fact that we use sync(2) at all.  Quite
 aside from the I/O storm issue, sync() is really an unsafe way to do a
 checkpoint, because there is no way to be certain when it is done.  And
 on top of that, it does too much, because it forces syncing of files
 unrelated to Postgres.
 
 I would like to see us go over to fsync, or some other technique that
 gives more certainty about when the write has occurred.  There might be
 some scope that way to allow stretching out the I/O, too.
 
 The main problem with this is knowing which files need to be fsync'd.

Wasn't this a problem that the win32 port had to solve by keeping a list 
of all files that need fsyncing since Windows doesn't do sync() in the 
classical sense?  If so, then could we use that code to keep track of the 
files that need fsyncing?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] 7.4RC1 planned for Monday

2003-10-31 Thread scott.marlowe
On Thu, 30 Oct 2003, Joshua D. Drake wrote:

 If I understood correctly, Josh was complaining about VACUUM sucking too
 
 much of his disk bandwidth.  autovacuum wouldn't help that --- in fact
 would likely make it worse, since a cron-driven vacuum script can at
 least be scheduled for low-load times of day.  autovacuum is likely to
 kick in at the least convenient times.
 
   
 
 Exactly!

Wait a minute, I thought the problem was that vacuums were happening too 
far apart, therefore taking too long, and may have been full vacuums, no?

If the autovacuum daemon causes a lazy vacuum to run on only the busiest 
(i.e. most updated) tables, then it is likely to take only a few minutes 
to run, instead of hours, plus you can try to keep things in a steady 
state, i.e. no more than x% or so dead tuples in a table at any given 
time, with them all getting reused via the FSM / lazy vacuum.

So, have you TESTED the autovacuum daemon with your load, and set it to 
run every 5 minutes?  Keep in mind, it won't actually vacuum every table 
every 5 minutes; it'll just check the stats to see which ones have updated 
a fair bit and vacuum those, and they're lazy vacuums.  I've found it to 
be a win.  If you've dismissed it out of hand without testing it, then you 
should really give it a try to see if it can be configured to provide good 
performance and behavior.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Deadlock problem

2003-10-30 Thread scott.marlowe
On 30 Oct 2003, Vatsal Avasthi wrote:

 
 Hi,
   I am facing a strange problem and thats bugging me for a long time,
 I am using postgres version 7.2.1.

Is it possible for you to upgrade to 7.2.4 just to make sure it's not a 
problem that was fixed from 7.2.1 to 7.2.4?


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] pg_user

2003-10-30 Thread scott.marlowe
On Thu, 30 Oct 2003, ivan wrote:

 
 you can also patch your kernel so that when you run cat /etc/passwd the
 system gives you only your own line, without any other users, which is
 exactly what you need.
 in pgsql i think that users don't need to know about other users, or
 about their
 databases. i call it security :)

Technically, that's just obscurity.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] 7.4RC1 planned for Monday

2003-10-30 Thread scott.marlowe
On Thu, 30 Oct 2003, Joshua D. Drake wrote:

 Hello,
 
I know I will probably be flamed into oblivion for this but I would 
 like to make a suggestion about
 the upcoming release.
 
What if we delayed until the end of the year?
 
The two reasons that I can come up with are:
 
1. The irritating (but work around capable) bigint index issue.
2. More importantly the recent potential discovery by Jan on vacuum.
 
   I have several high end users that are really beating their heads 
 against the wall with even lazy vacuum
 because of how brutal it can be on the system. If we could make vacuum a 
 little less harsh it could be
 a large boon.

Are these folks for whom the autovacuum daemon provides no relief?

I think it's too late in the release cycle to put everything on hold, 
especially with no known fix in sight (for bigint, at least).


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] O_DIRECT in freebsd

2003-10-29 Thread scott.marlowe
On 29 Oct 2003, Doug McNaught wrote:

 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 
  FreeBSD 4.9 was released today.  In the release notes was:
  
  2.2.6 File Systems
  
  A new DIRECTIO kernel option enables support for read operations that
  bypass the buffer cache and put data directly into a userland
  buffer. This feature requires that the O_DIRECT flag is set on the
  file descriptor and that both the offset and length for the read
  operation are multiples of the physical media sector size.
  
  Is that of any use?
 
 Linux and Solaris have had this for a while.  I'm pretty sure it's
 been discussed before--search the archives.  I think the consensus
 was that it might be useful for WAL writes, but would be a fair amount
 of work and would introduce portability issues...

I would think the biggest savings could come from using directIO for 
vacuuming, so it doesn't cause the kernel to flush buffers.

Would that be just as hard to implement?  


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [BUGS] [HACKERS] Autocomplete TAB on Postgres7.4beta5 not

2003-10-28 Thread scott.marlowe
On Tue, 28 Oct 2003, Tom Lane wrote:

 Rod Taylor [EMAIL PROTECTED] writes:
  I say leave it the way it is. If you want system table tab completion,
  simply:
  ALTER USER ... SET search_path = pg_catalog,...;
 
 Unfortunately, that *does not* affect the tab-completion behavior;
 it will still not offer the system catalogs as completions unless
 you explicitly prefix pg_catalog..

It seems a good compromise then would be that if pg_catalog is in your 
search path, then do the old-fashioned completion, i.e. p<tab> will list 
everything in both pg_catalog and the other schemas in your search path.

After all, the system catalog kind of hitchhikes along for a ride in your 
search path.  Most users aren't interested in the system catalogs, so 
having them suddenly show up when you just wanted the phonebook table is 
kind of a bother for them.

Is it possible to remove the implicit search path of pg_catalog from a 
psql session without it breaking lots of stuff?  If not, then it's kind of 
ugly to the user to have no way to keep the system from proffering 
pg_catalog tables when they have no interest in them.  If so, then why 
make it part of the default?
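
To make that concrete, a user who actually wants catalog completion could 
opt in explicitly with something like:

set search_path = pg_catalog, public;
show search_path;
-- with pg_catalog listed explicitly, tab completion would then offer
-- the system catalogs alongside the user's own tables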




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [BUGS] [HACKERS] Autocomplete TAB on Postgres7.4beta5 not

2003-10-28 Thread scott.marlowe
On Tue, 28 Oct 2003, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  Is it possible to remove the implicit search path of pg_catalog from a 
  psql session without it breaking lots of stuff?
 
 Do you consider +, count(), etc to be important stuff?

Me, I hardly ever use them  :-)  So I can assume that removing the implicit 
pg_catalog from the search path is a bad thing.

In that case, does my proposed solution (having to explicitly include 
pg_catalog in your search path to get it seen by tab completion as 
just another schema) make sense?


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] round() function wrong?

2003-10-24 Thread scott.marlowe
On Wed, 22 Oct 2003, Jochen Westland [invigo] wrote:

 Hi All,
 i'm running Postgresql 7.2.x, so i am not quite sure whether the bug i am reporting 
 is already fixed
 in newer versions or not.
 
 In my version 
 select round(2.5); returns 2;
 select round(2.501) returns 3;
 
 referring to my math professor that's wrong, at least in germany.
 select round(2.5); should return 3

I just tried that on my 7.2.4 and 7.4 beta 4 machines and I get 2 for 
round(2.5)


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] round() function wrong?

2003-10-24 Thread scott.marlowe
On Fri, 24 Oct 2003, Michael Brusser wrote:

 But this seems to work correctly on 7.3.2 and 7.3.4:
 psql -c select round (2.5)
 Password: 
  round 
 ---
  3
 (1 row)
 
 =
  
  I just tried that on my 7.2.4 and 7.4 beta 4 machines and I get 2 for 
  round(2.5)

Ackkk.  I accidentally tested it on my 7.3.4 box, not my 7.4 beta 4 box, 
but the output is the same.  The original, I believe, had quote marks in it.

Anyway, it seems the quotes matter in 7.3.x:


In pgsql 7.2:

select round(2.5::float);
 round
---
 2
(1 row)

select round(2.5::numeric);
 round
---
 3
(1 row)

select round(2.5);  -- would appear to be coerced to float here
 round
---
 2
(1 row)

select round('2.5');
ERROR:  Function 'round(unknown)' does not exist
Unable to identify a function that satisfies the given argument 
types
You may need to add explicit typecasts


pgsql 7.3.4:


select round(2.5::float);
 round
---
 2
(1 row)

select round(2.5::numeric);
 round
---
 3
(1 row)

select round(2.5);  -- would appear to be coerced to numeric
 round
---
 3
(1 row)

select round('2.5');
 round
---
 2
(1 row)

pgsql 7.4 beta5: behaves the same as 7.3.4

So it would appear that the automatic assumptions about what is 
float and what is numeric changed from 7.2 to 7.3, i.e. an undecorated 
literal like 2.5 is now assumed to be numeric.

But I'm just guessing here.




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Failed to create temporary file

2003-10-23 Thread scott.marlowe
Sounds like your drives are full.

On Thu, 23 Oct 2003, Yuval Lieberman wrote:

 Hi!
 
 I'm doing a select (from an OACS page or from psql) and I get:
 ERROR:  Failed to create temporary file pgsql_tmp/pgsql_tmp27212.775
 
 The same select work ok a different database (which is on a different
 machine)
 
 the select is :
 select a.attribute_id, a.pretty_name,
a.ancestor_type, t.pretty_name as ancestor_pretty_name
   from acs_object_type_attributes a,
(select t2.object_type, t2.pretty_name,
  tree_level(t2.tree_sortkey) - tree_level(t1.tree_sortkey) + 1 as
 type_level
   from acs_object_types t1, acs_object_types t2
   where t1.object_type = 'group'
 and t2.tree_sortkey between t1.tree_sortkey and
 tree_right(t1.tree_sortkey)) t
  where a.object_type = 'group'
and t.object_type = a.ancestor_type
 order by type_level
 
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faqs/FAQ.html
 
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] integer ceiling in LIMIT and OFFSET

2003-10-22 Thread scott.marlowe
On Wed, 22 Oct 2003, Tom Lane wrote:

 Rod Taylor [EMAIL PROTECTED] writes:
  That said, perhaps the TODO for changing LIMIT / OFFSET to be expression
  based should also mention bumping them to int8.
 
 Can't get excited about it ... this would slow down the normal use of
 the facility for what seems a completely hypothetical need.

While I'm pretty sure 

select * from sometable limit 2147483648

isn't gonna be common, maybe someone would be likely to do something like:

select * from sometable limit 10 offset 2147483648

I wouldn't do it, but who knows what shadows lurk in men's minds?


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] So, are we going to bump catversion for beta5, or not?

2003-10-22 Thread scott.marlowe
On Wed, 22 Oct 2003, Tom Lane wrote:

 Richard Huxton [EMAIL PROTECTED] writes:
  On Wednesday 22 October 2003 06:55, Peter Eisentraut wrote:
  The idea is that you give each function its own schema search path at
  creation time, and that path applies to that function for the rest of its
  life.  Then that function would be immune to schema path changes later on.
 
  But surely that would mean I couldn't do ...
 
 Certainly you can invent scenarios where letting the search path vary
 from call to call is useful, but the question is which behavior is
 *more* useful.  I think it's becoming clear that having a predictable
 search path is usually what a function author will want.
 
 It would probably be a good idea to allow the function's search path to
 be explicitly specified as a clause of CREATE FUNCTION (otherwise it
 will be a headache for pg_dump).  So we could allow both viewpoints,
 if there is a way to explicitly say don't force any search path.
 Perhaps specifying an empty path could mean that.  But I think the
 default should be to adopt the current search path (at the time of
 CREATE FUNCTION) as the function's permanent path.

It might be nice to have an ALTER FUNCTION capability that could change 
the search path later on, should one add schemas etc. after the fact.
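
Purely hypothetical syntax for such a command (nothing like this existed 
at the time; the function name and path are made up):

-- hypothetical sketch, not an implemented command:
alter function my_func(integer)
    set search_path = app_schema, pg_catalog;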


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] pg_ctl reload - is it safe?

2003-10-14 Thread scott.marlowe
On 14 Oct 2003, Greg Stark wrote:

 
 Michael Brusser [EMAIL PROTECTED] writes:
 
   Michael Brusser [EMAIL PROTECTED] writes:
2003-10-10 22:37:05 ERROR:  cannot read block 0 of s_noteimportlinks:
Interrupted system call
   
   Hmm.  I found this hard to believe at first, but indeed my local man
   pages for read() and write() say they can return EINTR if interrupted
   by a signal.  This may depend on the filesystem in use (are you using
   NFS?)
 
 The traditional unix semantics are that read/write may return EINTR if
 interrupted -- but that would only EVER happen for network connections.
 The traditional semantics are that it would NEVER happen on disk i/o. BSD
 kernels at least, and probably all unix kernels, do an uninterruptible sleep
 on disk accesses, hence the dreaded D in ps listings.
 
  Yes, we use NFS. Many of our customers use it as well.
 
 Normally NFS guarantees the traditional unix semantics. 
 Unless you're using either soft or intr options.
 
 If you are, well, stop.
 
 If you use intr then this type of thing can happen. Lots of programs assume
 the unix semantics for disk accesses. You can get all kinds of bugs when
 they're violated.
 
 If you use soft then the consequences can be much much worse. If your
 fileserver were to reboot you could silently lose disk writes corrupting your
 database.

What if the WAL was local on disk, and the data was going to nfs storage, 
would that be safe, or saferer?  :-)


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] IDE Drives and fsync

2003-10-08 Thread scott.marlowe
OK, I've done some more testing on our IDE drive machine.

First, some background.  The hard drives we're using are Seagate 
drives, model number ST380023A.  Firmware version is 3.33.  The machine 
they are in is running RH9.  The setup string I'm feeding them on startup 
right now is:  hdparm -c3 -f -W1 /dev/hdx

where:

-c3 sets I/O to 32 bit w/sync (uh huh, sure...)
-f sets the drive to flush buffer cache on exit
-W1 turns on write caching

The drives come up using DMA.  Turning unmask IRQ on / off has no effect 
on the tests I've been performing.

Without the -f switch, data corruption due to sudden power down is almost 
certain.  Running 'pgbench -c 5 -t 100' and pulling the plug 
will result in recovery failing with the typical "invalid page type" 
messages.

The pgbench database was originally initialized with -s 1.

If I turn off write caching (-W0) then the data is coherent no matter how 
many concurrent clients I'm running, but performance is abysmal (drops from 
~200 tps down to 45, or to 10 if I'm using /dev/md0, a mirror set).  This 
is all on a single drive.

If I use -W1 and -f, then I get corruption on about every 4th test or so 
if the number of parallel beaters is 50 or so, and more often if I crank 
it up to 200 or increase the size of the database by using -s 10 during 
initialization.  Note that EITHER a larger test database OR a larger 
number of clients seems to increase the chance of corruption.

I'm guessing that with -W1 and -f, what's happening is that at higher 
levels of parallel access, or with a larger data set, the time between 
when the drive reports an fsync and when it actually writes the data out 
is climbing, and it is more likely that data in transit to the WAL is 
getting lost when the plug is pulled.

Tom, you had mentioned adding a delay of some kind to the fsync logic, and 
I'd be more than willing to try out any patch you'd like to toss out to me 
to see if we can get semi-stable behaviour out of IDE drives with the 
-W1 and -f switches turned on.  As it is, the performance is quite good, 
and under low to medium loads it seems to be capable of surviving the 
power plug being pulled, so I'm wondering if we can come up with a slight 
delay that might drop the performance some small percentage while 
greatly decreasing the chance of data corruption.

Is this worth looking into?  I can see plenty of uses for a machine that 
runs on IDE for cost savings, while still providing a reasonable amount of 
data security in case of power failure, but I'm not sure if we can get rid 
of the problem completely or not.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Question regarding coopting Database Engine

2003-10-03 Thread scott.marlowe
On Tue, 30 Sep 2003, Steve Yalovitser wrote:

 Hello,
 
 I'd like to know if its possible to coopt the postgres storage subsystem to
 rely entirely on ram based structures, rather than disk. Any documentation
 or ideas would be appreciated.

Sure, create a ram disk.  Set $PGDATA to it with proper permissions, 
initdb, and restore.

Not sure why you'd wanna do it, as any crash loses all data, and 
postgresql's strengths lie in its ability to survive the most brutal power 
off situations mid-transaction etc...


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Thoughts on maintaining 7.3

2003-10-01 Thread scott.marlowe
On Tue, 30 Sep 2003, Joshua D. Drake wrote:

 Hello,
 
   With the recent stint of pg_upgrade statements and the impending 
 release of 7.4 what
 do people think about having a dedicated maintenance team for 7.3? 7.3 
 is a pretty
 solid release and I think people will be hard pressed to upgrade to 7.4. 
 Of course
 a lot of people will, but I have customer that are just now upgrading to 
 7.3 because
 of legacy application and migratory issues.
 
 Anyway I was considering a similar situation to how Linux works, where 
 there is a
 maintainer for each release... Heck even Linux 2.0 still saw releases until 
 recently.
 
   Of course the theory being that we backport some features and fix 
 any bugs that
 we find?
 
What are people's thoughts on this?

It seems to me the upgrade from 7.2 to 7.4 is easier than an upgrade to 
7.3, since at least 7.4's pg_dumpall can connect to a 7.2 database and 
suck in everything, whereas in 7.3 I had to dump with 7.2's dumpall and 
then tweak the file by hand a fair bit to get it to go into 7.3.

With 7.4 I'm finding upgrading to be easier.  I'll likely upgrade our 
production servers to 7.4.0 when it comes out and wind up skipping 7.3 
altogether.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Wednesday beta postponed till Thursday

2003-10-01 Thread scott.marlowe
On Tue, 30 Sep 2003, Tom Lane wrote:

 It seems some junior electrician in Panama pulled the wrong circuit
 breaker ... and then the mail.postgresql.org server spent an
 unreasonable number of hours fsck'ing.  (Why is Marc a FreeBSD fan
 anyway?  Don't ask me, I work for Red Hat.)  Anyhow, due to the loss
 of project communications for today, it seems best to put off tomorrow's
 intended 7.4beta4 release for a day.  We'll plan Thursday instead.

Hey Jim!?  What does this big red switch do?
@#$% NO CARRIER


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Thoughts on maintaining 7.3

2003-10-01 Thread scott.marlowe
On Wed, 1 Oct 2003, Joshua D. Drake wrote:

 
 With 7.4 I'm finding upgrading to be easier.  I'll likely upgrade out 
 production servers to 7.4.0 when it comes out and wind up skipping 7.3 
 altogether.
   
 
 
 Sure, but I'm talking about people who are running 7.3 and are happy with 
 it. The reality is that for probably 95% of the people
 out there, there is no reason for 7.4. When you have an existing system 
 that works... why upgrade? That is one of the benefits
 of Open Source stuff, we no longer get forced into un-needed upgrade cycles.

Agreed, we've been on 7.2 for a while now because it just works.  
The regex substring introduced in 7.3 was a pretty cool feature, for 
instance, that makes life easy.
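
A quick example of the 7.3 form (the string is made up):

select substring('order #1234 shipped' from '[0-9]+');
-- POSIX regular expression match, returns '1234'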

 When you deal with the systems I do, the cost to a customer to migrate 
 to 7.4 would be a minimum of 10,000-20,000 dollars.
 They start to ask why we're upgrading with those numbers.

Then maybe they would be willing to donate some small amount each ($500 or 
so) to pay for backporting issues.  Since mostly what I'd want on an older 
version would be bug / security fixes, that $500 should go a long way 
towards backporting.

 That is not to say that 7.4 is not worth it in a technical sense, but 
 for my customers, "If it ain't broke, don't fix it" is a mantra and
 the reality is that 7.3 is not broken in their minds. There are 
 limitations: pg_dump/pg_restore has some issues, having to reindex the 
 database
 (which 7.4 doesn't fix), vacuum (which 7.4 doesn't fix), but my customers 
 accept them as such.

I was under the impression that 7.4 removed the need to reindex caused by 
monotonically increasing index keys, no?

 Your mileage may vary but I can only talk from my experience.

Yeah, I would rather have had more backporting to 7.2, because there were 
tons of little improvements from 7.2 to 7.3 I could have used while 
waiting for 7.4's improved pg_dumpall to come along.

Cheers:-)


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Lost mails

2003-10-01 Thread scott.marlowe
On Wed, 1 Oct 2003, Darko Prenosil wrote:

 Two mails with updated translations for /src/backend/po/hr.po were lost.
 The first time I sent a plain po file, the second time a tar.gz - no result.
 Is something blocking mails with attachments? I didn't receive notification
 that the mail was blocked or anything like that.
 Can I try to send it to some other address?

Just send it again; yesterday a junior electrical tech in Panama turned off 
the mail server for the lists for a little while.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] feature request: show pgsql version when running initdb

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Christopher Kings-Lynne wrote:

 Is there any chance we could have initdb show the version of postgresql
 it is running as when initdb is run?
  
  
  If you install many different versions in parallel, don't you give your
  installation paths some meaning that contain the version number?  In any
  case, you can run initdb --version first if you're not sure about what is
  where.
 
 Yes I do, but sometimes as different users you don't know what the path 
 is.  I guess I can just go --version.

Anytime I'm running multiple versions on the same box, I install them to 
paths like /usr/local/pgsql72, create a user like pgsql72, and then set 
it so the path for that user is /usr/local/pgsql72/bin.

That way, just looking at the command prompt I know which version / 
superuser I'm playing with.  Plus, no matter how badly I try to screw up, 
pgsql72 doesn't have permission to initdb pgsql73's $PGDATA directory.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Bruce Momjian wrote:

 Tom Lane wrote:
  Peter Eisentraut [EMAIL PROTECTED] writes:
   Tom Lane writes:
   so it appears that cygwin's echo generates a different newline style
   than what got put into sql_features.txt.  A possible way to fix this is
   to put the \. line into sql_features.txt, but maybe there's a cleaner
   answer.  Peter, any thoughts?
  
   There's no clean answer to this on Cygwin.  This specific case is just a
   little problem that we could solve locally, but in general you'll just end
   up annoying people if you require them to use consistent line endings on
   Cygwin.
  
  Yeah, I was wondering whether you wouldn't propose dropping the newline
  consistency check.  I'm not very comfortable with that, but maybe we
  should.  Bruce?
 
 I posted on that a few minutes ago.  Yea, we can drop it, but we risk
 eating carriage returns as data values.  I am not sure how consistently
 we output literal carriage returns in old dumps, nor how many apps
 produce literal carriage returns in COPY. If we conditionally eat
 them, we run the risk of discarding some of their data without warning. 
 Perhaps we can throw a warning rather than an error, and adjust initdb
 to be consistent.

I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
to a 7.4beta3 database is producing some errors like this:

ERROR:  literal newline found in data
HINT:  Use \n to represent newline.
CONTEXT:  COPY FROM, line 59

ERROR:  literal carriage return found in data
HINT:  Use \r to represent carriage return.
CONTEXT:  COPY FROM, line 41


These show up with little or no context, only the line number within the 
dump file.  Since I'm wrapping these up in pg_dumpall, I don't have the 
dump file, so I don't know where the error is really occurring.  It would be 
nice to have such occurrences echo the table / row they are getting the 
error on, or maybe just the first 20 or so characters, so they'd be easier 
to identify.

Is this related to this issue?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
  to a 7.4beta3 database is producing some errors like this:
 
  ERROR:  literal newline found in data
  HINT:  Use \n to represent newline.
  CONTEXT:  COPY FROM, line 59
 
  ERROR:  literal carriage return found in data
  HINT:  Use \r to represent carriage return.
  CONTEXT:  COPY FROM, line 41
 
 Really?  7.2 should dump data \r or \n as the backslash versions ...
 and does in my tests.  Can you make a reproducible test case?

The attached file produces this problem.  Note it's a blank trailing field 
that looks to be causing it.  The error for this .sql file is:

ERROR:  literal carriage return found in data
HINT:  Use \r to represent carriage return.
CONTEXT:  COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a 
different error, this one:

ERROR:  end-of-copy marker does not match previous newline style
CONTEXT:  COPY FROM, line 2

  It would be nice to have such occurances echo the table / row they are
  getting the error on, or maybe just the first 20 or so characters, so
  they'd be easier to identify.
 
 That's not a bad idea.  I think it would be fairly easy now for the
 CONTEXT line of the error message to include the input data line:
 
   CONTEXT:  COPY FROM, line 41: data here 
 
 at least up through the field where the error gets thrown, and with some
 limit on the length of the data that will get echoed.  If people like
 that idea I'll see about making it happen.

Table name too, like Bruce said.  The bothersome bit is that pg_dump 
reports the line number relative to just that part of the COPY command, so 
you don't even know which table is giving the error.
--
-- PostgreSQL database dump
--

\connect - marl8412

--
-- TOC entry 2 (OID 283043147)
-- Name: people2; Type: TABLE; Schema: ; Owner: marl8412
--

CREATE TABLE people2 (
id integer,
persons text
);


--
-- Data for TOC entry 3 (OID 283043147)
-- Name: people2; Type: TABLE DATA; Schema: ; Owner: marl8412
--

COPY people2 (id, persons) FROM stdin;
59  Chance Terry--S
60  

\.



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Bruce Momjian wrote:

 scott.marlowe wrote:
  The attached file produces this problem.  Note it's a blank trailing field 
  that looks to be causing it.  The error for this .sql file is:
  
  ERROR:  literal carriage return found in data
  HINT:  Use \r to represent carriage return.
  CONTEXT:  COPY FROM, line 2
  
  Note that loading this into pico and saving it back out fixes the problem.
  
  If I remove the preceding row that doesn't end in a blank field, I get a 
  different error, this one:
  
  ERROR:  end-of-copy marker does not match previous newline style
  CONTEXT:  COPY FROM, line 2
 
 OK, 'vi' shows it as:
   
   COPY people2 (id, persons) FROM stdin;
   59  Chance Terry--S
   60  ^M
   \.
 
 which is _exactly the case the error was supposed to catch.  Now, the
 big question is where did this dump come from?  Pg version?  OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
running on it.

The data likely came out of a (gasp, horrors) windows box.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Bruce Momjian wrote:

 scott.marlowe wrote:
   OK, 'vi' shows it as:
 
 COPY people2 (id, persons) FROM stdin;
 59  Chance Terry--S
 60  ^M
 \.
   
   which is _exactly the case the error was supposed to catch.  Now, the
   big question is where did this dump come from?  Pg version?  OS platform?
  
  The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
  on it.
  
  The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
  running on it.
  
  The data likely came out of a (gasp, horrors) windows box.
 
 OK, try this on your 7.2:
 
   test= create table test(x text);
   CREATE TABLE
   test= insert into test values ('\r');
   INSERT 17158 1
   test= copy test to '/tmp/out';
   COPY
 
 Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.

I'm not much of a vi guy, so the out file thing didn't tell me much, but 
if I try to load it into a new table on the 7.4 beta3 box, I get:

ERROR:  end-of-copy marker does not match previous newline style
CONTEXT:  COPY FROM, line 2



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Bruce Momjian wrote:

 scott.marlowe wrote:
   OK, 'vi' shows it as:
 
 COPY people2 (id, persons) FROM stdin;
 59  Chance Terry--S
 60  ^M
 \.
   
   which is _exactly the case the error was supposed to catch.  Now, the
   big question is where did this dump come from?  Pg version?  OS platform?
  
  The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
  on it.
  
  The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
  running on it.
  
  The data likely came out of a (gasp, horrors) windows box.
 
 OK, try this on your 7.2:
 
   test= create table test(x text);
   CREATE TABLE
   test= insert into test values ('\r');
   INSERT 17158 1
   test= copy test to '/tmp/out';
   COPY
 
 Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.

Figured out vi (-b switch, man pages rock) and it's a ^M



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

 Bruce Momjian writes:
 
  The argument that you want a warning because you might have mixed
  newlines in the file seems less likely than this case where they are
  using a literal carriage return as a data value at the end of the line.
 
 I don't agree with that assessment.  Who actually has CRs in their data?

I do.  And so do lots of other people.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

 scott.marlowe writes:
 
  but I get basically the same thing if I dump it to a .sql file and do:
 
  psql dbname < dbname.sql
 
 Use psql -f dbname.sql instead.

and the output is:

psql:webport.sql:803: ERROR:  function odbc_user already exists with 
same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user ayousuff.
psql:webport.sql:869: ERROR:  literal newline found in data
HINT:  Use \n to represent newline.
CONTEXT:  COPY FROM, line 59
You are now connected as new user smarlowe.
You are now connected as new user ayousuff.
CREATE INDEX
CREATE INDEX
CREATE INDEX


the same.  It doesn't tell me which table in my dump caused the problem, 
and it certainly isn't line 59 of the dump file; the line number is 
relative to the COPY data for the table producing the error.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Tom Lane wrote:

 Peter Eisentraut [EMAIL PROTECTED] writes:
  scott.marlowe writes:
  but I get basically the same thing if I dump it to a .sql file and do:
  psql dbname dbname.sql
 
  Use psql -f dbname.sql instead.
 
 This doesn't seem like a good argument not to add more information to
 the CONTEXT line for COPY errors.  Sure, in theory the existing info
 should be sufficient, but what if the information is not coming in
 through psql?  (For instance, maybe the COPY data is being generated
 on-the-fly by some other program.)  Or what if the dump file is so large
 you can't easily edit it to determine which line number is in question?
 There are plenty of scenarios where it's not all that convenient to
 triangulate on a problem from outside information.  Minimalism isn't
 really a virtue in error reports anyway.
 
 I'm thinking maybe:
 
   CONTEXT:  COPY tablename, line 41: data ...
 
 would serve the purpose nicely.

Yeah, just having the table name and line number would be plenty for me.  
It's the lack of a table name that makes it so frustrating.  I had to 
basically dump / restore the tables one at a time to figure out which one 
was causing the error.  On a database with hundreds of tables, that could 
be painful.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] initdb failure (was Re: [GENERAL] sequence's plpgsql)

2003-09-26 Thread scott.marlowe
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

 scott.marlowe writes:
 
  table name too, like Bruce said.  The bothersome bit is that in pg_dump,
  it says the line, relative to just this part of the copy command, so you
  don't even know which table is giving the error.
 
 I don't see the problem.  Can't you identify the failing command by the
 line number that psql gives you?

OK, here's the output from pg_dump while it's running

ERROR:  function odbc_user already exists with same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user ayousuff.
ERROR:  literal newline found in data
HINT:  Use \n to represent newline.
CONTEXT:  COPY FROM, line 59
You are now connected as new user smarlowe.
You are now connected as new user ayousuff.
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
You are now connected as new user smarlowe.
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX

So how am I supposed to figure out which table and which row broke?  Keep 
in mind, I'm doing this:

pg_dump -h otherserver dbname|psql -h desthost dbname

but I get basically the same thing if I dump it to a .sql file and do:

psql dbname < dbname.sql




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] PostgreSQL not ACID compliant?

2003-09-19 Thread scott.marlowe
On Fri, 19 Sep 2003, Joshua D. Drake wrote:

 Hello,
 
   I just read a rather disturbing post
 
   PostgreSQL does not support read uncommitted and repeatable read 
 isolation levels? If that is so... then PostgreSQL is NOT ACID compliant?
 
   What is the real deal on this?

Postgresql supports Serializable transactions, which are 100% ACID 
compliant. 

I'm pretty sure read committed mode is also ACID compliant since anything 
that would cause a failure of ACID compliance would result in a rollback, 
but I'm not sure.  
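
For the record, asking for the strict mode is straightforward (and, as I 
understand it, requesting read uncommitted or repeatable read just quietly 
gets you the next stricter level that is implemented):

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- queries in here all see one consistent snapshot
    SELECT count(*) FROM pg_class;
    COMMIT;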


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Maximum table size

2003-09-09 Thread scott.marlowe
On Tue, 9 Sep 2003, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  Is our maximum table size limited by the maximum block number?
 
 Certainly.
 
  Is the 16TB number a hold-over from when we weren't sure block number
  was unsigned, though now we are pretty sure it is handled as unsigned
  consistently?
 
 It's a holdover.  As to how certain we are that all the
 signed-vs-unsigned bugs are fixed, who have you heard from running a
 greater-than-16Tb table?  And how often have they done CLUSTER, REINDEX,
 or even VACUUM FULL on it?  AFAIK we have zero field experience to
 justify promising that it works.
 
 We can surely fix any such bugs that get reported, but we haven't got
 any infrastructure that would find or prevent 'em.

any chance OSDL could test it?
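
For reference, the arithmetic behind the two numbers, with the default 8k 
block size:

    2^32 blocks * 8 KB/block = 32 TB   (block number treated as unsigned)
    2^31 blocks * 8 KB/block = 16 TB   (if the sign bit can't be trusted)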


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] pgsql vc++|win32

2003-09-08 Thread scott.marlowe
On Mon, 8 Sep 2003, Bruce Momjian wrote:

 Dann Corbit wrote:
  Mingw uses the native Win32 libraries.
  
  Porting from a Mingw port to VC++ will be trivial compared to what we
  have now.
   
   where can I access the latest dev source code and dev docs in/from CVS?
  
  Maybe you want the Win32 page.  There are some links to it in recent
  messages here.
 
 URL is:
 
   http://momjian.postgresql.org/main/writings/pgsql/win32.html

Just a reminder to everyone looking at porting: check the licensing of the 
MS software you might be linking in.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Seqscan in MAX(index_column)

2003-09-05 Thread scott.marlowe
Would it be possible to catch an unconstrained max(id)/min(id) and rewrite 
it as select id from table order by id [desc] limit 1 on the fly in the 
parser somewhere?

That would require fairly little code, and be transparent to the user.  
I.e. low hanging fruit.

On 5 Sep 2003, Greg Stark wrote:

 Note that that only handles min()/max() for the whole table. It doesn't handle
 the GROUP BY case; for that you need DISTINCT ON with an ORDER BY clause.
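
Spelled out, the two hand rewrites would be something like this (table and 
column names invented; the grouped form wants an index on (grp, id)):

    -- whole-table case: index-friendly form of SELECT max(id) FROM t
    SELECT id FROM t ORDER BY id DESC LIMIT 1;

    -- per-group case: max(id) for each grp, per Greg's DISTINCT ON note
    SELECT DISTINCT ON (grp) grp, id
    FROM t
    ORDER BY grp, id DESC;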


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Nasty problem in hash indexes

2003-08-29 Thread scott.marlowe
On Fri, 29 Aug 2003, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  On Thu, 28 Aug 2003, Neil Conway wrote:
  On Thu, Aug 28, 2003 at 05:37:39PM -0400, Tom Lane wrote:
  shrug Who's to say?  We've found bugs in the btree logic recently,
  too.
  
  I'd rather print a loud warning when a hash index is created, but keep
  the code in the tree, than just remove it entirely.
 
  Postgresql's philosophy has always seemed to be correctness first, 
  convenience and performance second.
 
 I agree --- we either fix this bug or remove hash indexes.  There's no
 third choice.  However, I don't agree with killing hash indexes just
 because there *might* be more bugs in them.  If we have an impractical-
 to-fix bug in front of us, then it's time for harsh measures, but
 otherwise ...

Sorry if I gave the impression earlier that we should get rid of hash 
indexes because there might be more bugs.  I didn't really mean it that 
way.  I just meant that if this one was going to be a hard fix, then that 
might be one of the mitigating factors for how much work someone's going 
to be willing to put into this.  

If it's an easy fix then it's likely worth the effort to keep the hash 
indexes around.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Nasty problem in hash indexes

2003-08-28 Thread scott.marlowe
On Thu, 28 Aug 2003, Tom Lane wrote:

 I've traced through the failure reported here by Markus Kräutner:
 http://archives.postgresql.org/pgsql-hackers/2003-08/msg01132.php
 
 What is happening is that as the UPDATE adds tuples (all with the same
 hash key value) to the table, the hash bucket being filled eventually
 requires more pages, and this results in a _hash_splitpage() operation
 (which is misnamed, it should be _hash_splitbucket).  By chance, the
 bucket that is selected to be split is the one containing the older key
 values, all of which get relocated to the new bucket.  So when control
 returns to the indexscan that is sourcing the tuples for the UPDATE,
 there are no tuples remaining in the bucket it is looking at, and it
 exits thinking it's done.
 
 I'm not sure how many variants on this problem there might be, but
 clearly the fundamental bug is that a hash bucket split takes no account
 of preserving the state of concurrent index scans.
 
 This is likely to be messy to fix :-(.  A brute-force solution may be
 possible by generalizing hash_adjscans so that it can update indexscans
 of our own backend for bucket-split operations; we'd have to rely on
 page locking to prevent problems against scans of other backends.  The
 locking aspect is particularly unattractive because of the possibility
 of deadlocks.  If a bucket split fails because of deadlock, we're
 probably left with a corrupt hash index.
 
 Does anyone see a better way?
 
 Does anyone want to vote to jettison the hash index code entirely?
 Personally I'm not eager to put a lot of work into fixing it.

I've had naught but bad experiences with hash indexes myself.  Maybe toss 
it and see if someone wants to reimplement it some day in the future?

If I'm reading this right, this bug means you could do:

select * from table where field in (1,2,3,4)

where you should get, say, 100 rows but might not get them all?  If so, 
then how many other bugs are lurking in the hash index code waiting to 
bite?
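
A rough sketch of the failure mode, assuming a table loaded heavily enough 
on one key that the UPDATE forces a bucket split mid-scan (names and row 
counts invented; generate_series is 7.4-era):

    CREATE TABLE t (k int, v int DEFAULT 0);
    CREATE INDEX t_k_hash ON t USING hash (k);
    INSERT INTO t SELECT 1, 0 FROM generate_series(1, 50000);
    -- the UPDATE's own index scan sources the tuples; if the bucket
    -- splits underneath it, the scan can exit early and miss rows
    UPDATE t SET v = 1 WHERE k = 1;
    SELECT count(*) FROM t WHERE v = 0;  -- nonzero = rows the UPDATE missed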


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Bumping block size to 16K on FreeBSD...

2003-08-28 Thread scott.marlowe
On Thu, 28 Aug 2003, Marc G. Fournier wrote:

 
 
 On Thu, 28 Aug 2003, Thomas Swan wrote:
 
  Has anyone looked at changing the default block size across the board
  and what the performance improvements/penalties might be?  Hardware has
  changed quite a bit over the years.
 
 I *think* that the reason for the performance improvement on FreeBSD is
 that our FS block size is 16k, instead of 8k ... are there any other
 OSs that have increased theirs?

Linux is still, as far as I know, limited to the page size of the CPU 
it's on, which for most x86 is 4k.

Windows 2k can go up to 64k block sizes.
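
For anyone wanting to try it, the recipe is roughly the following -- only a 
sketch, since the header that houses BLCKSZ has moved around between 
releases, and a cluster initdb'd with one block size can't be reused with 
another:

    # set  #define BLCKSZ 16384  in the tree's config header
    make clean && make && make install
    initdb -D /usr/local/pgsql/data   # then reload from a dump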


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Oversight?

2003-08-14 Thread scott.marlowe
On Tue, 12 Aug 2003, Christopher Kings-Lynne wrote:

   rbt=# ALTER USER rbt SET CONSTRAINTS ALL DEFERRED;
   ERROR:  syntax error at or near ALL at character 32
   rbt=# ALTER USER rbt SET CONSTRAINTS = DEFERRED;
   ERROR:  constraints is not a recognized option
 
  SET CONSTRAINTS ALL DEFERRED is a SQL-spec-mandated command syntax.
  Any similarity to Postgres' SET var = value syntax ends with the
  initial keyword.
 
 I assume his point is how do we set all of a user's constraints deferred by
 default?

But constraints aren't really per user, are they?  I can see wanting to 
set all of a table's constraints / dependent table constraints to deferred, 
but a user's constraints don't make much sense to me.
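
For comparison, the spec'd per-transaction form, which only affects 
constraints declared DEFERRABLE (table names invented):

    BEGIN;
    SET CONSTRAINTS ALL DEFERRED;
    -- deferrable FK checks now wait until commit
    UPDATE parent SET id = id + 1000;
    UPDATE child  SET parent_id = parent_id + 1000;
    COMMIT;   -- the checks fire here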


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] logging stuff

2003-08-06 Thread scott.marlowe
On Tue, 5 Aug 2003, Andrew Dunstan wrote:

 
 (Responding to the deafening silence regarding my posts a couple of days 
 ago about logging dbnames and disconnections) ;-)
 
 The dbname patch is now done. If nobody objects to the format 
 ([db:yourdbname]) I'll submit it - I did it that way to make it fairly 
 easy to split a log file based on it, although you would have to be 
 careful with multiline log entries such as query strings. It is 
 intentionally minimalist.
 
 I had some thoughts about logging disconnections - I can see a way to do 
 it via an on_proc_exit handler, I think. Then I started wondering if it 
 might be useful to log session times instead of just noting a disconnect 
 and letting the user have to calculate the time.
 
 But I won't bother with this if there's no interest. *I* have no current 
 use for it, but I could well imagine others might. (I might too in the 
 future if I wanted to debug my connection pooling app).

Actually, I'd certainly like to see it done (both dbname and disconnect).

I'd guess the deafening silence was more because of no objections than lack 
of interest.  I know for me it was.
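
With a [db:yourdbname] prefix in place, splitting should be about as simple 
as (modulo the multiline-entry caveat you mentioned):

    grep '^\[db:sales\]' postmaster.log > sales.log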


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] logging stuff

2003-08-05 Thread scott.marlowe
If we're looking at this, we might want to look at how apache does it with 
its CustomLog feature.  That lets you define named log formats once, then 
assign them per virtual server.

I could see that being nice so you could create a couple of different 
custom log types, then use one for one database, and another for a 
different database.  

For people running large numbers of databases on a single cluster, it 
might be quite useful to be able to log each database differently.
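
For anyone who hasn't used it, the apache idiom I mean is roughly (address 
and file names invented):

    # httpd.conf: name a format once, then reuse it per virtual host
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    <VirtualHost 10.0.0.1>
        CustomLog logs/site1_access_log common
    </VirtualHost>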

On Tue, 5 Aug 2003, Bruce Momjian wrote:

 
 Another interesting idea, if we do a single log variable with parameters
 separated by commas, is to allow some syntax where you could specify the
 delimiter between fields, so it could be:
 
   log_line:   |, dbname, username
 
 or
 
   log_line:, dbname, username
 
 ---
 
 Josh Berkus wrote:
  Tom,
  
    I'm prepared to be guided by consensus, though.
  
   I'm not dead set on it either, just wanted to raise a flag.  Who else
   has an opinion?
  
  From my perspective, we could really use a delimiter between the fields of 
  log output that is unlikely to appear within those fields, instead of parsing 
  by character count or making dbname a special case.
  
  Or do we already do this and I'm not parsing my log right?
  
  -- 
  Josh Berkus
  Aglio Database Solutions
  San Francisco
  
  ---(end of broadcast)---
  TIP 8: explain analyze is your friend
  
 
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] truncate all?

2003-08-04 Thread scott.marlowe
I agree, a plain truncate blasting a whole database is a very bad thing.

However, truncate with cascade would be quite useful.
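
Per Robert's syntax suggestion quoted below, I'd picture something like 
(purely hypothetical, nothing implemented):

    TRUNCATE TABLE orders CASCADE;   -- would also empty the tables whose
                                     -- foreign keys reference orders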

On Mon, 4 Aug 2003, Bruce Momjian wrote:

 
 Is this a TODO?  Keep in mind that if we follow the syntax of VACUUM and
 (7.4) CLUSTER, the all-database truncate would just be TRUNCATE.
 That seems very risky to me.  I wonder if the risk is worth adding this
 feature.
 
 ---
 
 Robert Treat wrote:
  On Mon, 2003-08-04 at 05:40, Shridhar Daithankar wrote:
   On 4 Aug 2003 at 11:25, Andreas wrote:
Would it be possible to implement a truncate all that purges all tuples 
from *all* tables, without taking into account any rules or triggers, but 
leaving all table structures and rules, triggers, functions, etc intact 
(sequences do not need to be reinitialized)?

As far as I understand, the 'no truncate if table is referenced' change was 
introduced to ensure database integrity. However, if the referencing table 
is truncated, too, there should be no problem as far as foreign keys are 
concerned, correct?

The rationale behind this suggestion is that in our project we need a 
*quick* way to get rid of all the tuples in all tables in order to 
accelerate the reinitialization of the database for our unit tests. This 
needs to be done fairly often, and so the quicker the unit tests run, the 
easier it will be to include many unit tests in our project, thus ensuring 
that we can develop efficiently and safely in postgresql.

If you know of some other *quick* way to truncate all tables, please let us 
know. BTW: Starting and later rolling back a transaction will not work, as 
   
   As a workaround, I would dump the schema to a file using pg_dump, drop the 
   database and recreate it from schema.
   
   Will that do for you? Unfortunately that is not transaction safe, and any 
   clients connected at that time need to disconnect first. Hopefully you can 
   do that in the test environment.
   
  
  Truncate isn't transaction safe either, so that shouldn't be a problem. 
  
  Proper syntax for his feature would seem like: 
  truncate table [cascade|restrict] ?
  
  
  Robert Treat
  -- 
  Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
  
  
  ---(end of broadcast)---
  TIP 8: explain analyze is your friend
  
 
 


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


  1   2   3   >