Re: [HACKERS] unsupported platforms

2003-11-14 Thread Peter Eisentraut
Christopher Kings-Lynne writes:

 Is anyone going to email the people who last reported the unsupported
 platforms to see if they'll re-test?

 Shall I?  Or should someone more official?

From the latest list, all but the few odd NetBSD ports are known not to
work.  And I've posted a request to test NetBSD on -general a week or so
ago.  So I think we'd have heard something by now.  But feel free to
solicit more people.  Maybe writing to a NetBSD project mailing list would
work.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Any more must fix issues for 7.4?

2003-11-14 Thread Christof Petig
Bruce Momjian schrieb:
Peter Eisentraut wrote:

Bruce Momjian writes:


Oh, I forgot about that.  This leaves datetime.h and decimal.h in
/pgsql/include.  I don't see how 7.4.1 can fix that because people will
not be using initdb.
This has nothing to do with initdb.


Right.  I mean install isn't going to remove those files in /include.

If you use a packaged version (.deb/.rpm etc) it will remove them. So no 
problem for most(?) users.

   Christof

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic
Hi Everyone,
 
This is my first post here so please tell me to go somewhere else if this
is the wrong place to post questions like this.
 
I am using PostgreSQL 7.3.2 and have used earlier versions (7.1.x onwards),
and with all of them I noticed the same problem with INSERTs when there is a
large data set. Just so you guys can compare, here is the time it takes to
insert one row into a table when there are only a few rows present and when
there are thousands:

Rows Present   Start Time       Finish Time

100            1068790804.12    1068790804.12
1000           1068790807.87    1068790807.87
5000           1068790839.26    1068790839.27
10000          1068790909.24    1068790909.26
20000          1068791172.82    1068791172.85
30000          1068791664.06    1068791664.09
40000          1068792369.94    1068792370.00
50000          1068793317.53    1068793317.60
60000          1068794369.38    1068794369.47
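For reference, the per-INSERT elapsed time hidden in the table above can be computed directly (a sketch; it assumes the timestamps are Unix epoch seconds and reads the archive's truncated row counts as 10000–60000):

```python
# Per-INSERT elapsed time at each table size, from the start/finish
# Unix timestamps in the table above (the 10000+ row counts are an
# assumption reconstructed from the truncated archive text).
timings = {
    100:   (1068790804.12, 1068790804.12),
    1000:  (1068790807.87, 1068790807.87),
    5000:  (1068790839.26, 1068790839.27),
    10000: (1068790909.24, 1068790909.26),
    20000: (1068791172.82, 1068791172.85),
    30000: (1068791664.06, 1068791664.09),
    40000: (1068792369.94, 1068792370.00),
    50000: (1068793317.53, 1068793317.60),
    60000: (1068794369.38, 1068794369.47),
}

for rows, (start, finish) in timings.items():
    print(f"{rows:>6} rows: {finish - start:.2f}s for one INSERT")
```

The per-row cost climbs from unmeasurably small to roughly 0.09 s as the table grows.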

As you can see, it takes an awfully long time just to have those values
inserted. Now, to make the picture a bit clearer for you: this table holds
lots of information, about 25 columns. There are also a few indexes that I
created so that selecting values from it is faster, which by the way works
fine. Selecting anything takes under 5 seconds.

Any help would be greatly appreciated, even just pointing me in the right
direction to ask this question. By the way, I designed the database this way
because my application uses PGSQL heavily during execution, so there was a
huge need for fast SELECTs. Our experiments are getting larger every day, so
fast inserts would be good as well.

Just to note, the times above are for the INSERTs only; nothing else is
included in them. The machine was otherwise free, this was the only process
running the whole time, and the machine was an Intel(R) Pentium(R) 4 CPU
2.40GHz.

Regards,
Slavisa




---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] XML Docbook

2003-11-14 Thread Karel Zak

 Hi,

 what about using pure XML DocBook (non-SGML) for the PostgreSQL 7.5 docs?

 XML advantage:

 - cleaner and simpler conversion into printable
   formats via FO (Formatting Objects),
 - no need for the huge TeX stack (!),
 - Java-based XSLT/FO processors like FOP (supporting PDF, PCL, PS, SVG, Print,
   AWT, MIF and TXT),
 - conversion to HTML/man/etc. by xsltproc,
 - XML-based reference docs are usable, for example, in client programs as
   standard help, because integrating an XML parser into a program is simpler
   and more common than an SGML parser.

 I think the current docs build system is usable for a normal user only if he
 wants to generate HTML docs; it's useless, for example, if he wants PDF output...
 
 For example, you can compare our current doc/src/sgml/Makefile with the
 following code for XML conversion:
 
 pdf:
fop -xsl /path/to/stylesheet/xsl/fo/docbook.xsl -xml book.xml -pdf book.pdf
 
 ps:
fop -xsl /path/to/stylesheet/xsl/fo/docbook.xsl -xml book.xml -ps book.ps

 html:
xsltproc -o htmldir/ /path/to/stylesheet/html/chunk.xsl book.xml

 man:
xsltproc /path/to/stylesheet/manpages/docbook.xsl ref.xml


  Comments?

Karel
-- 
 Karel Zak  [EMAIL PROTECTED]
 http://home.zf.jcu.cz/~zakkr/

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XML Docbook

2003-11-14 Thread Jean-Michel POURE
On Friday 14 November 2003 at 10:19, Karel Zak wrote:
 The KDE project uses XML DocBook, and I think they have the same problems and
 maybe already solutions too :-)
 http://i18n.kde.org/translation-howto/doc-translation.html
 Karel

Dear Karel,

Nice link with detailed information.
This is a valid reason for moving to XML docbook.

Cheers, Jean-Michel


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] XML Docbook

2003-11-14 Thread Peter Eisentraut
Karel Zak writes:

  XML advantage:

All very true.

XML disadvantage:

- no arbitrary parameter entities

If someone can solve this for me, I'm ready to switch.

Follow-up to [EMAIL PROTECTED] please.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] XML Docbook

2003-11-14 Thread Karel Zak

On Fri, Nov 14, 2003 at 10:32:10AM +0100, Peter Eisentraut wrote:
 XML disadvantage:
 
 - no arbitrary parameter entities

 I'm unsure if I understand; can you show an example of this problem?

 I think there are a lot of XML DocBook docs in a lot of projects, and I
 would be surprised if there is something special in the PostgreSQL docs
 that prevents using XML instead of SGML.

 Follow-up to [EMAIL PROTECTED] please.

 Hmm.. I must subscribe first :-)

Karel

-- 
 Karel Zak  [EMAIL PROTECTED]
 http://home.zf.jcu.cz/~zakkr/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Greg Stark

 + if (!chmod(pg_data,0700))

Out of curiosity, what was the rationale for using 0700? I know it was a pain
for me when I had a script to monitor the tmp usage. Surely read access to
privileged users isn't really a problem? I'm thinking more of loosening the
paranoia check elsewhere rather than this default.

Wouldn't at least 0750 be safe? That way putting a user in the postgres group
would grant him access to be able to browse around and read the files in
pg_data.

Actually I should think 02750 would be better so that the group is inherited
by subdirectories.
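A quick illustration of the modes under discussion (a standalone sketch, not initdb code): 0700 denies the group everything, 0750 lets group members read and traverse, and 02750 adds the setgid bit so files created inside inherit the directory's group:

```python
import os
import stat
import tempfile

# Scratch directory to demonstrate the three candidate modes.
d = tempfile.mkdtemp()

for mode in (0o700, 0o750, 0o2750):
    os.chmod(d, mode)
    bits = stat.S_IMODE(os.stat(d).st_mode)
    print(f"{mode:>5o}: group-readable={bool(bits & stat.S_IRGRP)} "
          f"setgid={bool(bits & stat.S_ISGID)}")
```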

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Peter Eisentraut
Greg Stark writes:

 Wouldn't at least 0750 be safe? That way putting a user in the postgres group
 would grant him access to be able to browse around and read the files in
 pg_data.

That assumes that there is a restricted postgres group, which is not
guaranteed.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Andrew Dunstan
The shell script said this:

   $ECHO_N "fixing permissions on existing directory $PGDATA... "$ECHO_C
   chmod go-rwx "$PGDATA" || exit_nicely

There's no more rationale than that for this patch.

I'm inclined to agree with you, though.

cheers

andrew

Greg Stark wrote:

+ 		if (!chmod(pg_data,0700))
   

Out of curiosity, what was the rationale for using 0700? I know it was a pain
for me when I had a script to monitor the tmp usage. Surely read access to
privileged users isn't really a problem? I'm thinking more of loosening the
paranoia check elsewhere rather than this default.
Wouldn't at least 0750 be safe? That way putting a user in the postgres group
would grant him access to be able to browse around and read the files in
pg_data.
Actually I should think 02750 would be better so that the group is inherited
by subdirectories.
 



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Robert Treat
On Fri, 2003-11-14 at 10:32, Jan Wieck wrote:
 Bruce Momjian wrote:
  Jan Wieck wrote:
  Christopher Browne wrote:
  
   [EMAIL PROTECTED] (elein) writes:
   What is the status of CVS head?  Isn't it in sync with 7.4.RC2?  I
   just upgraded from CVS and rebuilt clean and initdb now gives this
    lovely informative "initdb failed" message.
   
   No, I believe that they have started applying 7.5 patches, notably
   including reimplementing initdb in C.
   
   Jan checked in the ARC changes last night, and he and Tom ran into
   some problems, leading to that being pulled back out, while he
   revisits the code.
  
  Yeah, there was a problem with *extreme* sharing ... the code tried to 
  use the same buffer for multiple disk blocks at the same time, and 
  somehow the backends did not agree on the correct content. But it's 
  fixed and back in. You can see ARC working by setting
  
   buffer_strategy_status_interval = 10 # seconds
  
  and starting postmaster with -d1
  
  It is not on by default?
  
 
 Sure not. Why would someone turn on debug messages by default?
 
 Or did you mean ARC itself? Since it replaced the old LRU code, it is 
 the only choice you have now. Which sort of raises the question if we 
 would want to have multiple choices, like a config option
 
 buffer_replacement_strategy = lru|lru2|arc
 

People would always want to have those choices (especially for doing
development/testing/benchmarking between the different methods); the
question is whether it's worth the effort to give people those options.

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Background writer process

2003-11-14 Thread Jan Wieck
Shridhar Daithankar wrote:

On Friday 14 November 2003 03:05, Jan Wieck wrote:
For sure the sync() needs to be replaced by the discussed fsync() of
recently written files. And I think the algorithm how much and how often
to flush can be significantly improved. But after all, this does not
change the real checkpointing at all, and the general framework having a
separate process is what we probably want.
Would having fsync for regular data files and sync for the WAL segments be a
comfortable compromise? Or is this going to use fsync for all of them?

IMO, with fsync, we tell the kernel that it can write this buffer. It may or
may not write it immediately, unless it is a hard sync.
I think it's more the other way around. On some systems sync() might 
return before all buffers are flushed to disk, while fsync() does not.

Since postgresql can afford lazy writes for data files, I think this could 
work.
The whole point of a checkpoint is to know for certain that a specific 
change is in the datafile, so that it is safe to throw away older WAL 
segments.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I'm not suggesting making that the default setup, just loosening the
 paranoia check so that if an admin sets the directory to be group
 readable the database doesn't refuse to start up.

In previous discussions of this point, paranoia was considered desirable.
I don't think the situation has changed.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] JDBC with 7.4RC2

2003-11-14 Thread Adam Witney

Should the jdbc driver compile ok with 7.4RC2?

I configure like so

./configure --with-perl --with-java --with-libs=/sw/lib
--with-includes=/sw/include

But it fails with this

compile:

BUILD FAILED
file:/usr/local/install/postgresql-7.4RC2/src/interfaces/jdbc/build.xml:114:
Old driver was detected on classpath or in jre/lib/ext, please remove and
try again.

Total time: 4 seconds
make[3]: *** [all] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all] Error 2
make: *** [all] Error 2

I think I have deleted all the old postgresql.jar files. Any ideas? Or is
the jdbc driver not yet compatible with 7.4RC2?

(This is on MacOSX 10.2.8)

Thanks

adam


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Jan Wieck
Robert Treat wrote:

On Fri, 2003-11-14 at 10:32, Jan Wieck wrote:
Bruce Momjian wrote:
 Jan Wieck wrote:
 Christopher Browne wrote:
 
  [EMAIL PROTECTED] (elein) writes:
  What is the status of CVS head?  Isn't it in sync with 7.4.RC2?  I
  just upgraded from CVS and rebuilt clean and initdb now gives this
   lovely informative "initdb failed" message.
  
  No, I believe that they have started applying 7.5 patches, notably
  including reimplementing initdb in C.
  
  Jan checked in the ARC changes last night, and he and Tom ran into
  some problems, leading to that being pulled back out, while he
  revisits the code.
 
 Yeah, there was a problem with *extreme* sharing ... the code tried to 
 use the same buffer for multiple disk blocks at the same time, and 
 somehow the backends did not agree on the correct content. But it's 
 fixed and back in. You can see ARC working by setting
 
  buffer_strategy_status_interval = 10 # seconds
 
 and starting postmaster with -d1
 
 It is not on by default?
 

Sure not. Why would someone turn on debug messages by default?

Or did you mean ARC itself? Since it replaced the old LRU code, it is 
the only choice you have now. Which sort of raises the question if we 
would want to have multiple choices, like a config option

buffer_replacement_strategy = lru|lru2|arc

People would always want to have those choices (especially for doing
development/testing/benchmarking between the different methods); the
question is whether it's worth the effort to give people those options.
And in the case of the cache strategy, the point is that different 
access patterns might be served better by different strategies. Then 
again, who will really test this and try to tune ALL of them to find the 
best choice, and is this possible at all given that all databases under 
one postmaster share the same buffer pool?

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Andrew Dunstan
Tom Lane wrote:

Greg Stark [EMAIL PROTECTED] writes:
 

I'm not suggesting making that the default setup, just loosening the
paranoia check so that if an admin sets the directory to be group
readable the database doesn't refuse to start up.
   

In previous discussions of this point, paranoia was considered desirable.
I don't think the situation has changed.
 

Would it be worth having a command line option to relax the paranoia a 
bit, leaving the current paranoia setting as the default? I guess it 
would have to be on the command line because IIRC this is checked before 
we ever look at the config file.

cheers

andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Background writer process

2003-11-14 Thread Bruce Momjian
Shridhar Daithankar wrote:
 On Friday 14 November 2003 03:05, Jan Wieck wrote:
  For sure the sync() needs to be replaced by the discussed fsync() of
  recently written files. And I think the algorithm how much and how often
  to flush can be significantly improved. But after all, this does not
  change the real checkpointing at all, and the general framework having a
  separate process is what we probably want.
 
 Would having fsync for regular data files and sync for the WAL segments be a
 comfortable compromise? Or is this going to use fsync for all of them?

I think we still need sync() for WAL because sometimes backends are
going to have to write their own buffers, and we don't want them using
fsync or it will be very slow.

 IMO, with fsync, we tell kernel that you can write this buffer. It may or may 
 not write it immediately, unless it is hard sync. 
 
 Since postgresql can afford lazy writes for data files, I think this could 
 work.

fsync() doesn't return until the data is on the disk.  It doesn't
schedule the write then return, as far as I know.  sync() does schedule
the writes, I think, which can be bad, but we delay a little to wait for
it to complete.
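The semantics being debated can be sketched in a few lines (illustrative only; PostgreSQL's code is in C, and this just shows the POSIX calls involved): fsync() blocks until one file's data reaches disk, while sync() merely requests a system-wide flush.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"pretend this is a WAL record\n")
    # fsync() does not return until this file's data (and metadata)
    # has been written through to stable storage.
    os.fsync(fd)
finally:
    os.close(fd)

# os.sync() wraps sync(2): it schedules a flush of *all* dirty buffers,
# and on some systems may return before the writes have completed --
# which is why sync() alone gives no durability guarantee.
os.sync()
print("durable:", path)
```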

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Making new system catalog

2003-11-14 Thread kkim3
Dear,

I'd like to make a new system catalog table. Could you let me know how to go
about that?


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly



Re: [HACKERS] Need help.

2003-11-14 Thread Petro Pelekh
I find such strange thing in my postgres server
---
distance=> \d cities;
* QUERY **
SELECT c.oid,
  n.nspname,
  c.relname
FROM pg_catalog.pg_class c
 LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE pg_catalog.pg_table_is_visible(c.oid)
  AND c.relname ~ '^cities$'
ORDER BY 2, 3;
**

ERROR:  parser: parse error at or near "."
---
distance=> SELECT c.oid,
distance->   n.nspname,
distance->   c.relname
distance-> FROM pg_catalog.pg_class c
distance->  LEFT JOIN pg_catalog.pg_namespace n ON n.oid =
c.relnamespace
distance-> ;
ERROR:  parser: parse error at or near "."
---
distance=> SELECT c.oid,
distance->   c.relname
distance-> FROM pg_class c;
  oid  | relname
---+-
  1247 | pg_type
  1249 | pg_attribute
 16523 | pg_stat_user_indexes
  1259 | pg_class
 16526 | pg_statio_all_indexes
  1261 | pg_group
  1262 | pg_database
   376 | pg_xactlock
 16529 | pg_statio_sys_indexes


So my database doesn't have the pg_namespace system catalog and doesn't
understand pg_catalog.pg_class c, but does understand pg_class c. Maybe
some of you have had such problems with postgres and can help.

Thank you very much.


- Original Message -
From: Petro Pelekh [EMAIL PROTECTED]
Newsgroups: comp.databases.postgresql.hackers
Sent: Friday, November 14, 2003 10:36 AM
Subject: Need help.


 Good morning,

 I am new to Postgres, so excuse such a question, but I can't find it
 in the documentation.
 I have a table cities. I can insert into it and select from it, but can't
 run such a command:
 distance=> \d cities
 ERROR:  parser: parse error at or near "."


 distance=> \d cities;
 ERROR:  parser: parse error at or near "."

 Can someone help me.

 Thank you very much








---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Need help.

2003-11-14 Thread Petro Pelekh
Good morning,

I am new to Postgres, so excuse such a question, but I can't find it
in the documentation.
I have a table cities. I can insert into it and select from it, but can't
run such a command:
distance=> \d cities
ERROR:  parser: parse error at or near "."


distance=> \d cities;
ERROR:  parser: parse error at or near "."

Can someone help me.

Thank you very much



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 7.4RC2 regression failur and not running stats collector process

2003-11-14 Thread Derek Morr
I think I have some more information on the statistics collector startup 
problem on Solaris.

I inserted the following into pgstat.c:

if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
{
    /* what type of socket are we trying to bind? */
    fprintf(stderr, "Address family is %d\n",
            addr->ai_addr->sa_family);
    ...
}
This returns a value of 26, which on Solaris is AF_INET6. But the
machine I'm using (a V880 running 2.8) has no IPv6 address on any of its
interfaces. And addr->ai_addr->sa_data is empty, so it's no surprise that
bind() is failing.

I'm not sure why Solaris is giving getaddrinfo_all an IPv6 address, though.

-derek





---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Bruce Momjian
Jan Wieck wrote:
  Yeah, there was a problem with *extreme* sharing ... the code tried to 
  use the same buffer for multiple disk blocks at the same time, and 
  somehow the backends did not agree on the correct content. But it's 
  fixed and back in. You can see ARC working by setting
  
   buffer_strategy_status_interval = 10 # seconds

The above line is the one I was asking about. Does that just control the
debug output interval?  If so, it should be named more appropriately.
  
  and starting postmaster with -d1
  
  It is not on by default?
  
 
 Sure not. Why would someone turn on debug messages by default?
 
 Or did you mean ARC itself? Since it replaced the old LRU code, it is 
 the only choice you have now. Which sort of raises the question if we 
 would want to have multiple choices, like a config option
 
 buffer_replacement_strategy = lru|lru2|arc

No, I don't see much value to this; it would encourage people to play
with something that probably is of little value.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] 7.4RC2 regression failur and not running stats collector process

2003-11-14 Thread Kurt Roeckx
On Thu, Nov 13, 2003 at 04:04:23PM -0500, Derek Morr wrote:
 
 the 
 machine I'm using (a V880 running 2.8) has no IPv6 address on any of its 
 interfaces.

So the for loop over the addresses that are returned should go
over both socket() and bind() instead of only socket().  And
probably connect() too.

The code now assumes if you create a socket of a certain type you
can actually use it.
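Kurt's suggested loop structure, sketched in Python rather than the actual pgstat.c (the function name here is hypothetical): keep walking the getaddrinfo() results until an address passes both socket() and bind(), instead of giving up after the first socket() success:

```python
import socket

def bind_first_usable(host, port):
    """Return a bound UDP socket for the first getaddrinfo() result
    that survives both socket() and bind() -- e.g. skip an AF_INET6
    entry when no interface actually has an IPv6 address."""
    last_err = None
    for family, stype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_DGRAM):
        try:
            s = socket.socket(family, stype, proto)
        except OSError as err:
            last_err = err
            continue
        try:
            s.bind(sockaddr)
            return s            # first address that actually works
        except OSError as err:
            last_err = err
            s.close()
    raise last_err if last_err else OSError("no usable address")

sock = bind_first_usable("localhost", 0)   # port 0: kernel picks one
print("bound to", sock.getsockname())
sock.close()
```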


Kurt


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Need help.

2003-11-14 Thread Alvaro Herrera
On Fri, Nov 14, 2003 at 02:04:56PM +0200, Petro Pelekh wrote:
 I find such strange thing in my postgres server
 ---
 distance=> \d cities;
 ERROR:  parser: parse error at or near "."

You are using psql from a 7.3 version to talk to an older server.  This
is not guaranteed to work.  It is recommended to upgrade the server, or
downgrade the tool.

Try
SELECT version();

and psql --version

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
Just treat us the way you want to be treated + some extra allowance
for ignorance.(Michael Brusser)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Alvaro Herrera
On Fri, Nov 14, 2003 at 06:36:41PM +1100, Slavisa Garic wrote:

 Rows Present  Start Time      Finish Time
 
 100           1068790804.12   1068790804.12
 1000          1068790807.87   1068790807.87
 5000          1068790839.26   1068790839.27
 10000         1068790909.24   1068790909.26
 20000         1068791172.82   1068791172.85
 30000         1068791664.06   1068791664.09
 40000         1068792369.94   1068792370.00
 50000         1068793317.53   1068793317.60
 60000         1068794369.38   1068794369.47

 [too slow]

Ok, so inserting row 60000 seems to take 0.09 seconds, and inserting row
5000 takes only 0.01.  And your problem is exactly what?

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
"Pick the flower that today is born joyful, proud. Who knows if another will be born tomorrow?"

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Dann Corbit
 -Original Message-
 From: Slavisa Garic [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, November 13, 2003 11:37 PM
 To: [EMAIL PROTECTED]
 Subject: [HACKERS] INSERT extremely slow with large data sets
 
 
 Hi Everyone,
 
 This is my first post here so please tell me to go somewhere 
 else if this is the wrong place to post questions like this.
 
 I am using PostgreSQL 7.3.2 and have used earlier versions 
 (7.1.x onwards) and with all of them I noticed same problem 
 with INSERTs when there is a large data set. Just to so you 
 guys can compare time it takes to insert one row into a table 
 when there are only few rows present and when there are thousands:
 
 Rows Present  Start Time      Finish Time
 
 100           1068790804.12   1068790804.12
 1000          1068790807.87   1068790807.87
 5000          1068790839.26   1068790839.27
 10000         1068790909.24   1068790909.26
 20000         1068791172.82   1068791172.85
 30000         1068791664.06   1068791664.09
 40000         1068792369.94   1068792370.00
 50000         1068793317.53   1068793317.60
 60000         1068794369.38   1068794369.47
 
 As you can see if takes awfully lots of time for me just to 
 have those values inserted. Now to make a picture a bit 
 clearer for you this table has lots of information in there, 
 about 25 columns. Also there are few indexes that I created 
 so that the process of selecting values from there is faster 
 which by the way works fine. Selecting anything takes under 5 seconds.
 
 Any help would be greatly appreciated even pointing me in the 
 right direction where to ask this question. By the way I 
 designed the database this way as my application that uses 
 PGSQL a lot during the execution so there was a huge need for 
 fast SELECTs. Our experiments are getting larger and larger 
 every day so fast inserts would be good as well.
 
 Just to note those times above are of INSERTs only. Nothing 
 else done that would be included in those times. Machine was 
 also free and that was the only process running all the time 
 and the machine was Intel(R) Pentium(R) 4 CPU 2.40GHz.

You should post the schema for the table in question when you ask a
question like this.

The behavior is not surprising in the least bit.  Every database will
perform in this way, since you have mentioned that you have indexes on
the table.

The depth of the tree will be proportional to the log of the row count.
As the tree gets deeper, inserts will be more and more expensive.

If you have a giant pile of stuff to insert, consider the COPY command
or API if it is time critical.
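Dann's point about logarithmic depth can be made concrete (a back-of-envelope sketch; the fanout of ~100 keys per index page is an assumed round number, not PostgreSQL's actual value):

```python
import math

def btree_depth(rows, fanout=100):
    """Rough depth of a B-tree index: about ceil(log_fanout(rows)).
    Each INSERT must descend (and possibly split) this many levels,
    once per index on the table."""
    if rows <= 1:
        return 1
    return max(1, math.ceil(math.log(rows, fanout)))

for n in (100, 5000, 60000, 500000):
    print(f"{n:>6} rows -> depth ~{btree_depth(n)}")
```

With ~25 columns and several indexes, every INSERT pays this descent once per index, which is why COPY, batching many rows in one statement, is the usual answer for bulk loads.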

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Jan Wieck
Bruce Momjian wrote:

Jan Wieck wrote:
 Yeah, there was a problem with *extreme* sharing ... the code tried to 
 use the same buffer for multiple disk blocks at the same time, and 
 somehow the backends did not agree on the correct content. But it's 
 fixed and back in. You can see ARC working by setting
 
  buffer_strategy_status_interval = 10 # seconds
The above line is the one I was asking about?  Does that just control
debug output interval?  If so, it should be named more appropriately.
Yes, all it controls is the DEBUG1 output every N seconds. What is 
inappropriate about it? It is exactly that: it shows the sizes and 
recent hit rates for the 4 different CDB queues of the strategy. Have a 
better one?

The output is not really of any use yet. The background writer, though, 
will add a third line to this output: the number of clean buffers in 
front of the T1 and T2 queues. That will be the #1 tool for adjusting 
the background writer parameters so that it keeps the cache clean enough 
that backends don't need to write, and dirty enough that frequently 
used blocks don't get written too often.


 
 and starting postmaster with -d1
 
 It is not on by default?
 

Sure not. Why would someone turn on debug messages by default?

Or did you mean ARC itself? Since it replaced the old LRU code, it is 
the only choice you have now. Which sort of raises the question if we 
would want to have multiple choices, like a config option

buffer_replacement_strategy = lru|lru2|arc
No, I don't see much value to this; it would encourage people to play
with something that probably is of little value.
Me neither, just wanted to have asked.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Need help.

2003-11-14 Thread Bruno Wolff III
On Fri, Nov 14, 2003 at 14:04:56 +0200,
  Petro Pelekh [EMAIL PROTECTED] wrote:
 
 So my database doesn't have the pg_namespace system catalog, and doesn't
 understand pg_catalog.pg_class c, but understands pg_class c. Maybe some of
 you have had such problems with postgres and can help.

It looks like the client and the server aren't from the same version
of postgres.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Background writer process

2003-11-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Shridhar Daithankar wrote:
 Having fsync for regular data files and sync for WAL segment a comfortable 
 compromise?  Or is this going to use fsync for all of them.

 I think we still need sync() for WAL because sometimes backends are
 going to have to write their own buffers, and we don't want them using
 fsync or it will be very slow.

sync() for WAL is a complete nonstarter, because it gives you no
guarantees at all about whether the write has occurred.  I don't really
care what you say about speed; this is a correctness point.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Tom Lane
Kurt Roeckx [EMAIL PROTECTED] writes:
 So the for loop over the addresses that are returned should go
 over both socket() and bind() instead of only socket().  And
 probably connect() too.
 The code now assumes if you create a socket of a certain type you
 can actually use it.

Ah, light dawns... the postmaster socket code does this correctly,
but pgstat.c doesn't.

Too bad we didn't figure this out yesterday.  We are now in code freeze
for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
broken platform.  Core guys, time for a vote ... do we fix, or hold this
for 7.4.1?

regards, tom lane
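For readers following the thread, the idiom being described can be sketched against the standard getaddrinfo() API (a sketch under that assumption; the backend actually uses its own getaddrinfo_all() wrapper): socket(), bind(), and connect() must each succeed before an address is accepted, and failure at any stage means moving on to the next result.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try every address getaddrinfo() returns; a family may be resolvable
 * yet unusable (e.g. IPv6 addresses on a non-IPv6 kernel), so socket(),
 * bind(), getsockname(), and connect() must all succeed before we keep
 * an address.  Returns a connected UDP fd, or -1 on total failure. */
static int open_udp_self_socket(const char *host)
{
    struct addrinfo hints, *addrs, *addr;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    if (getaddrinfo(host, "0", &hints, &addrs) != 0)
        return -1;

    for (addr = addrs; addr != NULL; addr = addr->ai_next)
    {
        struct sockaddr_storage sa;
        socklen_t alen = sizeof(sa);

        if ((fd = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
            continue;                   /* family unsupported: try next */
        if (bind(fd, addr->ai_addr, addr->ai_addrlen) < 0 ||
            getsockname(fd, (struct sockaddr *) &sa, &alen) < 0 ||
            connect(fd, (struct sockaddr *) &sa, alen) < 0)
        {
            close(fd);                  /* unusable address: try next */
            fd = -1;
            continue;
        }
        break;                          /* working address found */
    }
    freeaddrinfo(addrs);
    return fd;
}

int main(void)
{
    int fd = open_udp_self_socket("127.0.0.1");

    assert(fd >= 0);
    close(fd);
    printf("ok\n");
    return 0;
}
```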

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [PATCHES] ALTER TABLE modifications

2003-11-14 Thread Peter Eisentraut
Hannu Krosing writes:

  AFAICT, this patch does not buy us anything at all.  It's just a different
  spelling of existing functionality.  We have never done that before.

 what about DROP COLUMN - this is also just a different spelling for

 SELECT INTO, migrate all constraints, DROP OLD TABLE, RENAME.

No, because DROP COLUMN preserves dependent objects.

  We cannot possibly leave this patch as is.  People expect in-place column
  changes.

 Does SQL spec even require that SELECT * always returns columns in the
 same order ?

Yes:

b) Otherwise, the select list * is equivalent to a value
  expression sequence in which each value expression
  is a column reference that references a column of T and
  each column of T is referenced exactly once. The columns
  are referenced in the ascending sequence of their ordinal
  position within T.

 I don't think that relational model assigns any 'order' to columns.

Correct, but SQL is not the relational model or vice versa.

 BTW, SELECT * is just a different spelling of existing functionality ;)

No, there is no other way to get a complete list of columns.  (Hard-coding
does not count.)

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [CORE] [HACKERS] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Josh Berkus
Tom,

 Too bad we didn't figure this out yesterday.  We are now in code freeze
 for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
 broken platform.  Core guys, time for a vote ... do we fix, or hold this
 for 7.4.1?

One thing I've not seen an answer to:  does Postgres run acceptably on other 
people's Solaris boxes?   If this bug is preventing running on Solaris at 
all, I'd say fix it ... Solaris is a major platform.   If it only affects 
users of one particular Solaris patch version, then we do a big warning and 
save it for 7.4.1.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] 7.4RC2 regression failure and not running stats collector

2003-11-14 Thread Bruce Momjian
Tom Lane wrote:
 Kurt Roeckx [EMAIL PROTECTED] writes:
  So the for loop over the addresses that are returned should go
  over both socket() and bind() instead of only socket().  And
  probably connect() too.
  The code now assumes if you create a socket of a certain type you
  can actually use it.
 
 Ah, light dawns... the postmaster socket code does this correctly,
 but pgstat.c doesn't.
 
 Too bad we didn't figure this out yesterday.  We are now in code freeze
 for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
 broken platform.  Core guys, time for a vote ... do we fix, or hold this
 for 7.4.1?

Must fix, I believe, especially if it is the same function call sequence
used by the postmaster so we have a high probability it will work on all
platforms.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [CORE] [HACKERS] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 One thing I've not seen an answer to:  does Postgres run acceptably on other 
 people's Solaris boxes?   If this bug is preventing running on Solaris at 
 all, I'd say fix it ... Solaris is a major platform.   If it only affects 
 users of one particular Solaris patch version, then we do a big warning and 
 save it for 7.4.1.

I'm sure it depends on which Solaris version you're running, and
possibly on local configuration issues as well.  We should not however
assume that the problem occurs *only* on Solaris.  My take on a lot of
the IPv6 funnies we've turned up is that they are kernel/userland
compatibility issues (v6-ready libc on non-v6-ready kernel or vice
versa), and that's surely at least as likely on Linux as Solaris.

The regression test that detects the problem was only put in as of
7.4beta3.  I'm not sure how many of our existing port reports were
based on prior betas ...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] 7.4RC2 regression failur and not running stats collector

2003-11-14 Thread P.J. "Josh" Rovero
Solaris (5.7, 5.8, 5.9) on many different
workstation/server types is very important to us...
I agree with Bruce.

Bruce Momjian wrote:

Must fix, I believe, especially if it is the same function call sequence
used by the postmaster so we have a high probability it will work on all
platforms.
--
P. J. "Josh" Rovero                            Sonalysts, Inc.
Email: [EMAIL PROTECTED]   www.sonalysts.com   215 Parkway North
Work: (860) 326-3671 or 442-4355               Waterford, CT 06385
***
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] Problem with compilation 7.3.4

2003-11-14 Thread Peter Eisentraut
Tom Lane writes:

  There is no info documentation, so you don't need this option.

 Someone was complaining about this just recently.  We don't need the
 option and don't have it implemented, but configure --help advertises
 it anyway (and fails to advertise the --docdir option that we do have).
 Can that be fixed?

In principle yes, but it's very impractical.

The set of installation directory names and corresponding options in
Autoconf is fixed and cannot be changed.  This is in some cases annoying
and very frequently discussed on the Autoconf list, but it is in principle
a good thing.  Else, people would invent all kinds of options and there
would be absolutely no consistency between different packages.  Also,
automatic packaging tools from BSD ports to RPM expect a standard set of
options.

PostgreSQL currently contains a hack that does a macro substitution
s/info/doc/ to create a --docdir option, but that evidently does not catch
all cases.  Also, I'm getting reports that it breaks package builds
because they automatically provide an --infodir argument.

At this point I'm more than inclined to give up on that, put --infodir
back, and give the docdir a less attractive option like --with-docdir.
Additionally, we could add the opposite-sense --without-docdir to prevent
installation of the documentation, because some packaging tools, notably
RPM and Debian, like to install the documentation themselves.  (No doubt
this is originally founded in the lack of an appropriate option --docdir,
but it's too late to philosophize about that now.)

Comments?

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [ADMIN] Problem with compilation 7.3.4

2003-11-14 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 PostgreSQL currently contains a hack that does a macro substitution
 s/info/doc/ to create a --docdir option, but that evidently does not catch
 all cases.  Also, I'm getting reports that it breaks package builds
 because they automatically provide an --infodir argument.

 At this point I'm more than inclined to give up on that, put --infodir
 back, and give the docdir a less attractive option like --with-docdir.

I agree with the above.

 Additionally, we could add the opposite-sense --without-docdir to prevent
 installation of the documentation, because some packaging tools, notably
 RPM and Debian, like to install the documentation themselves.

This is probably something to hold for a later release.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [CORE] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Christopher Browne
[EMAIL PROTECTED] (Josh Berkus) writes:
 Too bad we didn't figure this out yesterday.  We are now in code freeze
 for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
 broken platform.  Core guys, time for a vote ... do we fix, or hold this
 for 7.4.1?

 One thing I've not seen an answer to:  does Postgres run acceptably on other 
 people's Solaris boxes?   If this bug is preventing running on Solaris at 
 all, I'd say fix it ... Solaris is a major platform.   If it only affects 
 users of one particular Solaris patch version, then we do a big warning and 
 save it for 7.4.1.

For what it's worth, I have been running regression on Solaris with
many of the betas, and with RC1 and [just now] RC2, with NO problems.

If the patch is deemed vital for others, it's possible that all I'm
reporting is one of the statistics that will be outnumbered by others.
(And in that case, I would be quick to test the patch to ensure it
causes no adverse side-effects.)

But it's not apparent that it is _vital_ here right now.
-- 
let name=cbbrowne and tld=libertyrms.info in name ^ @ ^ tld;;
http://dev6.int.libertyrms.com/
Christopher Browne
(416) 646 3304 x124 (land)

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [CORE] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Tom Lane
Christopher Browne [EMAIL PROTECTED] writes:
 For what it's worth, I have been running regression on Solaris with
 numerous of the betas, and RC1 and [just now] RC2, with NO problems.

It seems clear that some Solaris installations are affected and some
are not.  Presumably there is some version difference or some local
configuration difference ... but since we don't know what the critical
factor is, we have no basis for guessing what fraction of Solaris
installations will see the problem.

 (And in that case, I would be quick to test the patch to ensure it
 causes no adverse side-effects.)

Here is the proposed patch --- please test it ASAP if you can.
This is against RC2.

regards, tom lane

*** src/backend/postmaster/pgstat.c.orig	Fri Nov  7 16:55:50 2003
--- src/backend/postmaster/pgstat.c	Fri Nov 14 15:02:14 2003
***************
*** 203,208 ****
--- 203,216 ----
  		goto startup_failed;
  	}
  
+ 	/*
+ 	 * On some platforms, getaddrinfo_all() may return multiple addresses
+ 	 * only one of which will actually work (eg, both IPv6 and IPv4 addresses
+ 	 * when kernel will reject IPv6).  Worse, the failure may occur at the
+ 	 * bind() or perhaps even connect() stage.  So we must loop through the
+ 	 * results till we find a working combination.  We will generate LOG
+ 	 * messages, but no error, for bogus combinations.
+ 	 */
  	for (addr = addrs; addr; addr = addr->ai_next)
  	{
  #ifdef HAVE_UNIX_SOCKETS
***************
*** 210,262 ****
  		if (addr->ai_family == AF_UNIX)
  			continue;
  #endif
! 		if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
! 			break;
! 	}
  
! 	if (!addr || pgStatSock < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not create socket for statistics collector: %m")));
! 		goto startup_failed;
! 	}
  
! 	/*
! 	 * Bind it to a kernel assigned port on localhost and get the assigned
! 	 * port via getsockname().
! 	 */
! 	if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not bind socket for statistics collector: %m")));
! 		goto startup_failed;
! 	}
  
! 	freeaddrinfo_all(hints.ai_family, addrs);
! 	addrs = NULL;
  
! 	alen = sizeof(pgStatAddr);
! 	if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not get address of socket for statistics collector: %m")));
! 		goto startup_failed;
  	}
  
! 	/*
! 	 * Connect the socket to its own address.  This saves a few cycles by
! 	 * not having to respecify the target address on every send. This also
! 	 * provides a kernel-level check that only packets from this same
! 	 * address will be received.
! 	 */
! 	if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
  	{
  		ereport(LOG,
  				(errcode_for_socket_access(),
! 				 errmsg("could not connect socket for statistics collector: %m")));
  		goto startup_failed;
  	}
  
--- 218,285 ----
  		if (addr->ai_family == AF_UNIX)
  			continue;
  #endif
! 		/*
! 		 * Create the socket.
! 		 */
! 		if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not create socket for statistics collector: %m")));
! 			continue;
! 		}
  
! 		/*
! 		 * Bind it to a kernel assigned port on localhost and get the assigned
! 		 * port via getsockname().
! 		 */
! 		if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not bind socket for statistics collector: %m")));
! 			closesocket(pgStatSock);
! 			pgStatSock = -1;
! 			continue;
! 		}
  
! 		alen = sizeof(pgStatAddr);
! 		if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),

Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats

2003-11-14 Thread Joshua D. Drake
I can fire up our solaris machine and let you have access to it if you 
want to do some destructive testing.

Tom Lane wrote:

Christopher Browne [EMAIL PROTECTED] writes:
 

For what it's worth, I have been running regression on Solaris with
numerous of the betas, and RC1 and [just now] RC2, with NO problems.
   

It seems clear that some Solaris installations are affected and some
are not.  Presumably there is some version difference or some local
configuration difference ... but since we don't know what the critical
factor is, we have no basis for guessing what fraction of Solaris
installations will see the problem.
 

(And in that case, I would be quick to test the patch to ensure it
causes no adverse side-effects.)
   

Here is the proposed patch --- please test it ASAP if you can.
This is against RC2.
			regards, tom lane

 



*** src/backend/postmaster/pgstat.c.orig	Fri Nov  7 16:55:50 2003
--- src/backend/postmaster/pgstat.c	Fri Nov 14 15:02:14 2003
***************
*** 203,208 ****
--- 203,216 ----
  		goto startup_failed;
  	}
  
+ 	/*
+ 	 * On some platforms, getaddrinfo_all() may return multiple addresses
+ 	 * only one of which will actually work (eg, both IPv6 and IPv4 addresses
+ 	 * when kernel will reject IPv6).  Worse, the failure may occur at the
+ 	 * bind() or perhaps even connect() stage.  So we must loop through the
+ 	 * results till we find a working combination.  We will generate LOG
+ 	 * messages, but no error, for bogus combinations.
+ 	 */
  	for (addr = addrs; addr; addr = addr->ai_next)
  	{
  #ifdef HAVE_UNIX_SOCKETS
***************
*** 210,262 ****
  		if (addr->ai_family == AF_UNIX)
  			continue;
  #endif
! 		if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) >= 0)
! 			break;
! 	}
  
! 	if (!addr || pgStatSock < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not create socket for statistics collector: %m")));
! 		goto startup_failed;
! 	}
  
! 	/*
! 	 * Bind it to a kernel assigned port on localhost and get the assigned
! 	 * port via getsockname().
! 	 */
! 	if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not bind socket for statistics collector: %m")));
! 		goto startup_failed;
! 	}
  
! 	freeaddrinfo_all(hints.ai_family, addrs);
! 	addrs = NULL;
  
! 	alen = sizeof(pgStatAddr);
! 	if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
! 	{
! 		ereport(LOG,
! 				(errcode_for_socket_access(),
! 				 errmsg("could not get address of socket for statistics collector: %m")));
! 		goto startup_failed;
  	}
  
! 	/*
! 	 * Connect the socket to its own address.  This saves a few cycles by
! 	 * not having to respecify the target address on every send. This also
! 	 * provides a kernel-level check that only packets from this same
! 	 * address will be received.
! 	 */
! 	if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
  	{
  		ereport(LOG,
  				(errcode_for_socket_access(),
! 				 errmsg("could not connect socket for statistics collector: %m")));
  		goto startup_failed;
  	}
  
--- 218,285 ----
  		if (addr->ai_family == AF_UNIX)
  			continue;
  #endif
! 		/*
! 		 * Create the socket.
! 		 */
! 		if ((pgStatSock = socket(addr->ai_family, SOCK_DGRAM, 0)) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not create socket for statistics collector: %m")));
! 			continue;
! 		}
  
! 		/*
! 		 * Bind it to a kernel assigned port on localhost and get the assigned
! 		 * port via getsockname().
! 		 */
! 		if (bind(pgStatSock, addr->ai_addr, addr->ai_addrlen) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not bind socket for statistics collector: %m")));
! 			closesocket(pgStatSock);
! 			pgStatSock = -1;
! 			continue;
! 		}
  
! 		alen = sizeof(pgStatAddr);
! 		if (getsockname(pgStatSock, (struct sockaddr *) &pgStatAddr, &alen) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not get address of socket for statistics collector: %m")));
! 			closesocket(pgStatSock);
! 			pgStatSock = -1;
! 			continue;
! 		}
  
! 		/*
! 		 * Connect the socket to its own address.  This saves a few cycles by
! 		 * not having to respecify the target address on every send. This also
! 		 * provides a kernel-level check that only packets from this same
! 		 * address will be received.
! 		 */
! 		if (connect(pgStatSock, (struct sockaddr *) &pgStatAddr, alen) < 0)
! 		{
! 			ereport(LOG,
! 					(errcode_for_socket_access(),
! 					 errmsg("could not connect socket for statistics collector: %m")));
! 			closesocket(pgStatSock);
! 			pgStatSock = -1;
! 			continue;
! 		}
  
! 		/* If we get here, we have a working socket */
! 		break;
  	}
  
! 	/* Did we find a working address? */
! 	if (!addr || pgStatSock < 0)
  	{
  		ereport(LOG,
  				(errcode_for_socket_access(),
! 				 errmsg("disabling statistics collector for 

Re: [HACKERS] [CORE] 7.4RC2 regression failure and not running stats collector process

2003-11-14 Thread Glenn Wiorek
Hmm, I know it's been a while since I used patch but I seem to be having
problems applying it.  Perhaps my patch is outdated??

patch -b pgstat.c < patchfile
Looks like a new-style context diff.
Hunk #1 failed at line 203.
Hunk #2 failed at line 210.
Hunk #3 failed at line 284.
3 out of 3 hunks failed: saving rejects to pgstat.c.rej

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]
To: Christopher Browne [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, November 14, 2003 2:42 PM
Subject: Re: [HACKERS] [CORE] 7.4RC2 regression failure and not running stats
collector process


 Christopher Browne [EMAIL PROTECTED] writes:
  For what it's worth, I have been running regression on Solaris with
  numerous of the betas, and RC1 and [just now] RC2, with NO problems.

 It seems clear that some Solaris installations are affected and some
 are not.  Presumably there is some version difference or some local
 configuration difference ... but since we don't know what the critical
 factor is, we have no basis for guessing what fraction of Solaris
 installations will see the problem.

  (And in that case, I would be quick to test the patch to ensure it
  causes no adverse side-effects.)

 Here is the proposed patch --- please test it ASAP if you can.
 This is against RC2.

 regards, tom lane









Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Robert Treat
On Friday 14 November 2003 12:03, Jan Wieck wrote:
 Robert Treat wrote:
  On Fri, 2003-11-14 at 10:32, Jan Wieck wrote:
 
  Or did you mean ARC itself? Since it replaced the old LRU code, it is
  the only choice you have now. Which sort of raises the question if we
  would want to have multiple choices, like a config option
 
  buffer_replacement_strategy = lru|lru2|arc
 
  people would always want to have those choices (especially for doing
  development/testing/benchmarking between the different methods) the
  question is is it worth the effort to give people those options?

 And in the case of the cache strategy, the point is that different
 access patterns might be served better by different strategies. Then
 again, who will really test this and try to tune ALL of them to find the
 best choice, and is this possible at all given that all databases under
 one postmaster share the same buffer pool?


I could see people like the OSDB folks or some of the folks on -performance at 
least doing some testing against the different backends. Probably not 
extensive but I bet enough to see if there is a clear winner for some types 
of work.  You might not be able to test them in parallel, but certainly you 
could serially. 

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] oh dear ...

2003-11-14 Thread Tom Lane
This worked in 7.3:

regression=# select '1999-jan-08'::date;
ERROR:  date/time field value out of range: 1999-jan-08
HINT:  Perhaps you need a different datestyle setting.

Setting DateStyle to YMD doesn't help, and in any case I'd think that
this ought to be considered an unambiguous input format.

The variants
select 'jan-08-1999'::date;
select '08-jan-1999'::date;
both still work, so I think this is probably some small oversight in the
logic, but I haven't dug into it to find where.

Not sure if this qualifies as a must-fix-for-7.4 or not, but my vote
would be yes ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] oh dear ...

2003-11-14 Thread Tom Lane
I said:
 This worked in 7.3:
 regression=# select '1999-jan-08'::date;
 ERROR:  date/time field value out of range: 1999-jan-08
 HINT:  Perhaps you need a different datestyle setting.

 Setting DateStyle to YMD doesn't help, and in any case I'd think that
 this ought to be considered an unambiguous input format.

This appears to be an oversight in the portions of the datetime code
that we recently changed to enforce DateStyle more tightly.
Specifically, DecodeNumber was rewritten without realizing that it was
invoked in a special way when a textual month name appears in the input.
DecodeDate actually makes two passes over the input, noting the textual
month name in the first pass, and then calling DecodeNumber on only the
numeric fields in the second pass.  This means that when DecodeNumber is
called for the first time, the MONTH flag may already be set.  The
rewrite mistakenly assumed that in this case we must be at the second
field of an MM-DD-YY-order input.

I propose the attached patch to fix the problem.  It doesn't break any
regression tests, and it appears to fix the cases noted in its comment.

Opinions on whether to apply this to 7.4?

regards, tom lane

*** src/backend/utils/adt/datetime.c.orig	Thu Sep 25 10:23:13 2003
--- src/backend/utils/adt/datetime.c	Fri Nov 14 19:22:47 2003
***************
*** 2553,2561 ****
  			break;
  
  		case (DTK_M(MONTH)):
! 			/* Must be at second field of MM-DD-YY */
! 			*tmask = DTK_M(DAY);
! 			tm->tm_mday = val;
  			break;
  
  		case (DTK_M(YEAR) | DTK_M(MONTH) | DTK_M(DAY)):
--- 2553,2577 ----
  			break;
  
  		case (DTK_M(MONTH)):
! 			/*
! 			 * There are two possibilities: we are at second field of
! 			 * MM-DD-YY (with DateOrder MDY), or we are at the first
! 			 * numeric field of a date that included a textual month name.
! 			 * We want to support the variants MON-DD-YYYY, DD-MON-YYYY,
! 			 * and YYYY-MON-DD as unambiguous inputs.  We will also accept
! 			 * MON-DD-YY or DD-MON-YY in either DMY or MDY modes, as well
! 			 * as YY-MON-DD in YMD mode.  Hence:
! 			 */
! 			if (flen >= 3 || DateOrder == DATEORDER_YMD)
! 			{
! 				*tmask = DTK_M(YEAR);
! 				tm->tm_year = val;
! 			}
! 			else
! 			{
! 				*tmask = DTK_M(DAY);
! 				tm->tm_mday = val;
! 			}
  			break;
  
  		case (DTK_M(YEAR) | DTK_M(MONTH) | DTK_M(DAY)):

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [CORE] [HACKERS] 7.4RC2 regression failur and not running stats

2003-11-14 Thread Marc G. Fournier


On Fri, 14 Nov 2003, Josh Berkus wrote:

 Tom,

  Too bad we didn't figure this out yesterday.  We are now in code freeze
  for 7.4 release, and I'm hesitant to apply a fix for what is arguably a
  broken platform.  Core guys, time for a vote ... do we fix, or hold this
  for 7.4.1?

 One thing I've not seen an answer to:  does Postgres run acceptably on other
 people's Solaris boxes?   If this bug is preventing running on Solaris at
 all, I'd say fix it ... Solaris is a major platform.   If it only affects
 users of one particular Solaris patch version, then we do a big warning and
 save it for 7.4.1.

I agree with Josh on this ...


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic


On Fri, 14 Nov 2003, Alvaro Herrera wrote:

 On Fri, Nov 14, 2003 at 06:36:41PM +1100, Slavisa Garic wrote:
 
  Rows PresentStart Time  Finish Time
  
  100 1068790804.12   1068790804.12
  10001068790807.87   1068790807.87
  50001068790839.26   1068790839.27
  1   1068790909.24   1068790909.26
  2   1068791172.82   1068791172.85
  3   1068791664.06   1068791664.09 
  4   1068792369.94   1068792370.0
  5   1068793317.53   1068793317.6
  6   1068794369.38   1068794369.47
 
  [too slow]
 
 Ok, so inserting 60000 rows seems to take 0.09 seconds, and inserting
 5000 takes only 0.01.  And your problem is exactly what?

You didn't understand the question. Inserting ONE ROW when there are already
5000 ROWS present takes 0.01 seconds. Inserting ONE ROW when there are
already 60000 ROWS present takes 0.09 seconds. In other words it takes
about 9 times longer to insert ONE ROW when there is a larger set of
data already in the database. As my experiments grow and more data is
inserted, this is getting to take too long. Inserting 70000 rows takes
just over an hour; inserting 5000 takes about a minute and a half.

I don't know if this is the behaviour to be expected, so that is why I posted
the question, and that is my problem. I also wanted to know what can be
done to improve this, if it can be.

Regards,
Slavisa
 
 -- 
 Alvaro Herrera (alvherre[a]dcc.uchile.cl)
 Coge la flor que hoy nace alegre, ufana. ¿Quién sabe si nacera otra mañana?
 


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Slavisa Garic
Hi Dann

Here is the schema; also, could you be more specific on the COPY
command? Also, does talking directly to the API speed things up? (I am new to
databases but I am learning quickly)

-- NimrodEnfJob --
 
create table NimrodEnfJob(
exp_id  INTEGER not null references NimrodEnfExperiment,
task_id INTEGER not null references NimrodTask,
pgroup_id   INTEGER not null references NimrodParameterGroup,
agent_idINTEGER references NimrodAgent on delete set null,
jobname varchar(255) not null,
admin   char(1) not null default 'F'
check (admin in ('F', 'T')),
tasknamevarchar(255) not null,
param_text  TEXT not null,
openchar(1) not null default 'F'
check (open in ('F', 'T')),
control varchar(8) not null default 'start'
check (control in ('nostart', 'start', 'stop')),
status  varchar(16) not null default 'ready'
check (status in ('ready', 'executing', 'failed',
'done')),
cpulength   real not null,
sleeptime   real not null,
filesizereal not null,
cputime   real,
waittimereal,
filetimereal,
filebytes   integer,
priorityinteger not null default 100,
create_time timestamp not null default CURRENT_TIMESTAMP,
start_time timestamp,
finish_time timestamp,
budget  real not null default 0.0,
servername  varchar(255),
error_info  varchar(255) not null default '',
more_info   TEXT not null default '',
primary key (exp_id, jobname),
foreign key (exp_id, taskname) references NimrodEnfTask
);

Also these are the indexes on this table. I created them on the columns
that are most commonly accessed:
create unique index nej_idx
ON NimrodEnfJob (exp_id, pgroup_id);
 
create unique index nej_idx1
ON NimrodEnfJob (pgroup_id);
 
create index nej_idx2
ON NimrodEnfJob (status);
 
create unique index nej_idx3
ON NimrodEnfJob (status, pgroup_id);
 
create index nej_idx4
ON NimrodEnfJob (status, agent_id);
 
create index nej_idx5
ON NimrodEnfJob (agent_id);

I did notice that removing those indexes doesn't improve things by much.
Similar behaviour is observed, it just takes a bit less time to insert
(0.01 less than usual at 60000 records).

Regards,
Slavisa

On Fri, 14 Nov 2003, Dann Corbit wrote:

  -Original Message-
  From: Slavisa Garic [mailto:[EMAIL PROTECTED] 
  Sent: Thursday, November 13, 2003 11:37 PM
  To: [EMAIL PROTECTED]
  Subject: [HACKERS] INSERT extremely slow with large data sets
  
  
  Hi Everyone,
  
  This is my first post here so please tell me to go somewhere 
  else if this is the wrong place to post questions like this.
  
  I am using PostgreSQL 7.3.2 and have used earlier versions 
  (7.1.x onwards) and with all of them I noticed same problem 
  with INSERTs when there is a large data set. Just to so you 
  guys can compare time it takes to insert one row into a table 
  when there are only few rows present and when there are thousands:
  
  Rows Present    Start Time      Finish Time
  
  100             1068790804.12   1068790804.12
  1000            1068790807.87   1068790807.87
  5000            1068790839.26   1068790839.27
  10000           1068790909.24   1068790909.26
  20000           1068791172.82   1068791172.85
  30000           1068791664.06   1068791664.09
  40000           1068792369.94   1068792370.0
  50000           1068793317.53   1068793317.6
  60000           1068794369.38   1068794369.47
  
  As you can see it takes an awful lot of time for me just to 
  have those values inserted. Now to make a picture a bit 
  clearer for you this table has lots of information in there, 
  about 25 columns. Also there are few indexes that I created 
  so that the process of selecting values from there is faster 
  which by the way works fine. Selecting anything takes under 5 seconds.
  
  Any help would be greatly appreciated even pointing me in the 
  right direction where to ask this question. By the way I 
  designed the database this way as my application that uses 
  PGSQL a lot during the execution so there was a huge need for 
  fast SELECTs. Our experiments are getting larger and larger 
  every day so fast inserts would be good as well.
  
  Just to note those times above are of INSERTs only. Nothing 
  else done that would be included in those times. Machine 

Re: [HACKERS] oh dear ...

2003-11-14 Thread Marc G. Fournier


On Fri, 14 Nov 2003, Tom Lane wrote:

 I said:
  This worked in 7.3:
  regression=# select '1999-jan-08'::date;
  ERROR:  date/time field value out of range: 1999-jan-08
  HINT:  Perhaps you need a different datestyle setting.

  Setting DateStyle to YMD doesn't help, and in any case I'd think that
  this ought to be considered an unambiguous input format.

 This appears to be an oversight in the portions of the datetime code
 that we recently changed to enforce DateStyle more tightly.
 Specifically, DecodeNumber was rewritten without realizing that it was
 invoked in a special way when a textual month name appears in the input.
 DecodeDate actually makes two passes over the input, noting the textual
 month name in the first pass, and then calling DecodeNumber on only the
 numeric fields in the second pass.  This means that when DecodeNumber is
 called for the first time, the MONTH flag may already be set.  The
 rewrite mistakenly assumed that in this case we must be at the second
 field of an MM-DD-YY-order input.

 I propose the attached patch to fix the problem.  It doesn't break any
 regression tests, and it appears to fix the cases noted in its comment.

 Opinions on whether to apply this to 7.4?

based on "ought to be considered an unambiguous input format", I'd say
leave it for 7.4.1 ...


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] oh dear ...

2003-11-14 Thread Bruce Momjian
Tom Lane wrote:
 I said:
  This worked in 7.3:
  regression=# select '1999-jan-08'::date;
  ERROR:  date/time field value out of range: 1999-jan-08
  HINT:  Perhaps you need a different datestyle setting.
 
  Setting DateStyle to YMD doesn't help, and in any case I'd think that
  this ought to be considered an unambiguous input format.
 
 This appears to be an oversight in the portions of the datetime code
 that we recently changed to enforce DateStyle more tightly.
 Specifically, DecodeNumber was rewritten without realizing that it was
 invoked in a special way when a textual month name appears in the input.
 DecodeDate actually makes two passes over the input, noting the textual
 month name in the first pass, and then calling DecodeNumber on only the
 numeric fields in the second pass.  This means that when DecodeNumber is
 called for the first time, the MONTH flag may already be set.  The
 rewrite mistakenly assumed that in this case we must be at the second
 field of an MM-DD-YY-order input.
 
 I propose the attached patch to fix the problem.  It doesn't break any
 regression tests, and it appears to fix the cases noted in its comment.
 
 Opinions on whether to apply this to 7.4?

I guess the question is whether we would fix this in a minor release,
and I think the answer is yes, so we can fix it now.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] oh dear ...

2003-11-14 Thread Marc G. Fournier


On Fri, 14 Nov 2003, Bruce Momjian wrote:

 Tom Lane wrote:
  I said:
   This worked in 7.3:
   regression=# select '1999-jan-08'::date;
   ERROR:  date/time field value out of range: 1999-jan-08
   HINT:  Perhaps you need a different datestyle setting.
 
   Setting DateStyle to YMD doesn't help, and in any case I'd think that
   this ought to be considered an unambiguous input format.
 
  This appears to be an oversight in the portions of the datetime code
  that we recently changed to enforce DateStyle more tightly.
  Specifically, DecodeNumber was rewritten without realizing that it was
  invoked in a special way when a textual month name appears in the input.
  DecodeDate actually makes two passes over the input, noting the textual
  month name in the first pass, and then calling DecodeNumber on only the
  numeric fields in the second pass.  This means that when DecodeNumber is
  called for the first time, the MONTH flag may already be set.  The
  rewrite mistakenly assumed that in this case we must be at the second
  field of an MM-DD-YY-order input.
 
  I propose the attached patch to fix the problem.  It doesn't break any
  regression tests, and it appears to fix the cases noted in its comment.
 
  Opinions on whether to apply this to 7.4?

 I guess the question is whether we would fix this in a minor release,
 and I think the answer is yes, so we can fix it now.

Ah, so we attempt to fix a bug that affects what appears to be a small %
of configurations, with quick testing, and with the greater possibility of
affecting a larger % of configurations ... instead of releasing what has
been reported as stable on the large % of configurations, and fixing it
for that small % of configurations in a minor release?

Sounds to me like a decision designed to benefit the few at the risk of the
many ... when documenting the known bug for those few would be safer ...


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] oh dear ...

2003-11-14 Thread Andrew Dunstan


Marc G. Fournier wrote:

 On Fri, 14 Nov 2003, Bruce Momjian wrote:

  Tom Lane wrote:
   I said:
    This worked in 7.3:
    regression=# select '1999-jan-08'::date;
    ERROR:  date/time field value out of range: 1999-jan-08
    HINT:  Perhaps you need a different datestyle setting.

    Setting DateStyle to YMD doesn't help, and in any case I'd think that
    this ought to be considered an unambiguous input format.

   This appears to be an oversight in the portions of the datetime code
   that we recently changed to enforce DateStyle more tightly.
   Specifically, DecodeNumber was rewritten without realizing that it was
   invoked in a special way when a textual month name appears in the input.
   DecodeDate actually makes two passes over the input, noting the textual
   month name in the first pass, and then calling DecodeNumber on only the
   numeric fields in the second pass.  This means that when DecodeNumber is
   called for the first time, the MONTH flag may already be set.  The
   rewrite mistakenly assumed that in this case we must be at the second
   field of an MM-DD-YY-order input.

   I propose the attached patch to fix the problem.  It doesn't break any
   regression tests, and it appears to fix the cases noted in its comment.

   Opinions on whether to apply this to 7.4?

  I guess the question is whether we would fix this in a minor release,
  and I think the answer is yes, so we can fix it now.

 Ah, so we attempt to fix a bug that affects what appears to be a small %
 of configurations, with quick testing, and with the greater possibility of
 affecting a larger % of configurations ... instead of releasing what has
 been reported as stable on the large % of configurations, and fixing it
 for that small % of configurations in a minor release?

 Sounds to me like a decision designed to benefit the few at the risk of
 the many ... when documenting the known bug for those few would be safer ...

I'm confused. My understanding from what Tom said is that it affects all 
configurations.

cheers

andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Dann Corbit
 -Original Message-
 From: Slavisa Garic [mailto:[EMAIL PROTECTED] 
 Sent: Friday, November 14, 2003 5:12 PM
 To: Dann Corbit
 Cc: Slavisa Garic; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] INSERT extremely slow with large data sets
 
 
 Hi Dann
 
 Here is the schema and also could you just be more specific 
 on COPY command.

http://www.postgresql.org/docs/7.3/static/sql-copy.html

And

http://techdocs.postgresql.org/techdocs/usingcopy.php

May be helpful.

 ALso does talking dirrectly to API speed 
 things up ? (I am new to databases but i am learning quickly) 

Not particularly.  This is the copy command API:
http://www.postgresql.org/docs/7.3/static/libpq-copy.html

What the API can allow you to do (for instance) would be to never let
the data touch the ground.  Instead of writing to a text file or even a
binary format copy input file, you use the API to take the incoming data
and insert it directly.

Like everything else, there is a dark side.  Read the documents and they
will explain it.  But if you need to move a giant pile of data into the
database as fast as possible, it is the copy command that is the most
efficient.
 
   -- NimrodEnfJob --
  
   create table NimrodEnfJob(
   exp_id  INTEGER not null references NimrodEnfExperiment,
   task_id INTEGER not null references NimrodTask,
   pgroup_id   INTEGER not null references 
 NimrodParameterGroup,
   agent_idINTEGER references NimrodAgent on 
 delete set null,
   jobname varchar(255) not null,
   admin   char(1) not null default 'F'
   check (admin in ('F', 'T')),
   tasknamevarchar(255) not null,
   param_text  TEXT not null,
   openchar(1) not null default 'F'
   check (open in ('F', 'T')),
   control varchar(8) not null default 'start'
   check (control in ('nostart', 'start', 'stop')),
   status  varchar(16) not null default 'ready'
   check (status in ('ready', 'executing', 'failed',
   'done')),
   cpulength   real not null,
   sleeptime   real not null,
   filesizereal not null,
   cputime   real,
   waittimereal,
   filetimereal,
   filebytes   integer,
   priorityinteger not null default 100,
   create_time timestamp not null default CURRENT_TIMESTAMP,
   start_time timestamp,
   finish_time timestamp,
   budget  real not null default 0.0,
   servername  varchar(255),
   error_info  varchar(255) not null default '',
   more_info   TEXT not null default '',
   primary key (exp_id, jobname),
   foreign key (exp_id, taskname) references NimrodEnfTask
   );
 
 Also these are the indexes on this table. I created them on 
 the columnt that are most commonly accessed:
   create unique index nej_idx
   ON NimrodEnfJob (exp_id, pgroup_id);
  
   create unique index nej_idx1
   ON NimrodEnfJob (pgroup_id);
  
   create index nej_idx2
   ON NimrodEnfJob (status);
  
   create unique index nej_idx3
   ON NimrodEnfJob (status, pgroup_id);
  
   create index nej_idx4
   ON NimrodEnfJob (status, agent_id);
  
   create index nej_idx5
   ON NimrodEnfJob (agent_id);
 
 I did notice that removing those indexes doesn't improve things by 
 much. Similar behaviour is observed, it just takes a bit 
 less time to insert (0.01 less than usual at 60000 records) 

I am quite surprised that removing the indexes does not have a large
impact on insert speed, especially, since you have 6 of them.  Most of
the other costs that I can think of are fixed for inserts into a bare
table.  Perhaps someone with more intimate knowledge of the inner
working may know why inserts into a table without any index will trail
off in speed as the table grows.


[snip]

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])
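To make the bulk-load path Dann describes concrete, here is a minimal sketch: accumulate the rows, write them once as tab-separated COPY input, and load them in a single statement instead of one INSERT per row. The sample rows, the jobs.dat file name, and the /path/to placeholder are invented for illustration; the table and column names come from the NimrodEnfJob schema posted earlier in the thread.

```python
# Sketch: write rows as tab-separated COPY input instead of issuing one
# INSERT per row.  Sample rows and the jobs.dat name are made up; the
# table/column names follow the NimrodEnfJob schema in this thread.
rows = [
    # (exp_id, task_id, pgroup_id, jobname, taskname, cpulength, sleeptime, filesize)
    (1, 10, 100, "job-0001", "compute", 1.5, 0.0, 2048.0),
    (1, 10, 101, "job-0002", "compute", 2.0, 0.0, 4096.0),
]

with open("jobs.dat", "w") as f:
    for row in rows:
        # one line per row, columns separated by tabs (COPY's default)
        f.write("\t".join(str(col) for col in row) + "\n")

# Then, from psql, load everything in one statement:
#   COPY NimrodEnfJob (exp_id, task_id, pgroup_id, jobname, taskname,
#                      cpulength, sleeptime, filesize)
#   FROM '/path/to/jobs.dat';
```

Note that every NOT NULL column without a default (param_text, for example) would also need to appear in the file and in the column list for the load to succeed.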


Re: [HACKERS] oh dear ...

2003-11-14 Thread Marc G. Fournier


On Fri, 14 Nov 2003, Andrew Dunstan wrote:

 I'm confused. My understanding from what Tom said is that it affects all
 configurations.

the stats collector problem, from what I've seen through this list,
affects Solaris, and only some Solaris configurations ..


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] oh dear ...

2003-11-14 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 On Fri, 14 Nov 2003, Bruce Momjian wrote:
 I guess the question is whether we would fix this in a minor release,
 and I think the answer is yes, so we can fix it now.

 Ah, so we attempt to fix a bug that affects what appears to be a small %
 of configurations with quick testing and with the greater possibility of
 affecting a larger % of configurations ... instead of releasing what we
 has been reported as being stable on the large % of configurations, and
 fixing it for that small % of configuratiosn in a minor release?

Huh?  The pgstat bug is a platform dependency, sure, but this datetime
bug is not platform-specific.  I don't see that there's much commonality
in the criteria for whether to patch them.

My vote is to patch both --- I don't like shipping releases with known
bugs in them, when such bugs would have been patched with no discussion
just a week earlier.  For sure we should triple-check the proposed
patches, but once that's done I don't see a reason to hold off.

The pgstat patch has already been checked to my satisfaction, but the
datetime patch needs more eyeballs on it; anyone out there have time to
look at it?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [CORE] 7.4RC2 regression failur and not running stats

2003-11-14 Thread Christopher Kings-Lynne
Check that you don't need to use the -p option at all.

Also, make sure you remove any ^M (DOS CR) characters from the line 
endings.  That always happens to me if I receive the email on a windows 
machine and save the attachment, windows sometimes likes to rewrite all 
the line endings, causing the problem below.

Chris

Glenn Wiorek wrote:

Hmm, I know it's been a while since I used patch but I seem to be having
problems applying it.  Perhaps my patch is outdated??

patch -b pgstat.c < patchfile
Looks like a new-style context diff.
Hunk #1 failed at line 203.
Hunk #2 failed at line 210.
Hunk #3 failed at line 284.
3 out of 3 hunks failed: saving rejects to pgstat.c.rej


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] oh dear ...

2003-11-14 Thread Joe Conway
Marc G. Fournier wrote:
On Fri, 14 Nov 2003, Andrew Dunstan wrote:
I'm confused. My understanding from what Tom said is that it affects all
configurations.
the stats collector problem, from what I've seen through this list,
affects Solaris, and only some Solaris configuration ..
But the issue at hand is this one:

Tom Lane wrote:
 This worked in 7.3:

 regression=# select '1999-jan-08'::date;
 ERROR:  date/time field value out of range: 1999-jan-08
 HINT:  Perhaps you need a different datestyle setting.
Seems like this would affect everyone who uses this style of date in 
their app. If it isn't a must fix for 7.4, we should plan 7.4.1 for a 
fairly quick follow up.

Joe

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] oh dear ...

2003-11-14 Thread Christopher Kings-Lynne
I propose the attached patch to fix the problem.  It doesn't break any
regression tests, and it appears to fix the cases noted in its comment.
Opinions on whether to apply this to 7.4?
I think it should be fixed, since it could cause applications to break. 
 Shouldn't you also add a regression test to catch this in the future?

Chris



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] oh dear ...

2003-11-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Shouldn't you also add a regression test to catch this in the future?

Yes, I absolutely plan to stick some regression test additions into HEAD.
There's not a need for such changes in the 7.4 branch though.  Right at
the moment what we need is a decision about whether to apply the code
change to 7.4 release ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] oh dear ...

2003-11-14 Thread Joe Conway
Tom Lane wrote:
The pgstat patch has already been checked to my satisfaction, but the
datetime patch needs more eyeballs on it; anyone out there have time to
look at it?
FWIW, it looks good to me, seems to work as intended, and passes all 
existing regression tests.

Joe

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] oh dear ...

2003-11-14 Thread Marc G. Fournier


On Fri, 14 Nov 2003, Tom Lane wrote:

 Christopher Kings-Lynne [EMAIL PROTECTED] writes:
  Shouldn't you also add a regression test to catch this in the future?

 Yes, I absolutely plan to stick some regression test additions into HEAD.
 There's not a need for such changes in the 7.4 branch though.  Right at
 the moment what we need is a decision about whether to apply the code
 change to 7.4 release ...

Go for it ...


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] oh dear ...

2003-11-14 Thread Marc G. Fournier

My bad, confused two different issues in one thread :(

On Fri, 14 Nov 2003, Joe Conway wrote:

 Marc G. Fournier wrote:
 
  On Fri, 14 Nov 2003, Andrew Dunstan wrote:
 I'm confused. My understanding from what Tom said is that it affects all
 configurations.
 
  the stats collector problem, from what I've seen through this list,
  affects Solaris, and only some Solaris configuration ..
 

 But the issue at hand is this one:

 Tom Lane wrote:
   This worked in 7.3:
  
   regression=# select '1999-jan-08'::date;
   ERROR:  date/time field value out of range: 1999-jan-08
   HINT:  Perhaps you need a different datestyle setting.

 Seems like this would affect everyone who uses this style of date in
 their app. If it isn't a must fix for 7.4, we should plan 7.4.1 for a
 fairly quick follow up.

 Joe



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] INSERT extremely slow with large data sets

2003-11-14 Thread Tom Lane
Slavisa Garic [EMAIL PROTECTED] writes:
 You didn't understand the question. Inserting ONE ROW when there are already
 5000 ROWS present takes 0.01 seconds. Inserting ONE ROW when there are
 already 60000 ROWS present takes 0.09 seconds.

The numbers you presented didn't really offer any strong grounds for
believing that there's an O(N) growth rate --- as far as I can see your
results are only barely out of the measurement-noise category.  Can you
run some tests where the issue is not in the least significant digit
of the available numbers?

But assuming for the moment that you've got hold of a real problem...

The actual insertion of a row should be essentially a constant-time
operation, since we just stick it into the last page of the table
(or any page with sufficient free space).  Insertion of index entries
for the row would have cost that depends on the number of existing
table entries, but for btree indexes I'd expect the cost to vary as
O(log2(N)) not O(N).  I do not think you've presented enough evidence
to prove that you're seeing linear rather than log-N cost growth.

Most of the serious insertion-cost problems we've seen lately have
to do with the costs of checking foreign key references ... but those
normally vary with the size of the referenced table, not the table into
which you're inserting.  Besides which you mentioned nothing about
foreign keys ... or any other schema details as far as I saw ...

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match
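As a back-of-the-envelope check on Tom's point, the timestamps posted earlier can be tabulated against both growth models. This is not from the thread, just arithmetic on the numbers already quoted; with only 0.01-second resolution the growth factors are rough at best.

```python
import math

# finish - start for a single INSERT at each table size, copied from the
# timing table posted upthread
timings = {
    5000:  1068790839.27 - 1068790839.26,
    10000: 1068790909.26 - 1068790909.24,
    20000: 1068791172.85 - 1068791172.82,
    30000: 1068791664.09 - 1068791664.06,
    40000: 1068792370.00 - 1068792369.94,
    50000: 1068793317.60 - 1068793317.53,
    60000: 1068794369.47 - 1068794369.38,
}

base_n = 5000
base_t = timings[base_n]
print(f"{'rows':>6} {'secs':>5} {'observed x':>10} {'O(N) x':>7} {'O(logN) x':>9}")
for n in sorted(timings):
    t = timings[n]
    # observed growth factor vs. what each cost model would predict
    print(f"{n:>6} {t:5.2f} {t / base_t:10.1f} "
          f"{n / base_n:7.1f} {math.log2(n) / math.log2(base_n):9.2f}")
```

At 60000 rows the observed factor is about 9x the 5000-row baseline, versus 12x predicted by O(N) and about 1.3x by O(log N); but with deltas this close to the clock resolution the comparison is weak, which is exactly the point about getting the issue out of the least significant digit.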


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Robert Treat
On Friday 14 November 2003 14:23, Neil Conway wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  Robert Treat wrote:
  people would always want to have those choices (especially for doing
  development/testing/benchmarking between the different methods) the
  question is is it worth the effort to give people those options?

 To me, the question is whether it's worth the additional complexity
 for users and administrators, and to a lesser extent the code
 complexity. (I think the answer is no)

i don't think the complexity for users is that high... depending on 
implementation.  assuming we end up with clearly definable cases where one is 
more useful than the other. (which seems just as likely not to happen as to 
happen)

  And in the case of the cache strategy, the point is that different
  access patterns might be served better by different strategies.

 Granted -- but IMHO it would be better to concentrate on making sure
 that ARC adapts to any access pattern so that the set of access
 patterns where you _really want_ LRU is a small as possible, if not
 empty.


but how do you test this if you can't run them both against each other to 
compare?  (initially running vs 7.4 does tell you something, but even now, 
7.5's cross-datatype index improvements could skew the results of any 
comparisons)

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] cvs head? initdb?

2003-11-14 Thread Bruce Momjian

Our philosophy has never been to give people configuration options just
in case they might be valuable to them.  If we did that, we would be
like Oracle.

We give config options only if we can't decide the best default.  For
testing, you can have an #ifdef and we can test it ourselves.  If we can
find the best default, no need to burden the user with it.

---

Robert Treat wrote:
 On Friday 14 November 2003 14:23, Neil Conway wrote:
  Jan Wieck [EMAIL PROTECTED] writes:
   Robert Treat wrote:
   people would always want to have those choices (especially for doing
   development/testing/benchmarking between the different methods) the
   question is is it worth the effort to give people those options?
 
  To me, the question is whether it's worth the additional complexity
  for users and administrators, and to a lesser extent the code
  complexity. (I think the answer is no)
 
 i don't think the complexity for users is that high... depending on 
 implementation.  assuming we end up with clearly definable cases where one is 
 more usefull than the other. (which seems just as likely not to happen as to 
 happen)
 
   And in the case of the cache strategy, the point is that different
   access patterns might be served better by different strategies.
 
  Granted -- but IMHO it would be better to concentrate on making sure
  that ARC adapts to any access pattern so that the set of access
  patterns where you _really want_ LRU is a small as possible, if not
  empty.
 
 
 but how do you test this if you cant run them both against each other to 
 compare?  (initally running vs 7.4 does tell you something, but even now, 7.5 
 improved cross datatype index improvments could skew the results of any 
 comparisons)
 
 Robert Treat
 -- 
 Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Andrew Dunstan
darnit!

patch attached.

(Thinks - do we need to worry about suid sgid and sticky bits on data dir?)

andrew

Tom Lane wrote:

Joe Conway [EMAIL PROTECTED] writes:
 

I just noticed tonight that the new initdb introduced a subtle change of 
behavior. I use a shell script to automate the process of
- rm old data directory
- mkdir new data directory
- initdb
- load from pgdumpall
Now, that second step is not needed, but as of today it produces an 
installation that won't start due to improper permissions on data
   

That's a bug --- evidently the fix permissions path of control is
wrong; can you take a look?
 

? .deps
? initdb
Index: initdb.c
===
RCS file: /projects/cvsroot/pgsql-server/src/bin/initdb/initdb.c,v
retrieving revision 1.7
diff -c -w -r1.7 initdb.c
*** initdb.c13 Nov 2003 23:46:31 -  1.7
--- initdb.c14 Nov 2003 06:47:50 -
***
*** 2345,2350 
--- 2345,2359 
  
made_new_pgdata = true;
}
+   else
+   {
+   printf("fixing permissions on existing directory %s... ", pg_data);
+   fflush(stdout);
+   if (!chmod(pg_data,0700))
+   exit_nicely();
+   else
+   check_ok();
+   }
  
/* Create required subdirectories */
  

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 darnit!
 patch attached.

Applied with correction (you got the return-value check backwards)
and further work to deal reasonably with error conditions occurring
in check_data_dir.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] heads up -- subtle change of behavior of new initdb

2003-11-14 Thread Bruce Momjian

Patch applied.  Thanks.

---


Andrew Dunstan wrote:
 
 darnit!
 
 patch attached.
 
 (Thinks - do we need to worry about suid sgid and sticky bits on data dir?)
 
 andrew
 
 Tom Lane wrote:
 
 Joe Conway [EMAIL PROTECTED] writes:
   
 
 I just noticed tonight that the new initdb introduced a subtle change of 
 behavior. I use a shell script to automate the process of
 - rm old data directory
 - mkdir new data directory
 - initdb
 - load from pgdumpall
 Now, that second step is not needed, but as of today it produces an 
 installation that won't start due to improper permissions on data
 
 
 
 That's a bug --- evidently the fix permissions path of control is
 wrong; can you take a look?
 
 
   
 

 ? .deps
 ? initdb
 Index: initdb.c
 ===
 RCS file: /projects/cvsroot/pgsql-server/src/bin/initdb/initdb.c,v
 retrieving revision 1.7
 diff -c -w -r1.7 initdb.c
 *** initdb.c  13 Nov 2003 23:46:31 -  1.7
 --- initdb.c  14 Nov 2003 06:47:50 -
 ***
 *** 2345,2350 
 --- 2345,2359 
   
   made_new_pgdata = true;
   }
 + else
 + {
 + printf("fixing permissions on existing directory %s... ", pg_data);
 + fflush(stdout);
 + if (!chmod(pg_data,0700))
 + exit_nicely();
 + else
 + check_ok();
 + }
   
   /* Create required subdirectories */
   

 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] heads up -- subtle change of behavior of new

2003-11-14 Thread Bruce Momjian
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  darnit!
  patch attached.
 
 Applied with correction (you got the return-value check backwards)
 and further work to deal reasonably with error conditions occurring
 in check_data_dir.

Tom applied it before I could.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] heads up -- subtle change of behavior of

2003-11-14 Thread Andrew Dunstan
Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
 

darnit!
patch attached.
   

Applied with correction (you got the return-value check backwards)
and further work to deal reasonably with error conditions occurring
in check_data_dir.
darnit again.

I'm taking a break - my head is swimming with Java, JavaScript, Perl, 
HTML and XML/XSL from my real (i.e. paying) work, and context switching 
is causing massive mental thrashing.

cheers

andrew

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster