date:20010505

[HACKERS] GiST indexing problems...

2001-05-05 Thread David McWherter



 I've been doing some research work using the GiST indexes,
but I persistently develop a problem where the system doesn't 
make use of the indexes during the execution of a query.  If
I use the examples provided here:

  http://wit.mcs.anl.gov/~selkovjr/pg_extensions/

For instance, and I place an elog( DEBUG, functionname )
in each of the GiST accessor functions, I can witness when
the database is making use of the index.  During the construction
of the index, I never have a problem, although during query
execution, it seems that my indices aren't getting used at
all, and the database is simply searching through all of
the entries in the database.

This is a terribly frustrating problem that I encountered
once before, but which mysteriously went away after fiddling
with the problem for a while.  This time, the problem isn't
going away, however.  When I trace through the postgres 
application I can see that it at least examines the opclass
for my specialized data types, and detects that there exists
an index that could be used, but it seems to decide not to
make use of it regardless.

Is there an easy way that I can force the use of an index
during a query?

-David

--[=]
David T. McWherter[EMAIL PROTECTED]

   vdiff
=
 /vee'dif/ v.,n.  Visual diff.  The operation offinding
differences between two files by {eyeball search}.  Theterm
`optical diff' has also been reported, and is sometimes more
specifically used for the act of superimposing two nearly identical
printouts on one another and holding them up to a light to spot
differences.  Though this method is poor for detecting omissions in
the `rear' file, it can also be used with printouts of graphics, a
claim few if any diff programs can make.  See {diff}.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[HACKERS] elog(LOG), elog(DEBUG)

2001-05-05 Thread Peter Eisentraut


There's a TODO item to make elog(LOG) a separate level.  I propose the
name INFO.  It would be identical to DEBUG in effect, only with a
different label.  Additionally, all DEBUG logging should either be
disabled unless the debug_level is greater than zero, or alternatively
some elog(DEBUG) calls should be converted to INFO conditional on a
configuration setting (like log_pid, for example).

The stricter distinction between DEBUG and INFO would also yield the
possibility of optionally sending DEBUG output to the frontend, as has
been requested a few times.

-- 
Peter Eisentraut   [EMAIL PROTECTED]   http://funkturm.homeip.net/~peter


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread Oleg Bartunov


David,

could you provide more info (scheme, query, postgresql version)

Regards,

Oleg
On Sat, 5 May 2001, David McWherter wrote:


  I've been doing some research work using the GiST indexes,
 but I persistently develop a problem where the system doesn't
 make use of the indexes during the execution of a query.  If
 I use the examples provided here:

   http://wit.mcs.anl.gov/~selkovjr/pg_extensions/

 For instance, and I place an elog( DEBUG, functionname )
 in each of the GiST accessor functions, I can witness when
 the database is making use of the index.  During the construction
 of the index, I never have a problem, although during query
 execution, it seems that my indices aren't getting used at
 all, and the database is simply searching through all of
 the entries in the database.

 This is a terribly frustrating problem that I encountered
 once before, but which mysteriously went away after fiddling
 with the problem for a while.  This time, the problem isn't
 going away, however.  When I trace through the postgres
 application I can see that it at least examines the opclass
 for my specialized data types, and detects that there exists
 an index that could be used, but it seems to decide not to
 make use of it regardless.

 Is there an easy way that I can force the use of an index
 during a query?

 -David

 --[=]
 David T. McWherter[EMAIL PROTECTED]

vdiff
 =
  /vee'dif/ v.,n.  Visual diff.  The operation offinding
 differences between two files by {eyeball search}.  Theterm
 `optical diff' has also been reported, and is sometimes more
 specifically used for the act of superimposing two nearly identical
 printouts on one another and holding them up to a light to spot
 differences.  Though this method is poor for detecting omissions in
 the `rear' file, it can also be used with printouts of graphics, a
 claim few if any diff programs can make.  See {diff}.

 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread David McWherter



Sure.  My postgresql version is 7.0.2. 

My database has a datatype called graph that looks like this:

CREATE TYPE graph (
internallength = VARIABLE,
input = graph_in,
output = graph_out
);
CREATE OPERATOR ~ ( 
leftarg = graph,
rightarg = graph,
procedure = graph_distance,
commutator = ~
);

And it has a datatype 'graphrange':

CREATE FUNCTION graph_inrange(graph, graphrange)
RETURNS bool
AS '/usr/remote/home_u/udmcwher/myprojs/pg_graph.2/graph.so'
language 'c';

CREATE TYPE graphrange (
internallength = VARIABLE,
input = graphrange_in,
output = graphrange_out
);
CREATE OPERATOR  (
leftarg = graph,
rightarg = graphrange,
procedure = graph_inrange
);

I have a bunch of GiST operators that are created like this:
  CREATE FUNCTION gist_graph_consistent(opaque,graphrange) 
RETURNS bool
AS '/usr/remote/home_u/udmcwher/myprojs/pg_graph.2/graph.so'
language 'c';
  /* the same for gist_graph_{compress,decompress,penalty,picksplit,union,same} */
   


I've tried adding the parameters 'restrict = eqsel' and 'join = eqjoinsel'
to the datatype operators, but that doesn't seem to change anything.


I construct a new opclass like this:

INSERT INTO pg_opclass (opcname,opcdeftype)
values ( 'gist_graphrange_ops' );

SELECT o.oid AS opoid, o.oprname
INTO TABLE graph_ops_tmp
FROM pg_operator o, pg_type t
WHERE o.oprleft = t.oid 
   and t.typname = 'graph';
INSERT INTO pg_amop (amopid, amopclaid, amopopr, amopstrategy)
SELECT am.oid, opcl.oid, c.opoid, 1
FROM pg_am am, pg_opclass opcl, graph_ops_tmp c
WHERE amname = 'gist' and opcname = 'gist_graphrange_ops'
and c.oprname = '';



INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 1
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_consistent';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 2
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_union';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 3
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_compress';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 4
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_decompress';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 5
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_penalty';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 6
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graph_picksplit';

INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
   SELECT am.oid, opcl.oid, pro.oid, 7
   FROM pg_am am, pg_opclass opcl, pg_proc pro
   WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
  and proname = 'gist_graphrange_same';

I construct a table like this:
  
CREATE TABLE repos ( a graph, file varchar(512) );
INSERT INTO repos VALUES ( import_graphfile('/tmp/test1'), '/tmp/test1' );
INSERT INTO repos VALUES ( import_graphfile('/tmp/test2'), '/tmp/test2' );

What this does is a little bit weird, it reads in the test1 and test2 datafiles
into the database, storing them as large objects.  Then, it constructs
graph objects which have their oid's, and returns them from import_graphfile.

I then try to construct an index like this:

CREATE INDEX repos_index ON repos 
 USING gist ( a gist_graphrange_ops ) ;

I've also tried a:graph and a:graphrange, but I don't think it changes anything.

My queries look like:

   SELECT * from repos where a  '(oid-num,int-num)'::graphrange;

The function operator returns a boolean if a particular relation holds between
the graph object and the graphrange object.

The GiST compress operator will convert leaf GRAPH keys into 
graphrange keys for internal use.  Each of my GiST operators
call elog( DEBUG, function-name ) as they're called.  When
constructing the index, compress,decompress,picksplit,union
are called as expected.  During the execution of the query,
however, nothing happens.

I've found the same exact results using the 'pggist' examples
(a suite including intproc,boxproc,polyproc,textproc), 
and the examples found here: 
http://wit.mcs.anl.gov/~selkovjr/pg_extensions/contrib-7.0.tgz.
The

[HACKERS] Lisp as procedural language

2001-05-05 Thread Vladimir V. Zolotych


Hello

I see the following 

proba= select * from pg_language;
lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
+---++-+--
internal|f  |f   |0|n/a   
lisp|f  |f   |0|/usr/ucb/liszt
C   |f  |f   |0|/bin/cc   
sql |f  |f   |0|postgres  
plpgsql |t  |t   | 56702850|PL/pgSQL  

Would you mind to tell me is it possible to use Lisp
as procedural language ? Which Lisp (e.g Emacs-list,
Common Lisp, etc.). If it is possible could you give
me hints how I can do that ?

I'm using PosgtreSQL 7.0, Slackware 7.0, also I have
Common Lisp (CMUCL 18c) installed.

-- 
Vladimir Zolotych [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread Oleg Bartunov


David,

GiST prior 7.1 was broken in several respects. Please,
try 7.1 and examples from contrib/intarray. It should works.
btw, you'll have compress function actually works.

Regards,

Oleg
On Sat, 5 May 2001, David McWherter wrote:


 Sure.  My postgresql version is 7.0.2.

 My database has a datatype called graph that looks like this:

 CREATE TYPE graph (
 internallength = VARIABLE,
 input = graph_in,
 output = graph_out
 );
 CREATE OPERATOR ~ (
 leftarg = graph,
 rightarg = graph,
 procedure = graph_distance,
 commutator = ~
 );

 And it has a datatype 'graphrange':

 CREATE FUNCTION graph_inrange(graph, graphrange)
 RETURNS bool
 AS '/usr/remote/home_u/udmcwher/myprojs/pg_graph.2/graph.so'
 language 'c';

 CREATE TYPE graphrange (
 internallength = VARIABLE,
 input = graphrange_in,
 output = graphrange_out
 );
 CREATE OPERATOR  (
 leftarg = graph,
 rightarg = graphrange,
 procedure = graph_inrange
 );

 I have a bunch of GiST operators that are created like this:
   CREATE FUNCTION gist_graph_consistent(opaque,graphrange)
 RETURNS bool
 AS '/usr/remote/home_u/udmcwher/myprojs/pg_graph.2/graph.so'
 language 'c';
   /* the same for gist_graph_{compress,decompress,penalty,picksplit,union,same} */



 I've tried adding the parameters 'restrict = eqsel' and 'join = eqjoinsel'
 to the datatype operators, but that doesn't seem to change anything.


 I construct a new opclass like this:

 INSERT INTO pg_opclass (opcname,opcdeftype)
 values ( 'gist_graphrange_ops' );

 SELECT o.oid AS opoid, o.oprname
 INTO TABLE graph_ops_tmp
 FROM pg_operator o, pg_type t
 WHERE o.oprleft = t.oid
and t.typname = 'graph';
 INSERT INTO pg_amop (amopid, amopclaid, amopopr, amopstrategy)
 SELECT am.oid, opcl.oid, c.opoid, 1
 FROM pg_am am, pg_opclass opcl, graph_ops_tmp c
 WHERE amname = 'gist' and opcname = 'gist_graphrange_ops'
 and c.oprname = '';



 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 1
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_consistent';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 2
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_union';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 3
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_compress';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 4
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_decompress';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 5
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_penalty';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 6
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graph_picksplit';

 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
SELECT am.oid, opcl.oid, pro.oid, 7
FROM pg_am am, pg_opclass opcl, pg_proc pro
WHERE  amname = 'gist' and opcname = 'gist_graphrange_ops'
   and proname = 'gist_graphrange_same';

 I construct a table like this:

 CREATE TABLE repos ( a graph, file varchar(512) );
 INSERT INTO repos VALUES ( import_graphfile('/tmp/test1'), '/tmp/test1' );
 INSERT INTO repos VALUES ( import_graphfile('/tmp/test2'), '/tmp/test2' );

 What this does is a little bit weird, it reads in the test1 and test2 datafiles
 into the database, storing them as large objects.  Then, it constructs
 graph objects which have their oid's, and returns them from import_graphfile.

 I then try to construct an index like this:

 CREATE INDEX repos_index ON repos
  USING gist ( a gist_graphrange_ops ) ;

 I've also tried a:graph and a:graphrange, but I don't think it changes anything.

 My queries look like:

SELECT * from repos where a  '(oid-num,int-num)'::graphrange;

 The function operator returns a boolean if a particular relation holds between
 the graph object and the graphrange object.

 The GiST compress operator will convert leaf GRAPH keys into
 graphrange keys for internal use.  Each of my GiST operators
 call elog( DEBUG, function-name ) as they're called.  When
 constructing the

Re: [HACKERS] elog(LOG), elog(DEBUG)

2001-05-05 Thread Tom Lane


Peter Eisentraut [EMAIL PROTECTED] writes:
 There's a TODO item to make elog(LOG) a separate level.  I propose the
 name INFO.  It would be identical to DEBUG in effect, only with a
 different label.

This conveys nothing to my mind.  How should I determine whether a given
elog call ought to use INFO or DEBUG?

 The stricter distinction between DEBUG and INFO would also yield the
 possibility of optionally sending DEBUG output to the frontend, as has
 been requested a few times.

It's not a strict distinction unless we have a clear policy as to what
the different levels mean.  I think setting and documenting that policy
is the hard part of the task.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread Tom Lane


David McWherter [EMAIL PROTECTED] writes:
 I've tried adding the parameters 'restrict = eqsel' and 'join = eqjoinsel'
 to the datatype operators, but that doesn't seem to change anything.

You might have better luck if you use area-related selectivity
estimators.  Your problem seems to be that the optimizer doesn't
think the index is worth using, and the cause almost certainly is
overly pessimistic selectivity estimates for the indexable operators.
areasel and friends are completely bogus, but at least they deliver
small enough numbers to encourage use of the index ;-)

As Oleg says, the GiST support in 7.0.* is in pretty poor shape
(it had been suffering from neglect for a long time).  Try 7.1.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread David McWherter



So, I've migrated my code to do the TOAST'ing thing required of 7.1
clients, and I've updated my operator to use the areaselectors:
 CREATE OPERATOR = (
 leftarg = graph,
 rightarg = graphrange,
 procedure = graph_inrange,
 commutator = '=',
 restrict = areasel,
 join = areajoinsel
 );

But I still get the issue that my queries don't seem to trigger the
GiST indexes to be used.  Perhaps the problem is that the system
just thinks that the query doesn't need an index to increase 
performance, i've only got about a dozen elements in the database
right now for testing purposes.

-David

Tom Lane writes:
  David McWherter [EMAIL PROTECTED] writes:
   I've tried adding the parameters 'restrict = eqsel' and 'join = eqjoinsel'
   to the datatype operators, but that doesn't seem to change anything.
  
  You might have better luck if you use area-related selectivity
  estimators.  Your problem seems to be that the optimizer doesn't
  think the index is worth using, and the cause almost certainly is
  overly pessimistic selectivity estimates for the indexable operators.
  areasel and friends are completely bogus, but at least they deliver
  small enough numbers to encourage use of the index ;-)
  
  As Oleg says, the GiST support in 7.0.* is in pretty poor shape
  (it had been suffering from neglect for a long time).  Try 7.1.
  
   regards, tom lane

--[=]
David T. McWherter[EMAIL PROTECTED]

The truth is rarely pure, and never simple.
-- Oscar Wilde

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

[HACKERS] typo in psql's help

2001-05-05 Thread Vince Vielhaber



postgresql= \h create table
Command: CREATE TABLE
Description: Creates a new table
Syntax:
CREATE [ TEMPORARY | TEMP ] TABLE table (
column type
[ NULL | NOT NULL ] [ UNIQUE ] [ DEFAULT value ]
[column_constraint_clause | PRIMARY KEY } [ ... ] ]
^
   This should be a ]   |

Vince.
-- 
==
Vince Vielhaber -- KA8CSHemail: [EMAIL PROTECTED]http://www.pop4.net
 56K Nationwide Dialup from $16.00/mo at Pop4 Networking
Online Campground Directoryhttp://www.camping-usa.com
   Online Giftshop Superstorehttp://www.cloudninegifts.com
==




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Re: [HACKERS] GiST indexing problems...

2001-05-05 Thread David McWherter



Beautiful!  That fixed my problem.  One thing that might be
useful is to update the Index method-extension documentation
on the web site to reflect this problem a bit...if somebody
just wants to get a working index, it can be a bit misleading.
I'll probably go and see if I can construct a few words on
it after my current workload subsides a bit.

-David

Tom Lane writes:
  David McWherter [EMAIL PROTECTED] writes:
   But I still get the issue that my queries don't seem to trigger the
   GiST indexes to be used.  Perhaps the problem is that the system
   just thinks that the query doesn't need an index to increase 
   performance, i've only got about a dozen elements in the database
   right now for testing purposes.
  
  Ah, so.  You're right, you need more data.
  
  You could try
   SET ENABLE_SEQSCAN TO OFF
  if you just want to force use of the index for testing purposes.
  
   regards, tom lane

--[=]
David T. McWherter[EMAIL PROTECTED]

Never pay a compliment as if expecting a receipt.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

[HACKERS] Re: New Linux xfs/reiser file systems

2001-05-05 Thread Lincoln Yeoh


At 02:09 AM 5/4/01 -0500, Thomas Swan wrote:
 I think it's worth noting that Oracle has been petitioning the
 kernel developers for better raw device support: in other words,
 the ability to write directly to the hard disk and bypassing the
 filesystem all together.   

But there could be other reasons why Oracle would want to do raw stuff.

1) They have more things to sell - management modules/software. More
training courses. Certified blahblahblah. More features in brochure.
2) It just helps make things more proprietary. Think lock in.

All that for maybe 10% performance increase?

I think it's more advantageous for Postgresql to keep the filesystem layer
of abstraction, than to do away with it, and later reinvent certain parts
of it along with new bugs.

What would be useful is if one can specify where the tables, indexes, WAL
and other files go. That feature would probably help improve performance
far more. 

For example: you could then stick the WAL on a battery backed up RAM disk.
How much total space does a WAL log need?

A battery backed RAM disk might even be cheaper than Brand X RDBMS
Proprietary Feature #5.

Cheerio,
Link.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[HACKERS] File system performance and pg_xlog

2001-05-05 Thread mlw


A small debate started with bad performance on ReiserFS. I pondered the likely
advantages to raw device access. It also occured to me that the FAT file system
is about as close to a managed raw device as one could get. So I did some
tests:
The hardware:

A PII system running Linux 7.0, with 2.2.16-2.
256M RAM
IDE home hard disk.
Adaptec 2740 with two SCSI drives
A 9G Seagate ST19171W as /dev/sda1 mounted as /sda1
A 4G Seagate ST15150W as /dev/sdb1 mounted as /sdb1
/sda1 has a ext2 file system, and is used as base with a symlink.
/sdb1 is either an ext2 or FAT file system used as pg_xlog with a symlink.


In a clean Postgres environment, I initialized pgbench as:
./pgbench -i -s 10 -d pgbench

I used this script to produce the results:

psql -U mohawk pgbench -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 1
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 2
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 3
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 4
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 5
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 6
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 7
psql -U mohawk pgbench  -c checkpoint; 
su mohawk -c ./pgbench -d pgbench -t 32 -c 8

(My postgres user is mohawk)

I had to modify xlog.c to use rename instead of link. And I had to explicitly
set ownership of the FAT file system to the postgres user during mount.

I ran the script twice as:

./test.sh  ext2.log

(Then rebuilt a fresh database and formatted sdb1 as fat)
./test.sh  fat.log

Here is a diff of the two runs:

--- ext2.logSat May  5 12:58:07 2001
+++ fat.log Sat May  5 12:58:07 2001
@@ -5,8 +5,8 @@
 number of clients: 1
 number of transactions per client: 32
 number of transactions actually processed: 32/32
-tps = 18.697006(including connections establishing)
-tps = 19.193225(excluding connections establishing)
+tps = 37.439512(including connections establishing)
+tps = 39.710461(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 2 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -14,8 +14,8 @@
 number of clients: 2
 number of transactions per client: 32
 number of transactions actually processed: 64/64
-tps = 32.444226(including connections establishing)
-tps = 33.499452(excluding connections establishing)
+tps = 44.782177(including connections establishing)
+tps = 46.799328(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 3 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -23,8 +23,8 @@
 number of clients: 3
 number of transactions per client: 32
 number of transactions actually processed: 96/96
-tps = 43.042861(including connections establishing)
-tps = 44.816086(excluding connections establishing)
+tps = 55.416117(including connections establishing)
+tps = 58.057013(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 4 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -32,8 +32,8 @@
 number of clients: 4
 number of transactions per client: 32
 number of transactions actually processed: 128/128
-tps = 46.033959(including connections establishing)
-tps = 47.681683(excluding connections establishing)
+tps = 61.752368(including connections establishing)
+tps = 64.796970(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 5 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -41,8 +41,8 @@
 number of clients: 5
 number of transactions per client: 32
 number of transactions actually processed: 160/160
-tps = 49.980258(including connections establishing)
-tps = 51.874653(excluding connections establishing)
+tps = 63.124090(including connections establishing)
+tps = 67.225563(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 6 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -50,8 +50,8 @@
 number of clients: 6
 number of transactions per client: 32
 number of transactions actually processed: 192/192
-tps = 51.800192(including connections establishing)
-tps = 53.752739(excluding connections establishing)
+tps = 65.452545(including connections establishing)
+tps = 68.741933(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients: 7 nxacts: 32 dbName: pgbench
 transaction type: TPC-B (sort of)
@@ -59,8 +59,8 @@
 number of clients: 7
 number of transactions per client: 32
 number of transactions actually processed: 224/224
-tps = 52.652660(including connections establishing)
-tps = 54.616802(excluding connections establishing)
+tps = 66.525419(including connections establishing)
+tps = 69.727409(excluding connections establishing)
 CHECKPOINT
 pghost: (null) pgport: (null) nclients:

[HACKERS] Re: New Linux xfs/reiser file systems

2001-05-05 Thread mlw


Lincoln Yeoh wrote:
 
 At 02:09 AM 5/4/01 -0500, Thomas Swan wrote:
  I think it's worth noting that Oracle has been petitioning the
  kernel developers for better raw device support: in other words,
  the ability to write directly to the hard disk and bypassing the
  filesystem all together.
 
 But there could be other reasons why Oracle would want to do raw stuff.
 
 1) They have more things to sell - management modules/software. More
 training courses. Certified blahblahblah. More features in brochure.
 2) It just helps make things more proprietary. Think lock in.
 
 All that for maybe 10% performance increase?
 
 I think it's more advantageous for Postgresql to keep the filesystem layer
 of abstraction, than to do away with it, and later reinvent certain parts
 of it along with new bugs.

I just did a test of putting pg_xlog on a FAT file system, and my first rough
tests (pgbench) show an approximate 20% performance increase over ext2 with
fsync enabled.


-- 
I'm not offering myself as an example; every life evolves by its own laws.

http://www.mohawksoft.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

Re: [HACKERS] New Linux xfs/reiser file systems

2001-05-05 Thread thomas graichen


Bruce Momjian [EMAIL PROTECTED] wrote:
  Yes, this double-writing is a problem.  Suppose you have your WAL on a
  separate drive.  You can fsync() WAL with zero head movement.  With a
  log based file system, you need two head movements, so you have gone
  from zero movements to two.
 
 It may be worse depending on how the filesystem actually does
 journalling.  I wonder if an fsync() may cause ALL pending
 meta-data to be updated (even metadata not related to the 
 postgresql files).
 
 Do you know if reiser or xfs have this problem?

 I don't know, but the Linux user reported xfs was really slow.

i think this should be tested in more detail: i once tried this
lightly (running pgbench against postgresql 7.1beta4) with
different filesystems: ext2, reiserfs and XFS and reproducable
i got about 15% better results running on XFS ... ok - it's
not a very big test, but i think it might be worth to really
do an a/b test before seing it as a fact that postgresql is
slow on XFS (and maybe reiserfs too ... but reiserfs has had
performance problems in certain situations anyway)

XFS is a journaling fs, but it does all it's work in a very
clever way (delayed allocation etc.) - so usually you should
under normal conditions get decent performance out of it -
otherwise it might be worth sending a mail to the XFS
mailinglist (resierfs maybe dito)

t

-- 
thomas graichen [EMAIL PROTECTED] ... perfection is reached, not
when there is no longer anything to add, but when there is no
longer anything to take away. --- antoine de saint-exupery

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl

Re: [HACKERS] File system performance and pg_xlog

2001-05-05 Thread Marko Kreen


On Sat, May 05, 2001 at 01:09:38PM -0400, mlw wrote:
 A small debate started with bad performance on ReiserFS. I pondered the likely
 advantages to raw device access. It also occured to me that the FAT file system
 is about as close to a managed raw device as one could get. So I did some
 tests:

 /sdb1 is either an ext2 or FAT file system used as pg_xlog with a symlink.

One little thought: does mounting ext2 with 'noatime' makes any
difference?  AFAIK fat does not have concept of atime, so then
it would be more fair?  Just a thought.

-- 
marko


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Lisp as procedural language

2001-05-05 Thread Marko Kreen


On Sat, May 05, 2001 at 04:54:07PM +0300, Vladimir V. Zolotych wrote:
 I see the following 
 
 proba= select * from pg_language;

 lisp|f  |f   |0|/usr/ucb/liszt

 Would you mind to tell me is it possible to use Lisp
 as procedural language ? Which Lisp (e.g Emacs-list,
 Common Lisp, etc.). If it is possible could you give
 me hints how I can do that ?

Huh?  Seems like you already have using lisp?  Ask your
sysadmin where did he got it?  And meybe you/he could
post it to PostgreSQL lists too?

Or did you simply inserted a new row into pg_language?
Well, that's not the way it works.  There needs to be a glue
layer between PostgreSQL and a language.  You should study
code in pgsql/src/pl/{plperl,tcl} for how it is
implementer for Perl and Tcl.  There is also plpgsql which
is stand-alone module.

 I'm using PosgtreSQL 7.0, Slackware 7.0, also I have
 Common Lisp (CMUCL 18c) installed.

Ok, but you need a little bit more for that...

-- 
marko


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] elog(LOG), elog(DEBUG)

2001-05-05 Thread Peter Eisentraut


Tom Lane writes:

 Peter Eisentraut [EMAIL PROTECTED] writes:
  There's a TODO item to make elog(LOG) a separate level.  I propose the
  name INFO.  It would be identical to DEBUG in effect, only with a
  different label.

 This conveys nothing to my mind.  How should I determine whether a given
 elog call ought to use INFO or DEBUG?

DEBUG is for messages intended to help locating and analyzing faults in
the source code (i.e., debugging).  Normal users don't need this during
normal operation.

INFO (or whatever the name) is for messages that administrator's might be
interested in for auditing and tuning.

Example:

elog(DEBUG, heapgettup(..., b=0x%x, nkeys=%d, key=0x%x, buffer, nkeys, key);

vs.

elog(INFO, connection: host=%s user=%s database=%s, ...);

There are maybe a dozen potential INFO messages, plus a few to be
converted fprintf's.

-- 
Peter Eisentraut   [EMAIL PROTECTED]   http://funkturm.homeip.net/~peter


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Lisp as procedural language

2001-05-05 Thread Peter Eisentraut


Vladimir V. Zolotych writes:

 I see the following

 proba= select * from pg_language;
 lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler
 +---++-+--
 internal|f  |f   |0|n/a
 lisp|f  |f   |0|/usr/ucb/liszt
[...]

This must have been an artifact from the time when part of the Postgres
system was written in Lisp.  A Lisp procedural language never actually
existed in PostgreSQL.

-- 
Peter Eisentraut   [EMAIL PROTECTED]   http://funkturm.homeip.net/~peter


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] File system performance and pg_xlog

2001-05-05 Thread mlw


Marko Kreen wrote:
 
 On Sat, May 05, 2001 at 01:09:38PM -0400, mlw wrote:
  A small debate started with bad performance on ReiserFS. I pondered the likely
  advantages to raw device access. It also occured to me that the FAT file system
  is about as close to a managed raw device as one could get. So I did some
  tests:
 
  /sdb1 is either an ext2 or FAT file system used as pg_xlog with a symlink.
 
 One little thought: does mounting ext2 with 'noatime' makes any
 difference?  AFAIK fat does not have concept of atime, so then
 it would be more fair?  Just a thought.
 
 --
 marko

I don't know, and I haven't tried that, but I suspect that it won't make much
difference. 

While I do not think that anyone would seriously consider using FAT for xlog,
I'd have problems considering myself, it in a production environment, the
numbers do say something about the nature of WAL. A bunch of files, all the
same size, is practically what FAT does best. Plus there is no real overhead.

The very reasons why FAT is a POS file system are the same reasons it would
work great for WAL, with the only caveat being that fsync is implemented, and
the application (postgres) maintains its own data integrity.

Oddly enough, I did not see any performance improvement using FAT for the
base directory. That may be the nature of the pg block size vs cluster size,
fragmentation, and stuff. If I get some time I will investigate it a bit more.

Clearly not everyone would be interested in this. PG seems to be used for
everything from a small personal db, to a system component db -- like on a web
box, to a full blown stand-alone server. The first two applications may not be
interested in this sort of stuff, but last category, the full blown server
would certainly want to squeeze as much out of their system as possible.

I think a pgfs could easily be a derivative of FAT, or even FAT with some
Ioctls.  It is simple, it is fast, it does not attempt to do things postgres
doesn't need.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] File system performance and pg_xlog

2001-05-05 Thread mlw


Marko Kreen wrote:
 
 On Sat, May 05, 2001 at 06:43:51PM -0400, mlw wrote:
  Marko Kreen wrote:
   On Sat, May 05, 2001 at 01:09:38PM -0400, mlw wrote:
A small debate started with bad performance on ReiserFS. I pondered the likely
advantages to raw device access. It also occured to me that the FAT file system
is about as close to a managed raw device as one could get. So I did some
tests:
 
  I think a pgfs could easily be a derivative of FAT, or even FAT with some
  Ioctls.  It is simple, it is fast, it does not attempt to do things postgres
  doesn't need.
 
 Well, my opinion too is that it is waste of resources to try
 implement PostgreSQL-specific filesystem.  As you already showed
 that there are noticeable differences of different filesystems,
 the Right Thing would be to make a FAQ/web-page/knowledge-base
 of comments on different filesystem in point of view of DB
 (PostgreSQL) server.
 
 Also users will have different priorities:
 reliability/speed-of-reads/speed-of-writes - I mean different
 users have them ordered differently - so it should be mentioned
 this fs is good for this but bad on this, etc...  It is good
 to put this part of db on this fs but not that part of db...
 Suggestions on mount flags to use...

I think it is simpler problem than that. Postgres, with fsync enabled, does a
lot of work trying to maintain data integrity. It is logical to conclude that a
file system that does as little as possible would almost always perform better.
Regardless of what the file system does, eventually it writes blocks of data to
sectors on a disk.

Many databases use their own data volume management. I am not suggesting that
anyone create a new file system, but after performing some tests, I am really
starting to see why products like oracle manage their own table spaces.

If one looks at the FAT file system with an open mind and a clear understanding
of how it will be used, some small modifications may make it the functional
equivalent of a managed table space volume, at least under Linux.

Some of the benchmark numbers are hovering around 20% improvement! That's
nothing to sneeze at. I have a database loader that does a select nextval(..)
followed by a begin, a series of inserts, followed by a commit.

With xlog on a FAT file system, I can get 53-60 sets per second. With Xlog
sitting on ext2, I can get 40-45 sets per second. (Of the same data) These are
not insignificant improvements, and should be examined. If not from a Postgres
development perspective, at least from a deployment perspective.

 
 There already exist bazillion filesystems, _some_ of them should
 be usable for PostgreSQL too :)

I agree.


-- 
I'm not offering myself as an example; every life evolves by its own laws.

http://www.mohawksoft.com

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Lisp as procedural language

2001-05-05 Thread Tom Lane


Peter Eisentraut [EMAIL PROTECTED] writes:
 This must have been an artifact from the time when part of the Postgres
 system was written in Lisp.  A Lisp procedural language never actually
 existed in PostgreSQL.

[ Digs in archives... ]  The pg_language entry that Vladimir refers to
was still present as late as Postgres 6.5 --- but I agree that it must
have been vestigial long before that.  Certainly, at one time large
chunks of Postgres *were* written in Lisp, and I imagine that the
pg_language entry did something useful when that was true.  But it was
dead code in Postgres 4.2 (1994), which is the oldest source I have;
there is no Lisp code remaining in 4.2.

It'd theoretically be possible to support Lisp in the same way as we
currently support Tcl, Perl, etc.  The hard part is to find a suitable
interpreter that is designed to be dynamically linked into other
applications.  Perl still hasn't got that quite right, and I imagine
it's an even more foreign idea for most Lisp systems...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

[HACKERS] v7.1.1 Branched, Packaged and Released ...

2001-05-05 Thread The Hermit Hacker



This is just a quick announcement that we have now branched off v7.1.x
from the main development tree, and are starting to dive into development
of v7.2 ...

There have been several changes since v7.1 was released, including:

Fix for numeric MODULO operator (Tom)
pg_dump fixes (Philip)
pg_dump can dump 7.0 databases (Philip)
readline 4.2 fixes (Peter E)
JOIN fixes (Tom)
AIX, MSWIN, VAX,N32K fixes (Tom)
Multibytes fixes (Tom)
Unicode fixes (Tatsuo)
Optimizer improvements (Tom)
Fix for whole tuples in functions (Tom)
Fix for pg_ctl and option strings with spaces (Peter E)
ODBC fixes (Hiroshi)
EXTRACT can now take string argument (Thomas)
Python fixes (Darcy)

With more details available in the ChangeLog file ...

This release does not require a dump/restore from v7.1, it is purely a
maintaince release ...

Any bugs please report them to [EMAIL PROTECTED] ...

RPMs and DEBs should be available soon ...

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

[HACKERS] GiST indexing problems...

[HACKERS] elog(LOG), elog(DEBUG)

Re: [HACKERS] GiST indexing problems...

Re: [HACKERS] GiST indexing problems...

[HACKERS] Lisp as procedural language

Re: [HACKERS] GiST indexing problems...

Re: [HACKERS] elog(LOG), elog(DEBUG)

Re: [HACKERS] GiST indexing problems...

Re: [HACKERS] GiST indexing problems...

[HACKERS] typo in psql's help

Re: [HACKERS] GiST indexing problems...

[HACKERS] Re: New Linux xfs/reiser file systems

[HACKERS] File system performance and pg_xlog

[HACKERS] Re: New Linux xfs/reiser file systems

Re: [HACKERS] New Linux xfs/reiser file systems

Re: [HACKERS] File system performance and pg_xlog

Re: [HACKERS] Lisp as procedural language

Re: [HACKERS] elog(LOG), elog(DEBUG)

Re: [HACKERS] Lisp as procedural language

Re: [HACKERS] File system performance and pg_xlog

Re: [HACKERS] File system performance and pg_xlog

Re: [HACKERS] Lisp as procedural language

[HACKERS] v7.1.1 Branched, Packaged and Released ...

23 matches

Site Navigation

Mail list logo

Footer information