Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Martijn van Oosterhout

I'll preface this by saying that while I have a large database, it doesn't
require quite the performace you're talking about here.

On Thu, Sep 26, 2002 at 02:05:44PM +0530, Shridhar Daithankar wrote:
 1) Database load time from flat file using copy is very high
 2) Creating index takes huge amount of time.
 3) Any suggsestions for runtime as data load and query will be going in 
 parallel.

You're loading all the data in one copy. I find that INSERTs are mostly
limited by indexes. While index lookups are cheap, they are not free and
each index needs to be updated for each row.

I fond using partial indexes to only index the rows you actually use can
help with the loading. It's a bit obscure though.

As for parallel loading, you'll be limited mostly by your I/O bandwidth.
Have you measured it to take sure it's up to speed?

 Now the details. Note that this is a test run only..
 
 Platform:- 4x Xeon2.4GHz/4GB RAM/4x48 SCSI RAID5/72 GB SCSI
 RedHat7.2/PostgreSQL7.1.3
 
 Database in flat file: 
 125,000,000 records of around 100 bytes each. 
 Flat file size 12GB
 
 Load time: 14581 sec/~8600 rows persec/~ an MB of data per sec.
 Create unique composite index on 2 char and a timestamp field:  25226 sec.
 Database size on disk: 26GB
 Select query: 1.5 sec. for approx. 150 rows.

So you're loading at a rate of 860KB per sec. That's not too fast. How many
indexes are active at that time? Triggers and foreign keys also take their
toll.

 Important postgresql.conf settings
 
 sort_mem = 12000
 shared_buffers = 24000
 fsync=true (Sad but true. Left untouched.. Will that make a difference on 
 SCSI?)
 wal_buffers = 65536 
 wal_files = 64 

fsync IIRC only affects the WAL buffers now but it may be quite expensive,
especially considering it's running on every transaction commit. Oh, your
WAL files are on a seperate disk from the data?

 Initial flat data load: 250GB of data. This has gone up since last query. It 
 was 150GB earlier..
 Ongoing inserts: 5000/sec. 
 Number of queries: 4800 queries/hour
 Query response time: 10 sec.

That looks quite acheivable.

 1)  Instead of copying from a single 12GB data file,  will a parallel copy from 
 say 5 files will speed up the things? 

Limited by I/O bandwidth. On linux vmstat can tell you how many blocks are
being loaded and stored per second. Try it. As long as sync() doesn't get
done too often, it should be help.

 Couple MB of data per sec. to disk is just not saturating it. It's a RAID 5 
 setup..

No, it's not. You should be able to do better.

 2) Sort mem.=12K i.e. 94MB, sounds good enough to me. Does this need further 
 addition to improve create index performance?

Should be fine. Admittedly your indexes are taking rather long to build.

 3) 5K concurrent inserts with an index on, will this need a additional CPU 
 power? Like deploying it on dual RISC CPUs etc? 

It shouldn't. Do you have an idea of what your CPU usage is? ps aux should
give you a decent idea.

 4) Query performance is not a problem. Though 4.8K queries per sec. expected 
 response time from each query is 10 sec. But my guess is some serius CPU power 
 will be chewed there too..

Should be fine.

 5)Will upgrading to 7.2.2/7.3 beta help?

Possibly, though it may be wirth it just for the features/bugfixes.

 All in all, in the  test, we didn't see the performance where hardware is 
 saturated to it's limits. So effectively we are not able to get postgresql 
 making use of it. Just pushing WAL and shared buffers does not seem to be the 
 solution.
 
 If you guys have any suggestions. let me know.  I need them all..

Find the bottleneck: CPU, I/O or memory?

 Mysql is almost out because it's creating index for last 17 hours. I don't 
 think it will keep up with 5K inserts per sec. with index. SAP DB is under 
 evaluation too. But postgresql is most favourite as of now because it works. So 
 I need to come up with solutions to problems that will occur in near future..
 ;-)

17 hours! Ouch. Either way, you should be able to do much better. Hope this
helps,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [GENERAL] [HACKERS] Performance while loading data and indexing

2002-09-26 Thread Martijn van Oosterhout

On Thu, Sep 26, 2002 at 03:01:35PM +0530, Shridhar Daithankar wrote:
Content-Description: Mail message body
 The index creation query was
 
 CREATE INDEX index1 ON tablename (esn,min,datetime);
 
 What if I put datetime ahead? It's likely the the datetime field will have high 
 degree of locality being log data..

The order of fields depends on what you're using it for. For example, you
can use the above index for a query using the conditions:

esn = 'aaa' 
esn = 'bbb' and min = 'xxx'

but not for queries with only

datetime = '2002-09-26'
min = 'ddd' and datetime = '2002-10-02'

The fields can only be used left to right. This is where a single
multicolumn index differs from multiple indexes of different columns.

Have you used EXPLAIN ANALYSE to determine whether your indexes are being
used optimally?

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Rép

2002-10-08 Thread Martijn van Oosterhout

On Tue, Oct 08, 2002 at 12:35:22PM +0200, Erwan DUROSELLE wrote:
 What I understood from the Administrator's guide is:
 
   - Yes, PostgreSQL provides hot backup: it's the pg_dump utility. It'h
 hot because users can still be connected and work whil pg_dump is running
 ( though they will be slowed down). ( See Administrator's guide ch9)

Correct.

  -  No, PostgreSQL does NOT provide a way to restore a database up to the
 last commited transaction, with a reapply of the WAL, as Oracle or SQL
 Server ( and others, I guess) do. That would be a VERY good feature. See
 Administrator's guide ch11

Umm, I thought the whole point of WAL was that if the database crashed, the
WAL would provide the info to replay to the last committed transaction.

http://www.postgresql.org/idocs/index.php?wal.html

... because we know that in the event of a crash we will be able to recover
the database using the log: ...

These docs seem to corrobrate this.

 So, with Pg, if you backup your db every night with pg_dump, and your
 server crashes during the day, you will loose up to one day of work.

I've never lost any data with postgres, even if it's crashed, even without
WAL.

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Large databases, performance

2002-10-08 Thread Martijn van Oosterhout

On Mon, Oct 07, 2002 at 05:42:12PM +0200, Zeugswetter Andreas SB SD wrote:
  Hackers, do you think it's possible to hack together a quick and dirty
  patch, so that string length is represented by one byte?  IOW can a
  database be built that doesn't contain any char/varchar/text value
  longer than 255 characters in the catalog?
 
 Since he is only using fixchar how about doing a fixchar implemetation, that 
 does not store length at all ? It is the same for every row anyways !

Remember that in Unicode, 1 char != 1 byte. In fact, any encoding that's not
Latin will have a problem. I guess you could put a warning on it: not for
use for asian character sets. So what do you do if someone tries to insert
such a string anyway?

Perhaps a better approach is to vary the number of bytes used for the
length. So one byte for lengths  64, two bytes for lengths  16384.
Unfortunatly, two bits in the length are already used (IIRC) for other
things making it a bit more tricky.
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [pgsql-performance] [GENERAL] Large databases, performance

2002-10-08 Thread Martijn van Oosterhout

On Tue, Oct 08, 2002 at 11:14:11AM +0530, Shridhar Daithankar wrote:
 On 7 Oct 2002 at 11:21, Tom Lane wrote:
 
  Shridhar Daithankar [EMAIL PROTECTED] writes:
   I say if it's a char field, there should be no indicator of length as
   it's not required. Just store those many characters straight ahead..
  
  Your assumption fails when considering UNICODE or other multibyte
  character encodings.
 
 Correct but is it possible to have real char string when database is not 
 unicode or when locale defines size of char, to be exact?
 
 In my case varchar does not make sense as all strings are guaranteed to be of 
 defined length. While the argument you have put is correct, it's causing a disk 
 space leak, to say so.

Well, maybe. But since 7.1 or so char() and varchar() simply became text
with some length restrictions. This was one of the reasons. It also
simplified a lot of code.
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] I cant find it or I'm just lazy ?

2003-02-24 Thread Martijn van Oosterhout
On Mon, Feb 24, 2003 at 07:53:05PM +, Darko Prenosil wrote:
   I need two answers I did not find in documentation :
 
 How can I get exact number of rows in DECLARED CURSOR ?
 OK, I can FETCH until NULL, but this does not fits my needs !

You need to move to the end of the cursor. When you declare a cursor it
doesn't run the query yet. You have to tell it to run the query before it
can tell you how many rows it is. I think the command is MOVE.

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Support bacteria! They're the only culture some people have.


pgp0.pgp
Description: PGP signature


Re: [HACKERS] On-Disk Tuple Size

2002-04-20 Thread Martijn van Oosterhout

On Sat, Apr 20, 2002 at 05:22:20PM +0900, Curt Sampson wrote:
  ...and 6 bytes for the CTID, which I guess may be unnecessary.
 
 Really? How would things work without it?

Well, from my examination of the on-disk data the CTID stored there is the
same as its location in the file, so it could just be filled in while
reading.

Unless I'm misunderstanding its purpose.

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[HACKERS] Documentation on page files

2002-04-20 Thread Martijn van Oosterhout

[Please CC any replies, I'm subscribed nomail]

Hi,

Chapter 7 of the Developers guide in about the Page Format on disk and it's
a little out of date not to mention somewhat incomplete.

1. Is there documentation elsewhere (other than the source)?

2. If not, would patches be accepted to correct the situation? I've been
looking into it a bit recently so I think I may be able to whip something
useful up.

Thanks in advance,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Documentation on page files

2002-04-21 Thread Martijn van Oosterhout

On Sun, Apr 21, 2002 at 03:46:07PM +0900, Curt Sampson wrote:
 On Sat, 20 Apr 2002, Tom Lane wrote:
 
  Martijn van Oosterhout [EMAIL PROTECTED] writes:
   2. If not, would patches be accepted to correct the situation?
 
  Go for it.
 
 Yes, please! I'd be happy to review and updated version.

Ok, my first attempt can be seen here:

http://svana.org/kleptog/pgsql/page.sgml.txt

I don't know whatever SGML format this is using, so the layout is not great,
but the information should be accurate. I used it to create a program to
dump the datafiles directly without the postmaster :).

I'll submit a proper patch once we have something useful.

 One thing that would be good, since this is a developers' guide,
 would be to include references to the source files and dates from
 which the information comes. That way one could see if updates are
 necessary by doing a diff on those files between the given date
 and the head, to see what changes have been made since the description
 was written. Also good would be to have the data structures explicitly
 named so that when one dives into the source, one already has a
 good idea of what one's looking at.

Well, I have included the names of the structures involved. Do you think
it's worth adding filenames given that TAGS makes tracking them down easily
enough? I can put in dates if you like.

Issues to be dealt with:
- Do I need to say more about TOAST?
- Indexes?
- Split into sections
- How much detail is enough/too much?

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Documentation on page files

2002-04-22 Thread Martijn van Oosterhout

On Mon, Apr 22, 2002 at 11:14:36AM -0500, Ross J. Reedstrom wrote:
 On Sun, Apr 21, 2002 at 07:28:32PM +1000, Martijn van Oosterhout wrote:
  
  http://svana.org/kleptog/pgsql/page.sgml.txt
  
  I don't know whatever SGML format this is using, so the layout is not great,
  but the information should be accurate. I used it to create a program to
  dump the datafiles directly without the postmaster :).
 
 Excellent - since this is a FRP (Frequently Requested Program) how do you
 feel about dumping it in contrib? Even if it's hardcoded for your particular
 table structure, it could serve as a starting point for some poor DBA
 who's got to recover from a lost xlog, for example.

Actually, it reads the table structure from the catalog. It also will find
the right files to open. It reads files from both PG 6.5 and 7.2 although it
shouldn't be too hard to make work for other versions. And if you people
don't reorder the first few fields in pg_attribute, it will work for all
future versions too.

The dumping is more of an extra, the original idea was to check for errors
in the datafiles. Hence the working name of pgfsck. At the moment the
dumping dumps only tuples where xmax == 0 but I'm not sure if that's
correct.

It doesn't handle compressed tuples nor toasted ones, though thats more
advanced really. And ofcourse outputing data in human readable format has to
be added for each type. I only started writing it on Sunday, so let me give
it a usable interface and I'll let people try it out.

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Documentation on page files

2002-04-23 Thread Martijn van Oosterhout

On Tue, Apr 23, 2002 at 09:15:22AM +0200, Hannu Krosing wrote:
 On Tue, 2002-04-23 at 01:29, Martijn van Oosterhout wrote:
  
  The dumping is more of an extra, the original idea was to check for errors
  in the datafiles. Hence the working name of pgfsck. At the moment the
  dumping dumps only tuples where xmax == 0 but I'm not sure if that's
  correct.
 
 AFAIK it is not. As Tom once explained me, it is ok for tuples xmax to
 be !=0 and still have a valid tuple. The validity is determined by some
 bits in tuple header.

Well, from my thinking about how you would use these fields in a logical
way, it seems it's possible for xmax to be non-zero if the transaction
numbered xmax was not committed. But in that case (unless it was a delete)
there would be a newer tuple with the same oid but xmax == 0 (and this
uncommitted transaction as xmin).

The problem is that inside the DB, you have a current transaction plus a
list of committed transactions. Externally, you have no idea, so xmax == 0
is as valid a view as any other. This would have the effect of dumping out
whatever would be visible if every transaction were committed.

I think. If anyone knows a good document on MVCC implementations, let me
know.

 But I think the most useful behaviour should be to dump system fields
 too, so mildly knowledgeable sysadmin can import the dump and do the
 right thing afterwards (like restore data as it was before transaction
 nr 7000)

Well, i didn't think you could have statements of the form:

insert into table (xmin,xmax,cmin,cmax,...) values (...);

So you would have to leave it as a comment. In which case someone would have
to go and by hand work out what would be in or out. I can make it an option
but I don't think it would be particularly useful. Maybe
--pretend-uncommitted xact

Just a thought, if I did a delete from table accedently, and stopped the
postmaster and twiddled the xlog for that transaction, would that have the
effect of undeleting those tuples?
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[HACKERS] Table checking/dumping program

2002-04-24 Thread Martijn van Oosterhout

[Please CC any replies, I'm subscribed nomail]

As promised I've given it a bit of polish and it's actually almost useful.
You can have a look at it http://svana.org/kleptog/pgsql/pgfsck.html

Just unpack the files into a directory. It's just a perl script with two
modules so no compiling necessary. You can download the package directly at
http://svana.org/kleptog/pgsql/pgfsck-0.01.tar.gz

I've tested it on versions 6.5, 7.0 and 7.2 and it works. It shouldn't
crash, no matter how bad a file you feed it. It can output insert statements
also to help reconstruction.

Here is an example of the program being run over a suitably hexedited file.
# ./pgfsck -r 16559 kleptog website
-- Detected database format 7.2
-- Table pg_class(1259):Page 1:Tuple 0: Unknown type _aclitem (1034)
-- Table pg_class(1259):Page 1:Tuple 49: Unknown type _aclitem (1034)
-- Table website(16559):Page 0:Tuple 7: Tuple incorrect length (parsed 
data=57,length=1638)
-- Table website(16559):Page 0:Tuple 44: Decoding tuple runs off end: 627338916  69
-- Table website(16559):Page 0:Tuple 70: Bad tuple offset. Should be: 3784 = 11592  
8192

Currently the following features are not supported:

- Toasted / compressed tuples
- Checking indexes doesn't work (should it?)
- Views just produce empty output (because they are)
- Arrays don't work
- Since each type output has to be written, many types are not correctly output
- Split tables (1GB) are not supported past the first part.
- Some system tables in some versions have a strange layout. You may get many
  harmless warnings about the formats of pg_class, pg_attribute and/or pg_type.

Most of these are basically because I don't know how they work, but with a
bit of work some of these should be fixable.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Canada, Mexico, and Australia form the Axis of Nations That
 Are Actually Quite Nice But Secretly Have Nasty Thoughts About America

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [GENERAL] [HACKERS] Linux Largefile Support In Postgresql RPMS

2002-08-12 Thread Martijn van Oosterhout

On Mon, Aug 12, 2002 at 11:30:36AM -0400, Andrew Sullivan wrote:
 The problem is not just a system-level one, but a filesystem-level
 one.  Enabling 64 bits by default might be dangerous, because a DBA
 might think oh, it supports largefiles by default and therefore not
 notice that the filesystem itself is not mounted with largefile
 support.  But I suspect that the developers would welcome autoconfig
 patches if someone offered them.

Are there any filesystems in common use (not including windows ones) that
don't support 32-bit filesizes?

Linux (ext2) I know supports by default at least to 2TB (2^32 x 512bytes),
probably much more. What about the BSDs? XFS? etc

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [GENERAL] [HACKERS] Linux Largefile Support In Postgresql RPMS

2002-08-13 Thread Martijn van Oosterhout

On Tue, Aug 13, 2002 at 08:02:05AM -0500, Larry Rosenman wrote:
 On Tue, 2002-08-13 at 03:42, Mark Kirkwood wrote:
  Other operating systems where 64 bit file access can be disabled or 
  unconfigured require more care - possibly  (sigh) 2 binary RPMS with a 
  distinctive 32 and 64 bit label ...(I think the big O does this for 
  Solaris).
 Then, of course, there are systems where Largefiles support is a
 filesystem by filesystem  (read mountpoint by mountpoint) option (E.G.
 OpenUNIX). 
 
 I think this is going to be a pandoras box. 

I don't understand. Why would you want large-file support enabled on a
per-filesystem basis? All your system programs would have to support the
lowest common denomitor (ie, with large file support). Is it to make the
kernel enforce a limit for the purposes of compatability?

I'd suggest making it as simple as --enable-large-files and make it default
in a year or two.

-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Re: [PATCH] To remove EXTEND INDEX

2001-07-13 Thread Martijn van Oosterhout

On Fri, Jul 13, 2001 at 05:49:56PM -0400, Tom Lane wrote:
 Let's drop the meta-discussions and cut to the chase: given that we are
 about to re-enable partial indexes, should we try to make EXTEND INDEX
 work too, or just remove it?

Just a few clarifications:

* The reason it didn't go to -hackers was because I wasn't subscribed to it
and hence couldn't post to it. The only reason I can now is because I
subscribed (nopost) about 2 minutes ago.

* I discussed this with Tom Lane on -general a few days ago. I'm not sure
how many people saw that though. Are most of the people on -hackers
subscribed to -general as well?

* I agree with Tom's assertion that it's an awful lot of complexity for such
a marginal gain. Look at the size of the patch and the fact that it has all
been useless for the last few years.

* I didn't send it to -patches because it's not ready yet.

* Only posted a URL, not the patch itself. Sorry for the confusion.

Tom actually suggested doing this at the same time as re-enabling partial
indices but I favoured a separate patch considering the large number of
scattered changes.

Anyway, is there a concensus, or shall I forget the whole thing?

-- 
Martijn van Oosterhout [EMAIL PROTECTED]
http://svana.org/kleptog/
 It would be nice if someone came up with a certification system that
 actually separated those who can barely regurgitate what they crammed over
 the last few weeks from those who command secret ninja networking powers.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Re: [PATCH] To remove EXTEND INDEX

2001-07-13 Thread Martijn van Oosterhout

On Fri, Jul 13, 2001 at 06:34:22PM -0400, Bruce Momjian wrote:
  Let's drop the meta-discussions and cut to the chase: given that we are
  about to re-enable partial indexes, should we try to make EXTEND INDEX
  work too, or just remove it?
 
 We don't let people add columns to an existing index so I don't see why
 we should have EXTEND INDEX unless index twiddling is more common with
 partial indexes.

We don't allow people currently to fiddle with indices at all. I don't
understand the origin of EXTEND INDEX since I can't think of a situation
where it would actually be useful.

-- 
Martijn van Oosterhout [EMAIL PROTECTED]
http://svana.org/kleptog/
 It would be nice if someone came up with a certification system that
 actually separated those who can barely regurgitate what they crammed over
 the last few weeks from those who command secret ninja networking powers.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[HACKERS] Radical suggestion for plan executor?

2001-07-13 Thread Martijn van Oosterhout

I notice that the query executor currently has a lot of switch statements on
the the type of node it is descending to. This means you get a call tree
like:

ExecProcNode
  ExecNestLoop
ExecProcNode
  ExecMergeJoin
...

Wouldn't it be nicer if the Plan had access to function pointers that
already referred to the right function. So instead of:

result = ExecProcNode( a, b )

you get:

a-procs.exec( b );

It compresses the call tree down a bit. However, I'm not sure if it has many
benefits other than maintainability.

OTOH, you could keep ExecProcNode and just replace the switch with a
function call.

Any thoughts?
-- 
Martijn van Oosterhout [EMAIL PROTECTED]
http://svana.org/kleptog/
 It would be nice if someone came up with a certification system that
 actually separated those who can barely regurgitate what they crammed over
 the last few weeks from those who command secret ninja networking powers.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[HACKERS] Determining return type of polymorphic function

2005-08-11 Thread Martijn van Oosterhout
[Please CC any replies so I don't have to follow them via the archives]

Hi,

I'm trying to create a set of types that are going to share the INPUT
and OUTPUT functions (written in C). For output you can determine the
type from the arguments, but for INPUT you can't. The prototype is
restricted (by CREATE TYPE) and you can't specify anyelement as the
return type because none of the arguments use it.

My current way around that is to create an alias to the function with
different names for each type, but get_fn_expr_rettype() doesn't appear
to be filled in anyway (fcinfo-flinfo-fn_expr == NULL).

What I'm trying to do now is use fcinfo-flinfo-fn_oid to lookup
pg_proc and get the return type from there, but something tells me
there must be an easier way.

Or to put it another way, if I define a function like:

CREATE FUNCTION myfunction(cstring, oid, integer) 
RETURNS mytype AS 'mylib.so' LANGUAGE 'C';

How can I determine I'm supposed to return a mytype? I'm running 7.4
if it matters...

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpw0xkpyZUhh.pgp
Description: PGP signature


Re: [HACKERS] Determining return type of polymorphic function

2005-08-11 Thread Martijn van Oosterhout
[Please CC replies, thanks]

On Thu, Aug 11, 2005 at 02:17:30PM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  What I'm trying to do now is use fcinfo-flinfo-fn_oid to lookup
  pg_proc and get the return type from there, but something tells me
  there must be an easier way.
 
 No, I think you're stuck.  The internal calls for type I/O routines
 don't set up fn_expr (since there is no expression tree).
 
 One possibility, depending on your time horizon for this, is to change
 the getTypeIOParam rules so that ordinary types get their own OID as
 second argument.

Hmm, I was thinking about that. While reading the documentation I was
thinking surely they'd pass their own oid, giving zero would be silly
so I was kind of surprised when I did get zero.

I was thinking of actually also storing the oid in the typelem field
but the docs imply this does something fancy with subscripting. I
havn't traced the code paths for that yet. At the very least I think it
would confuse anything looking for arrays. I also thought about typmod
(the third argument) but that seems to almost always be -1.

Would a patch to change the rules be accepted, or would it be
considered a unnecessary backward incompatable change?

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpi9FwgBzPbd.pgp
Description: PGP signature


Re: [HACKERS] Determining return type of polymorphic function

2005-08-12 Thread Martijn van Oosterhout
On Thu, Aug 11, 2005 at 02:51:11PM -0400, Tom Lane wrote:
  Would a patch to change the rules be accepted, or would it be
  considered a unnecessary backward incompatable change?
 
 I wouldn't back-patch it, but it seems like something we could still put
 in for 8.1.

Ok, here's a patch (with documentation update). I checked the
regression tests (looked over, not run) but nothing there appears to
test this anyway. I looked through all the datatype input functions but
none of them even use the second argument except array and record types
and they're explicitly unchanged.

Note: the logic could be simplified if we could assume composite types
can't have a non-zero typelem. From looking at the code, I think it may
be assumed in places and I'm fairly sure it's non-sensical, but is it
explicitly forbidden?

I thought of writing a few simple tests but no language will accept
cstring arguments except C. It can be added if you think it's worth
regression testing.

Unless there are other comments I'll post this to pgsql-patches
later...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp8zqGHA6npj.pgp
Description: PGP signature


Re: [HACKERS] [PATCH] Determining return type of polymorphic function

2005-08-12 Thread Martijn van Oosterhout
Forgot to attach it, oops.

On Fri, Aug 12, 2005 at 09:56:47AM +0200, Martijn van Oosterhout wrote:
 Ok, here's a patch (with documentation update). I checked the
 regression tests (looked over, not run) but nothing there appears to
 test this anyway. I looked through all the datatype input functions but
 none of them even use the second argument except array and record types
 and they're explicitly unchanged.
 
 Note: the logic could be simplified if we could assume composite types
 can't have a non-zero typelem. From looking at the code, I think it may
 be assumed in places and I'm fairly sure it's non-sensical, but is it
 explicitly forbidden?
 
 I thought of writing a few simple tests but no language will accept
 cstring arguments except C. It can be added if you think it's worth
 regression testing.
 
 Unless there are other comments I'll post this to pgsql-patches
 later...

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.
Index: doc/src/sgml/ref/create_type.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/create_type.sgml,v
retrieving revision 1.57
diff -u -r1.57 create_type.sgml
--- doc/src/sgml/ref/create_type.sgml   14 Jul 2005 06:17:36 -  1.57
+++ doc/src/sgml/ref/create_type.sgml   12 Aug 2005 12:52:17 -
@@ -102,9 +102,8 @@
declared as taking one argument of type typecstring/type,
or as taking three arguments of types
typecstring/type, typeoid/type, typeinteger/type.
-   The first argument is the input text as a C string, the second
-   argument is the element type's OID in case this is an array type
-   (or the type's own OID for a composite type),
+   The first argument is the input text as a C string, the second argument is 
+   the OID of the type, except for arrays where it is the element type's OID
and the third is the literaltypmod/ of the destination column, if known
(-1 will be passed if not).
The input function must return a value of the data type itself.
Index: src/backend/utils/cache/lsyscache.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/cache/lsyscache.c,v
retrieving revision 1.116
diff -u -r1.116 lsyscache.c
--- src/backend/utils/cache/lsyscache.c 29 Aug 2004 05:06:50 -  1.116
+++ src/backend/utils/cache/lsyscache.c 12 Aug 2005 12:52:17 -
@@ -1220,12 +1220,14 @@
 
/*
 * Composite types get their own OID as parameter; array types get
-* their typelem as parameter; everybody else gets zero.
+* their typelem as parameter; everybody else gets their own oid.
 */
if (typeStruct-typtype == 'c')
return HeapTupleGetOid(typeTuple);
-   else
+   else if( typeStruct-typelem != 0 )
return typeStruct-typelem;
+   else
+   return HeapTupleGetOid(typeTuple);
 }
 
 /*


pgpO2H7dLlWm6.pgp
Description: PGP signature


[HACKERS] SPI: ERROR: no snapshot has been set

2005-08-13 Thread Martijn van Oosterhout
[Please CC any replies, thanks]

Hi,

I got this nice error, and according to Tom Lane[1]:

 Depending on what PG version you're using (which you did not say,
 naughty naughty) there are paths in interactive query entry that
 might try to execute datatype input functions before setting the
 query snapshot. But I don't believe it can happen down inside SPI.

[1] http://www.dbforums.com/archive/index.php/t-1065541.html

Good guess, I am inside a type input function trying to use SPI. My
questions are:

- I'm using 7.4, is this fixed/changed in later versions?
- I'm triggering it with: select 'hello'::mytype;

  Is there an easy way of testing my function without triggering this?

select (select 'hello'::mytype);-- Same problem
select (select 'hello')::mytype -- gives something about casting unknown
\copy .. from stdin -- Works but is inconvenient

- Is there a way to detect if a snapshot has been started and creating
one if there hasn't. I've seen the snapshot related functions but I
don't feel comfortable sticking them in my type input function just to
fix this. Is it safe?

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpqb0TheSA5J.pgp
Description: PGP signature


Re: [HACKERS] SPI: ERROR: no snapshot has been set

2005-08-13 Thread Martijn van Oosterhout
On Sat, Aug 13, 2005 at 12:36:42PM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  - Is there a way to detect if a snapshot has been started and creating
  one if there hasn't. I've seen the snapshot related functions but I
  don't feel comfortable sticking them in my type input function just to
  fix this. Is it safe?
 
 This would be a pretty bad idea IMHO.  The setting of the first snapshot
 in a transaction is a significant event, and it should not happen in
 random places like type-specific input functions.

I figured as much. In that case I've been thinking something like:

if (SerializableSnapshot == NULL)
elog( ERROR, Sorry, 'const'::mytype constructs are not 
supported in first 
 transaction, please use mytype('const'::text) );

I hope this works, I havn't gotten around to the casts yet, I hope
non-immutable casts are called late enough. Annoyingly, it breaks a
straight:

INSERT INTO table VALUES ('const','const','const',...)

Which is what GUI frontends are going to try to do... My alternative is
to cache results but that would make whether it works sporadic and
unpredictable.

 A long-term solution to this might be to do what's envisioned in this
 comment in parse_coerce.c:

snip

 I'm a bit concerned about the possible side-effects of this on existing
 behavior, though.  In particular constructs like
   'now'::timestamp
 would change behavior subtly.

It's a good idea, eventually. All it would do is change the time it's
evaluated, and in that example it doesn't change since 'now' is the
transaction start time. However, in general it is a change of
behaviour.

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpMg7bDVLw9s.pgp
Description: PGP signature


Re: [HACKERS] SPI: ERROR: no snapshot has been set

2005-08-17 Thread Martijn van Oosterhout
For future reference, I got around this error (no snapshot has been
set) by removing the use of SPI and just using heap_open /
heap_beginscan / heap_endscan / heap_close. It's only slightly more
code but it works irrespective of the state of the backend.

Have a nice day,

On Sat, Aug 13, 2005 at 03:59:56PM +0200, Martijn van Oosterhout wrote:
 Good guess, I am inside a type input function trying to use SPI. My
 questions are:
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpAelbGsnm5E.pgp
Description: PGP signature


[HACKERS] Typmod for user-defined types

2005-08-27 Thread Martijn van Oosterhout
[Please CC any replies, thanks]

Hi,

I've got a situation were I'd really like to be able to have a typmod
for a user-defined type. In particular, I'd like to make use of the
coerce_to_target_type()/coerce_type_typmod() chain. This only works if
you have a typmod != -1.

Even if you set the typmod in pg_type, when you create a table it's not
copied to the attribute but set back to -1.

This has been discussed before[1] and I notice Tom Lane posted a
message about this back in June last year. I was wondering if it would
be possible to allow user-defined types to declare a typmod function.

As an experiment I added arguments to GenericType but that gave
reduce/reduce conflicts. OTOH if you just add to SimpleTypename it just
works. I imagine this limits the places it can work.

Tom Lanes patch[2] looks like it may work, but would a mechanism to
allow user-defined types to have a typmod function be accepted?

Have a nice day,

[1] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00923.php
[2] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp66IZZSTitb.pgp
Description: PGP signature


Re: [HACKERS] Typmod for user-defined types

2005-08-28 Thread Martijn van Oosterhout
AIUI, the issue is statements like:

# select numeric(6,3) '5.6';

Bison can't look far enough ahead to determine whether the numeric is a
type or a function. However, the desired grammer itself is not
ambgiuous, it's just that Bison is not smart enough to make it work. It
works now only because we hardwire the types that are allowed to have
argument.

Wouldn't a way around this be that to have grammer allow a type to look
like a function call but convert the FuncExpr node to a TypeName node
if the grammer determines it is a type after all. Something like:

SimpleTypename: type_name   { $$ = $1 }
| func_expr { $$ = MakeTypeFromFuncCall( $1 ) }

That function could then check that only constant integer arguments have been
used, and then dispatch to a type specific function to turn the arg
list into a typmod. Among other things, this would allow us to produce
better error messages for things like bit(6,3) and numeric(l) instead
of just syntax error.

I was thinking of adding to each type a typmod_in and a typmod_out
function. The first takes an array of integer in and returns an int32,
the latter does the reverse.

This is however quite a dramatic overhaul of the grammer and I'm not
sure it's actually conflict free. But numeric, float, decimal,
etc would no longer need to be matched by the lexer and you could call
the timestamp function without double quotes.

At least in preliminary tests it seems OUT as a function name would be
a casualty due to the new IO/OUT parameters:

function ( field IN OUT type )

It can't determine soon enough that OUT can't be a type in this
context. And with this change type/function names would be the same
list. I wonder if there is a way around this.

Anyway, I'll try out some more stuff.

Have a nice day,


On Sat, Aug 27, 2005 at 11:09:09AM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  Tom Lanes patch[2] looks like it may work, but would a mechanism to
  allow user-defined types to have a typmod function be accepted?
  [2] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
 
 Well, the question still stands: are we going to paint ourselves into
 a corner by requiring type and function names to be equivalent?
 
   regards, tom lane

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpXvQCsdbD2j.pgp
Description: PGP signature


[HACKERS] Simple tester for MVCC in PostgreSQL

2005-08-29 Thread Martijn van Oosterhout
[Please CC any replies, thanks]

Hi,

I saw the discussion about an tester for MVCC. Since I'd never done
anything with asyncronous queries before, I figured I'd try to write
something useful with it. The result is at:

http://svana.org/kleptog/pgsql/mvcctest.tar.gz

It's a tester that takes a testing script, runs the queries one at a
time through as many connections as you ask. It uses asyncronous
queries so it can handle queries that don't return immediatly (using 
and wait model ala UNIX). It doesn't use threads. It can assign
variables and handle expected errors and test that queries unblock at
the right time. Whether this is enough for serious testing I have no
idea.

I've included 5 selftests and 3 real tests (which about covers my
knowledge of transaction isolation levels). I hope it provides a useful
basis for a real tool.

It uses Perl and the Pg module from CPAN (included in many
distributions, it's basically a wrapper for libpq). For more details
see the README. A simple test harness type program is included. Just
typing make will set it going.

As an example I include a sample test script below (included as
tests/in/test3.test). It opens two connections and tries several
concurrent updates. In the first block, they don't conflict so
there is no delay. In the second block the second update has to wait
for the rollback, in the third it causes a serialization failure.

I will not be able to answer my email for the next few hours due to
sleep. Please be patient if it doesn't work straight away.

Have a nice day,

# This test tests simple serialisation failure
1I drop table t
1  SET default_transaction_isolation = serializable
2  SET default_transaction_isolation = serializable

1  create table t (grp text, value int4)
1  insert into t values ('a',10)
2  insert into t values ('a',20)
1  insert into t values ('b',30)
2  insert into t values ('b',40)
# These updates don't conflict, all should be fine
1  begin
2  begin
1  update t set value = value + 10 where grp = 'a'
2  update t set value = value + 10 where grp = 'b'
1  commit
2  commit
# These updates do conflict, but the second can proceed once first transaction 
aborts
1  begin
2  begin
1  update t set value = value + 10 where grp = 'a'
2 update t set value = value + 10 where grp = 'a'
1  abort
2  wait
2  commit
# These updates conflict, the second transaction aborts
1  begin
2  begin
1  update t set value = value + 10 where grp = 'a'
2e update t set value = value + 10 where grp = 'a'
1  commit
2  wait
2  abort
1  drop table t

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpN4EY6Be9Ij.pgp
Description: PGP signature


[HACKERS] On hardcoded type aliases and typmod for user types

2005-08-31 Thread Martijn van Oosterhout
Hi,

As part of previous discussions about typmod for user type, Tom
mentioned that you would need to make type and function names
equivalent. As it turns out, if you refactor a few rules, you can
actually make it work and manage them seperately. For this to work the
current col_name_keyword non-terminal has to be divided into few more
categories. The criterion is mostly whether it is followed by a left
parenthsis.

1. Some can be moved to unreserved words, mostly the types BIT, VARCHAR,
BIGINT etc since they don't need special rules anymore.

2. Words that have special productions (eg CONVERT, COALESCE etc),
these can still only be used as column names, not types or
(user-defined) functions.

3. Words which can be column names functions but not types. These never
appear normally with a parenthesis, so they will be interpreted as a
function if there is one. (eg SETOF, NATIONAL, etc)

4. Words that can be column names and types but not functions. These
are artifacts of the grammer due to the fact that VARYING is
unrestricted. These are BIT, NCHAR, CHAR_P and CHARACTER.

After this, you can have user-functions that take an arbitrary set of
parameters. There'll need to be a bit of code to verify the arguments.
It results in a net gain of 15 keywords for functions and about 30 for
types.

My question is, should users be able to create types schema.int4 and
schema.integer simultaneously. Currently it allows you but it's not
handled very well (\dT doesn't list both). Should this be allowed?
Should aliasing for DEC and DECIMAL - NUMERIC be done for
user-defined types?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpF8U8wTWMrX.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-08-31 Thread Martijn van Oosterhout
On Wed, Aug 31, 2005 at 11:11:04AM -0400, Tom Lane wrote:
 IMHO, ideally the aliasing should *only* apply to the built-in types.
 The current hack only approximates this (IIRC, the translation happens
 for any unqualified type name, independently of one's search path).
 
 One possible approach is to remove the aliasing translation from the
 grammar altogether, and add a notion of alias entries in pg_type that
 would be found through normal lookup and then replaced by the underlying
 type by parse analysis rather than by the grammar.  We could not do this
 in the existing system because of the need to handle typmods for some of
 the aliases ... but maybe it would work given generalized typmod
 support.  There are still a few special cases though, like CHARACTER
 VARYING, which seems like it more or less has to be wired into the
 grammar.

Yeah, I was thinking about alias entries. I was thinking that domains
might already do a lot of the work. But then it's not really aliasing
anymore.

 BTW, the proposed refactoring sounds messier to me than does decreeing
 type and function names equivalent ...

Actually, it's not that bad. The non-terminals relating to types
collapse into about three with only xxx VARYING and DOUBLE
PRECISION and a few others remaining. The keywords are split into
three sets like I mentioned. Then you make the rules for GenericType
look enough like the expansion of func_expr, that bison can consider
the context after before deciding it's a function or a type.

The changes are not that great. The number of rules (according to the
-v output) goes from 1610 to 1601 and states from 2810 to 2777. OTOH, a
chunk of code moves from gram.y to the adt directory I guess. The
grammar is just the beginning of the work.

Though maybe the point is that we can take the easy way and implement
the slightly more difficult if it turns out the be necessary.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpiG2H0tfrQG.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-08-31 Thread Martijn van Oosterhout
On Wed, Aug 31, 2005 at 02:25:54PM -0400, Tom Lane wrote:
 I still like the idea of pushing the aliasing out of the grammar,
 though.  Come to think of it, we could probably even handle the
 multiple-word stuff that way: let the grammar convert CHARACTER VARYING
 to character varying and have an alias with that name in the catalog.

Currently, a user-defined type, function, table etc is one IDENT, I
don't see any point in changing that. The standard wants character
varying and we can support that but we don't need to extend that to
user-defined types. If people really want that they can quote it.
Besides, the character stuff is somewhat special as it has the
character set stuff so will need special grammer anyway.

 One thing you'd need to look at is that format_type is aware of the
 special properties of the alias names: at present they never need to be
 schema-qualified, but this would no longer be certainly the case with
 the aliasing approach.  A possible answer is for format_type to work by
 replacing (say) INT4OID with the OID of the alias type that has the
 desired spelling, and then use the same TypeIsVisible test as is applied
 to any user type.  Another thing that is involved there is not
 double-quoting the generated names ... we don't want it to emit
 character varying but the user-type path would do that.

I was thinking actually of setting the type searching code to search
pg_catalog before the normal search_path. The types being hardwired
into the grammer essentially implied this so I thought I would avoid
surprises. 

This ofcourse would mean that all built-in types would automatically
override user-defined ones, which actually sucks if PostgreSQL keeps
including more types by default. OTOH, then types in pg_catalog never
need to be qualified, making it easy for format_type.

Now i think about it it may not be a good idea, for all its benefits.
Perhaps only doing it for multiword types. Damn special casing.
 
 Hmm... actually there's a bit of an issue here, which is that it's not
 clear whether schema qualification makes sense for the multi-word type
 names.  For instance
   pg_catalog.character varying

It doesn't work. The current grammer, even now, treats anything schema
qualified as non-special. You can't schema qualify char(4) even if you
want to. Incidently, these typmod changes for user types would make
this work as a side-effect.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgprnwknlHPj7.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-09-01 Thread Martijn van Oosterhout
On Wed, Aug 31, 2005 at 05:14:13PM -0400, Tom Lane wrote:
 That strikes me as an unnecessary reduction in flexibility.  As long as
 we make the hardwired type names translate to qualified names (same as
 they do now) we don't have to assume any such thing.

Ack, there's fortunatly only a handful of those.

 The point about character sets is a bit distressing; here we are
 designing a new general-purpose mechanism and we can already see
 cases it doesn't handle.  Can we fix that?

Err, well. My thought was a certain group of type-suffix options would
be permitted (only zero or one at a time), for example:

   WITH TIME ZONE
   WITHOUT TIME ZONE
   CHARACTER SET xxx

And have the grammer accept these after any type. For example, the type
NUMERIC WITH TIME ZONE would be syntactically valid but the code would
then reject it. You have a issue then because the typmod function
should then be able to return a completely different type because the
system looked up timestamp and now the function determines that with
that option, it should actually be timestamptz.

As for the specific mechanism, well, my options were (in the TYPE
declaration statement:

   TYPMODFUNC = function( intarray [, sometype] ) RETURNS int32 or intarray

This restricts the arguments between the brackets to integers, is this
reasonable? The sometype would be something to handle the suffix
options. (Text pair? option,value). Returning an intarray if a new type
is allowed.

   TYPMODFUNC = function( recordtype [, sometype ] ) RETURNS int32 or intarray

The record type could then indicate what's supported, except you can't
pass a variable number of arguments (for NUMERIC). How about fill up
from the front, leave NULLs for all the unfilled ones. The STRICT flag
could tell if all fields need to be filled (ugh).

   TYPMODFUNC = function( internal [, sometype ] ) RETURNS int32 or intarray

Simply pass the (Node*) from the parser and let the function sort it
out itself. Except now they have to be written in C. Is this
unreasonable, it's called fairly early on, all the issues with no valid
snapshot apply here and you can't defer the evaluation till later.

I'm not sure how to choose, they all handle the current situation fine
but what do we want to allow users to do in the future? Is the SQL
standard likely to come up with SOMETYPE(ident) as a declaration, in
which case we need the second or third options. Ident can be converted
to a constant string for these purposes.

And then there's output to consider, currently timestamp etc have
special cases. But if you're going to allow CHARACTER SET xxx to
apply to any type, you need a way to reconstruct the values for output.
Requireing the user the provide an inverse function is one (possibly
unreliable) way. Storing the arguments directly is another. And is one
int32 typmod sufficient? This character set per column has been talked
about for a while, but where was the information going to be stored?

There's several issues to be sorted out yet, I fear.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgplhXwyrOAOR.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-09-01 Thread Martijn van Oosterhout
On Thu, Sep 01, 2005 at 11:18:04AM +0200, Dennis Bjorklund wrote:
 String types have 3 modifiers, the length, the charset and the collation. 
 The syntax of these are defined by the standard so at least that syntax 
 ought to be allowed (even if there are more work to actually do anything 
 with charset and collation info).

From a quick browse in google it looks like:

   CHAR(4) COLLATE xxx CHARACTER SET yyy

is valid syntax. So I guess that becomes 0 or more modifiers. And a
single int32 is not going to cut it.

More fields (typmod2, typmod3)? Fields explicitly for this purpose
(typcollate and typcharset), array of int32?

Which would make the second argument to the typmod function ARRAY OF
(textpair) or some such. If the function doesn't accept that then this
is a quick indicator that no options are allowed.

Quick thing, should 'mytype' and 'mytype()' be considered the same and
should they default to typmod -1? Currently '()' is not even accepted.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpKBsVJBWalY.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-09-01 Thread Martijn van Oosterhout
On Thu, Sep 01, 2005 at 11:12:26AM +0300, Hannu Krosing wrote:
 Maybe make the last one WITH CHARACTER SET xxx and promote WITH to a
 real keyword.
 
 It seems a good idea to have WITH as a real keyword anyway, as at least
 ANSI/ISO syntax for recursive queries seem to require it too.

Sorry, CHARACTER SET is defined by SQL standard. I don't understand
what it is there for though, I thought the point of UNICODE/UTF-8 was
to get rid of all this crap. I also can't find the bit that explains
what should happen if two strings of different character sets are
concatinated. The only thing I can think this useful for is default
input/output charset, overriding client_encoding, and internally
everything is still UNICODE.

The COLLATE stuff is neat, if we can get it work. Maybe CHARSET is a
roundabout way to specify the COLLATE order?

Incidently, I just downloaded the SQL99 spec and am slightly confused
by some of the things they'd added. Am I the only one?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpExRZavzpkf.pgp
Description: PGP signature


Re: [HACKERS] On hardcoded type aliases and typmod for user types

2005-09-01 Thread Martijn van Oosterhout
On Thu, Sep 01, 2005 at 08:50:27AM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  Simply pass the (Node*) from the parser and let the function sort it
  out itself. Except now they have to be written in C. Is this
  unreasonable,
 
 Nope.  You're not going to be writing any interesting datatypes without
 using C, anyway.  I'd go with this one to minimize flexibility problems.

Ack

 I'd be inclined to stick with the convention that internally typmod is
 a single int32; that is really wired into way too many APIs to consider
 changing.  varchar could do something like using 24 bits for the length
 and 8 bits for an encoded indication of the charset.

With the unfortunate effect that strings are limited to 16Mb instead of
1Gb. Not sure if people will be happy with that one. For my locale
experiments I used my taggedtypes module to embed the locale into the
data itself, I imagine something similar could be used.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpdGAU0Ug6IH.pgp
Description: PGP signature


[HACKERS] Using multi-locale support in glibc

2005-09-01 Thread Martijn van Oosterhout
Browsing the glibc stuff for locales I noticed that glibc does actually
allow you to specify the collation order to strcoll and friends. The
feature is however marked with:

   Attention: all these functions are *not* standardized in any form.
   This is a proof-of-concept implementation.

They do however work fine. I used my taggedtypes module to create a
type that binds the collation order to the text strings and the results
can be seen below.

1. Is something supported by glibc usable for us (re portability to
non-glibc platforms)?

2. Should we be trying to use an interface that's specifically marked
as unstable?

3. What's the plan to support multiple collate orders? There was a
message about it last year but I don't see much progress.

4. It makes some things more difficult. For example, my database is
UNICODE and until I specified a UTF8 locale it didn't come out right.
AFAIK the only easy way to determine if something is UTF8 compatable is
to use locale -k charmap. The C interface is hidden. It should be
possible to compile a list of locales and allow only ones matching the
database. Or automatically convert the strings, the conversion
functions exist.

5. Maybe we should evaluate the interface and give feedback to the
glibc developers to see if it can be made more stable.

If you want to have a look to see what's available, use:
rgrep -3 locale_t /usr/include/ |less

Have a nice day,

PS. The code to test this can be found at:
http://svana.org/kleptog/pgsql/taggedtypes.html

--- TEST OUTPUT ---

test=# select strings from taggedtypes.locale_test order by locale_text( 
strings, 'C' );
 strings 
-
 Test2
 Tést1
 Tëst1
 test1
 tèst2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( 
strings, 'en_US' );
 strings 
-
 Tëst1
 Tést1
 tèst2
 test1
 Test2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( 
strings, 'nl_NL' );
ERROR:  Locale 'nl_NL' not supported by library
test=# select strings from taggedtypes.locale_test order by locale_text( 
strings, 'en_AU.UTF-8' );
 strings 
-
 test1
 Tést1
 Tëst1
 Test2
 tèst2
(5 rows)
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpCaa0baPW34.pgp
Description: PGP signature


Re: [HACKERS] Using multi-locale support in glibc

2005-09-01 Thread Martijn van Oosterhout
On Thu, Sep 01, 2005 at 01:46:00PM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  1. Is something supported by glibc usable for us (re portability to
  non-glibc platforms)?
 
 Nope.  Sorry.

Do we have some platforms that don't have any multi-language support? I
mean, we don't have a complete thread library but a wrapper around the
ones used on the platform. Couldn't we make a similar wrapper that used
glibc if it was available, windows native if it's available, etc...

That way we conform to the platform rather than a version of the
unicode collating set that postgresql happens to ship with it.

For example, Windows doesn't use standard Unicode sorting rules, do we
care if people come complaining that postgresql sorts different from
their app?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgprsJeImdmfb.pgp
Description: PGP signature


[HACKERS] Proof of concept COLLATE support with patch

2005-09-02 Thread Martijn van Oosterhout
Supports any glibc platform and possibly Win32.

Adds:
  SELECT ... ORDER BY expr COLLATE 'locale'
  CREATE INDEX locale_index ON table(expr COLLATE 'locale')
  Index scan used when COLLATE order permits

This is just a proof of concept patch. I didn't send it to -patches
because as Tom pointed out, there's no hope of it getting in due to
platform dependant behaviour.

This patch does not use setlocale and is completely orthoganal to any
locale support already in the backend.

As it turns out, meaningful locale support only needs a handful of
support functions to work. These are listed at the bottom. My patch
only uses the first two, but the third will be needed at some stage.
The use of the last one depends on how the backend ends up support
locales. Both glibc and wine32 have locale sensetive versions of many
functions including:

toupper_l, tolower_l, strfmon_l, strtoul_l, strtof_l, strftime_l, is*_l

A windows function list is at:
http://msdn2.microsoft.com/library/wyzd2bce(en-us,vs.80).aspx

Patch available here:
http://svana.org/kleptog/pgsql/collate1.patch

Implementation notes follow and table of functions is at the bottom.

I hope this helps whenever someone gets around to full COLLATE support.

Have a nice day,

Notes:
   * It works by replacing (expr COLLATE 'locale') with
   pg_strxfrm(expr, pg_findlocale(locale))
 in the parsetree.

 pg_findlocale returns an opaque pointer to the locale. It is
 STRICT IMMUTABLE and is optimised away in the final query.

 pg_strxfrm takes the string and the locale and returns a bytea. 
 bytea comparison uses memcmp so is safe from other locale effects
 in the backend.

   * Use of COLLATE for an index will probably double the diskspace
 required for that index due to the strxfrm.

   * I had to add the functions to pg_proc.h because CREATE FUNCTION
 couldn't find them. So they have OIDs I made up. You may need to
 initdb, I'm not sure.

 You can compile pg_xlocale.c as an shared object and load them
 that way too if you want to avoid the initdb.

   * Internally they are defined as taking and returning internal.
 CREATE FUNCTION doesn't like that so specify opaque or oid
 instead. The declarations are:

  create function pg_findlocale(text) returns oid as 'pg_findlocale' language 
internal strict immutable;
  create function pg_strxfrm(text,oid) returns bytea as 'pg_strxfrm' language 
internal strict immutable;

   * The clause ORDER BY 1 COLLATE 'en_AU' breaks, it treats the 1 like
 a constant. I couldn't quickly work out how to reference the
 columns the right way. Long term that code should be in the
 sorting code anyway.

   * The locale needs to be in quotes, otherwise the parser converts it
 to lower-case. Locale names are case-sensetive on many systems.

   * There is a text function strcoll_l for testing collation:

  create function pg_strcoll_l(text,text,text) returns int4 as 'pg_strcoll_l' 
language internal strict immutable;

   * Yes this is the easy way out, implementing the inheritence of the
 COLLATE attribute will be much more invasive. This gives most
 people what they want though.

   * Although these functions are documented on Windows, they are not
 for glibc, so it is an unstable insterface.

Function Needed glibc Win32
-
Function returing opaquenewlocale _create_locale
pointer to locale data

strxfrm with locale parameter   strxfrm_l _strxfrm_l

Method finding encoding for nl_langinfo_l ???
locale

strcoll with locale parameter   strcoll_l _strcoll_l

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpmj3S2BGvSy.pgp
Description: PGP signature


Re: [HACKERS] Proof of concept COLLATE support with patch

2005-09-02 Thread Martijn van Oosterhout
On Fri, Sep 02, 2005 at 03:04:20PM +0200, Martijn van Oosterhout wrote:
 Supports any glibc platform and possibly Win32.

MacOS X [1] supports this also apparently. And for glibc it appears to
have been accepted as part of the API since 2.3.2 and formally accepted
into LSB3.0. Win32 claims to have supported this since '98.

But even though the MacOS X manpage says BSD Library Functions at the
top of the page, neither FreeBSD or OpenBSD doesn't appear to have it
at all. Not really a lot of chance that we could pull portions of the
Darwin libc into PostgreSQL, huh?

Maybe the easiest thing would be to download the libc locale support of
one of the BSDs, remove the global variable and use that...

[1] http://www.hmug.org/man/3/newlocale.php

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpBsEixB7sps.pgp
Description: PGP signature


Re: [HACKERS] Proof of concept COLLATE support with patch

2005-09-02 Thread Martijn van Oosterhout
On Fri, Sep 02, 2005 at 12:44:00PM -0400, Tom Lane wrote:
 
 Hmm, the more general page seems to be 
 
 http://www.hmug.org/man/3/xlocale.php
 
 This seems to be pretty much exactly what we want, at least API-wise.
 Now, if we can find an implementation of this with a BSD license ;-) ...

Yes it is, it's exactly the same interface as glibc. Windows has them
all with an underscore prefix.

 [ I don't recall at the moment whether Apple publishes all of Darwin
 under a straight BSD license, but that would surely be a good place to
 look first. ]

libc is listed as APSL licence, whatever that means. Something with
that many clauses can't be BSD compatable.

What I wonder is how come Apple implemented all this in their version
yet none of the BSDs got around to it.

I've looked around for Citrus, it appears that NetBSD contains the
latest version and while there's a lot of stuff for LC_CTYPE and charset
conversion, LC_COLLATE didn't appear to be high on their priorities.

I especially liked these fragments from the OpenBSD and NetBSD CVS
repositories. Tom, you've comvinced me, relying on the platform is
silly. We have platforms that don't support LC_COLLATE in one locale,
let alone multiple. FreeBSD thankfully does support it.

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/string/strcoll.c?rev=HEAD
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strcoll.c?rev=HEAD
--- snip ---
/*
 * Compare strings according to LC_COLLATE category of current locale.
 */
int
strcoll(s1, s2)
const char *s1, *s2;
{

_DIAGASSERT(s1 != NULL);
_DIAGASSERT(s2 != NULL);

/* LC_COLLATE is unimplemented, hence always C */
return (strcmp(s1, s2));
}


-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp1mXXdNMgaL.pgp
Description: PGP signature


Locale implementation questions (was: [HACKERS] Proof of concept COLLATE support with patch)

2005-09-03 Thread Martijn van Oosterhout
On Fri, Sep 02, 2005 at 11:42:21AM -0400, Tom Lane wrote:
 The objection is fundamentally that a platform-specific implementation
 cannot be our long-term goal, and so expending effort on creating one
 seems like a diversion.  If there were a plan put forward showing how
 this is just a useful way-station, and we could see how we'd later get
 rid of the glibc dependency without throwing away the work already done,
 then it would be a different story.

Well, my patch showed that useful locale work can be acheived with
precisely two functions: newlocale and strxfrm_l.

I'm going to talk about two things: one, the code from Apple. Two, how
we present locale support to users.
---
Now, it would be really nice to take Apple's implementation in Darwin
and use that. What I don't understand is the licence of the code in
Darwin. My interpretation is that stuff in:

http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/

is Apple stuff under APSL, useless to us. And that stuff in:

http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/FreeBSD/

are just patches to FreeBSD and this under the normal BSD license (no
big header claiming the licence change). The good news is that the
majority of what we need is in patch form. The bad news is that the hub
of the good stuff (newlocale, duplocale, freelocale) is under a big fat
APSL licence.

Does anyone know if this code can be used at all by BSD projects or did
they blanket relicence everything?
---
Now, I want to bring up some points relating to including a locale
library in PostgreSQL. Given that none of the BSDs seem really
interested in fixing the issue we'll have to do it ourselves (I don't
see anyone else doing it). We can save ourselves effort by basing it on
FreeBSDs locale code, because then we can use their datafiles, which we
*definitly* don't want to maintain ourselves. Now:

1. FreeBSDs locale list is short, some 48 compared with glibc's 217.
Hopefully Apple can expand on that in a way we can use. But given the
difference we should probably give people a way of falling back to the
system libraries in case there's a locale we don't support.

On the other hand, lots of locales are similar so maybe people can find
ones close enough to work. No, glibc and FreeBSD use different file
formats, so you can't copy them.

Do we want this locale data just for collation, or do we want to be
able to use it for formatting monetary amounts too? This is even more
info to store. Lots of languages use ISO/IEC 14651 for order.

2. Locale data needs to be combined with a charset and compiled to work
with the library. PostgreSQL supports at least 15 charsets but we don't
want to ship compiled versions of all of these (Debian learnt that the
hard way). So, how do we generate the files people need.

  a. Auto-compile on demand. First time a locale is referenced spawn
the compiler to create the locale, then continue. (Ugh)
  b. Add a CREATE LOCALE english AS 'en_US' WITH CHARSET 'utf8'. Then
require the COLLATE clause to refer to this identifier. This has some
appeal, seperating the system names from the PostgreSQL names. It also
gives some info regarding charsets.
  c. Should users be allowed to define new locales?
  d. Should admins be required to create the external files using a
program, say pg_createlocale.

Remember, if you use a latin1 locale to sort utf8 you'll get the wrong
result, so we want to avoid that.

3. Compiled locale files are large. One UTF-8 locale datafile can
exceed a megabyte. Do we want the option of disabling it for small
systems?

4. Do we want the option of running system locale in parallel with the
internal ones?

5. I think we're going to have to deal with the very real possibility
that our locale database will not be as good as some of the system
provided ones. The question is how. This is quite unlike timezones
which are quite standardized and rarely change. That database is quite
well maintained.

Would people object to a configure option that selected:
  --with-locales=internal (use pg database)
  --with-locales=system   (use system database for win32, glibc or MacOS X)
  --with-locales=none (what we support now, which is neither)

I don't think it will be much of an issue to support this, all the
functions take the same parameters and have almost the same names.

6. Locales for SQL_ASCII. Seems to me you have two options, either
reject COLLATE altogether unless they specify a charset, or don't care
and let the user shoot themselves in the foot if they wish...

BTW, this MacOS locale supports seems to be new for 10.4.2 according to
the CVS log info, can anyone confirm this?

Anyway, I hope this post didn't bore too much. Locale support has been
one of those things that has bugged me for a long time and it would be
nice if there could be some real movement.

Have a nice weekend,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent

Re: Locale implementation questions (was: [HACKERS] Proof of concept COLLATE support with patch)

2005-09-04 Thread Martijn van Oosterhout
On Sat, Sep 03, 2005 at 05:44:50PM -0400, Greg Stark wrote:
 [...] Nor is it
 simpler for sysadmins to have to maintain an entirely separate set of locales
 independently from the system locales.

Indeed, I was already coming up with mechanisms to determine what
locales the system uses and try to autogenerate them. I agree though,
it's not useful for systems that already have complete locale support.
Why add to the burden?

Anyway, my reading of the specs says that we must support the syntax.
It doesn't say we need to support any orderings other than the default
(ie what we do now).

 If you really are unhappy enough with OS setlocale implementations to want to
 try to do this then it would be more helpful to do it outside of Postgres.
 Package up the Apple setlocale library as a separate package that anyone can
 install on Solaris, BSD, Linux or whatever. Then Postgres can just say it
 works fine with your OS library but your OS library might be very slow. Here's
 a third-party library that you can install that is fast and may relieve any
 problems you have with collation performance.

That's why I asked about the patches and files that Apple wrote. What
are the licence restrictions? Would we be able to download the, what,
20 files and distribute it as a library. Being APSL we couldn't include
it in the tarball, but it could be a pgfoundry project or something.

If somebody knows a reason why this could not be done, speak up now because
my reading of the APSL licence tells me it's fine.

 But I think that's getting ahead of things. Until Postgres even supports
 collations using the OS libraries you won't even know if that's even
 necessary.

Well, I added COLLATE support for ORDER BY and CREATE INDEX and it
worked in under 200 lines. I'm thinking ahead and I don't think the
COLLATE rules are that hard. Implementing them seems a bit fiddly. It
may be easiest to consider COLLATE a non-associative operator.

I'm still unsure if I should turn the string comparison operators into
three-argument functions.

Anyway, I'll look into the library issue first.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpY0njaqfLIH.pgp
Description: PGP signature


Re: [HACKERS] Locale implementation questions

2005-09-04 Thread Martijn van Oosterhout
On Sun, Sep 04, 2005 at 10:25:36PM +0900, Tatsuo Ishii wrote:
  3. Compiled locale files are large. One UTF-8 locale datafile can
  exceed a megabyte. Do we want the option of disabling it for small
  systems?
 
 To avoid the problem, you could dynmically load the compiled
 tables. The charset conversion tables are handled similar way.

That's not the point, ofcourse they are loaded dynamically. The
question is, when do we create the files in the first place. There are
48*15 = 750 combinations which would amount to tens of megabytes of
essentially useless data. *When* you create the files is an important
question. Compile time is out.

Charset conversion is completely different, there just arn't that many
combinations.

 Also I think it's important to allow user defined collate data. To
 implement the CREATE COLLATE syntax, we need to have that capability
 anyway.

Most OS's allow you to create collate data yourself anyway, why do we
need to implement this too?

 To be honest, I don't understand why we have to rely on (often broken)
 system locales. I don't think building our own locale data is too
 hard, and once we make up it, the maintenace cost will be very small
 since it should not be changed regularly. Moreover we could enjoy the
 benefit that PostgreSQL handles collations in a corret manner on any
 platform which PostgreSQL supports.

You say building our own locale data is not hard. I disagree, it's a
waste of time we can do without. Unless you know the language yourself
you cannot check changes made by anybody else. If there's an error in
locale ordering, take it up with your OS distributor.

I also think we open ourselves to questions like:

1. My locale is supported by the system but not by PostgreSQL, why?
2. My locale was supported last release but not this one, why?
3. Why does PostgreSQL sort differently from 'sort' or any other app on
my system?

 Right. We Japanese (and probably Chinese too) have been bugged by the
 broken mutibyte locales for long time. Using C locale help us to a
 certain extent, but for Unicode we need correct locale data, othewise
 the sorted data will be completely chaos.

Ok, is glibc still wrong or are they just implementing the unicode
standard and that's what's wrong.

All I'm saying is that we need to allow use of system locales until our
native locale support is mature. In the end something like ICU
(http://icu.sourceforge.net/) will end up obsoleting us. Nobody (in
free-software anyway) uses it yet, but eventually it may be viable to
require that to allow system independant locales.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp43wJ8YwMLK.pgp
Description: PGP signature


[HACKERS] Install Darwin's locale library on your system :)

2005-09-05 Thread Martijn van Oosterhout
Well, it was pointed out the other day that the Darwin C library
supports the non-standard extensions to the POSIX locale interface and
that this might be ported to other systems so PostgreSQL could use it.

So, I have written a few scripts which download the libc and locale
library from darwinsource, shuffle some files around and build the
result into a library called libdummylocale.so. It basically completely
replaces your locale support on whatever system you use it on.

It's all under the APSL, though some parts may be BSD licenced.

Let me say right now, the locale support here sucks, no two ways about
it. It doesn't support a single UTF-8 locale. Oh, it lets you specify
them, but when you ask for the CHARSET it still says US-ASCII. It does
support a number of other different charsets. (Not for collation
though).

So my challenge to those people who think maintaining a locale library
is easy: make *one* locale in FreeBSD (or Darwin or this lib) support
full UTF-8 collation in whatever locale and/or charset you choose. It's
all downhill from there.

While it builds simple programs, I don't think it's totally safe. You'd
need to rename the headers at least. And building on Darwin will
probably blow up due to the way it plays fast and loose with Darwin
specific #defines. But it's a beginning if anyone is interested. It
builds in my glibc system.

I'm going to drop the idea of making a locale library, there's just
nothing good enough. glibc is the only thing that comes close. From here
on I'm going to work on COLLATE for systems that support xlocale, with
an eye on ICU if/when it becomes standard enough.

Download: http://svana.org/kleptog/pgsql/dummylocale.tar.gz

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp5fPUhr4aFM.pgp
Description: PGP signature


[HACKERS] Mysteriously lost values in nodes

2005-09-06 Thread Martijn van Oosterhout
[Please CC any replies, thanks]

I added a field to each of Var, Const, FuncExpr and OpExpr which is set
during parse_expr. But somewhere between the parsing and execution the
values of these fields get reset back to NULL. But only for FuncExpr
and OpExpr, for Var and Const it all works as expected.

I've traced with the debugger and confirmed that the field is set but
that it's copied somewhere before execution and that copy didn't copy
this field. The copyFuncExpr worked, it's just that another place did a
copy some other way.

grep reveals several places where new nodes are created but rather than
just changing them all, is there a particular phase where these changes
are made that I should be looking at?

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpSmnU6pclcN.pgp
Description: PGP signature


Re: [HACKERS] Simple tester for MVCC in PostgreSQL

2005-09-06 Thread Martijn van Oosterhout
On Tue, Sep 06, 2005 at 03:51:41PM +, Matt Miller wrote:
 On Tue, 2005-08-30 at 00:56 +0200, Martijn van Oosterhout wrote:
  http://svana.org/kleptog/pgsql/mvcctest.tar.gz
 
 I've started using it in some simple cases and it seems to be a good
 tool.  The feature set looks to me to be a pretty solid core on which to
 build.

Very nice. I too think the base is sufficient for quite complicated
tests. I actually wrote a script which tested all pairs of locks to
ensure they blocked exactly as the documentation said they should. And
it passed.

Thanks for the feedback,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpLnhVggo4be.pgp
Description: PGP signature


Re: [HACKERS] Mysteriously lost values in nodes

2005-09-06 Thread Martijn van Oosterhout
On Tue, Sep 06, 2005 at 01:51:25PM -0400, Tom Lane wrote:
 Grepping for makeNode(OpExpr) might help you.  Offhand I'd finger
 eval_const_expressions as the likely culprit.  clauses.c has some other
 code you'd better look at too.

Yeah, eval_const_expressions was the culprit in this case, though I
think operators might have some more.

 Personally, when I want to add a field to a node, I grep for every
 reference to one or two of the existing fields to make sure I've found
 all the places I need to touch.

So there's no shortcut, I'll remember that :)

Thanks for the help,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpULJaurnt0f.pgp
Description: PGP signature


[HACKERS] Suggestion to simplify installation of external modules

2005-09-08 Thread Martijn van Oosterhout
There's been some discussion about loading of external modules and
getting the right function definitions. I was wondering if it would be
an idea to have the install scripts inside the module itself. Create a
command called:

INSTALL word | 'full path'

Would search for $libdir/{word}.so (or full path if specified) or
whatever is appropriate for the platform. The module would then be
queried for the installation script.

I made a little script sql2install.pl [1] which takes an SQL file and
turns it into a C file that exports a single function,
__PG_INSTALL_FUNCTION. This returns a list of strings which are the
commands in the SQL file. Just link this into the final module.

While this does simplify installation, there are other benefits:

- There could be a maintained list of installed modules, for
frontends.

- If the process created dependancies between the module and the
functions created, pg_dump could avoid outputting those functions and
types and instead emit INSTALL blah in the dump. This makes upgrades
much easier because the definitions are no longer in the dump.

- Dumps become more portable across architectures, because the library
name is not stored directly.

- Saner error messages.

I guess in the specific case of languages, the proposed table would be
redundant as the CREATE LANGUAGE definition would neither be hardcoded
in the backend, nor in the dump, but in the module that implements it.

Note, it would be just as easy to return a structure with author name,
copyright info, version, CATVERSION, etc. Ideally, the module author
would provide a properly formatted file and the pgxs scripts would take
care of the conversion to C and linking.

Note: I also though of UNINSTALL but that would seem to be a great
shoot-in-foot capability.

Thoughts?

[1] http://svana.org/kleptog/pgsql/sql2install.pl
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpXT06S5bKJY.pgp
Description: PGP signature


Re: [HACKERS] Case insensitive indexing in posgtres?

2005-09-09 Thread Martijn van Oosterhout
On Fri, Sep 09, 2005 at 04:00:57PM +0530, sandeep satpal wrote:
 
 Dear ,
 
 One of the difference in mysql and postgresql is case insensitiveness.
 I want to midify the code of posgresql so that it can support case 
 insensitiveness in indexing and in join operation.

Use citext:

http://gborg.postgresql.org/project/citext/projdisplay.php
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpEddh7fOSvK.pgp
Description: PGP signature


Re: [HACKERS] random system table corruption ...

2005-09-11 Thread Martijn van Oosterhout
On Sun, Sep 11, 2005 at 01:12:34PM +0200, Hans-Jürgen Schönig wrote:
 in the past we have faced a couple of problems with corrupted system 
 tables. this seems to be a version independent problem which occurs on 
 hackers' from time to time.
 i have checked a broken file and i have seen that the corrupted page has 
 actually been zeroed out.

Near as I can tell, the only times pages are zeroed out is if
zero_damaged_pages is set (destroying the evidence) or during WAL
recovery.

 my question is: are there any options to implement something which makes 
 system tables more robust? the problem is: the described error happens 
 only once i an while and cannot be reproduced. maybe there is a way to 
 add some more sanity checks before the page is actually written.

Well, the most common causes are dodgy memory. Other than that I guess
you could arrange for bgwriter to check the pages it is writing. I
imagine it already does check the header, checking the data requires
knowledge about the actual table and attributes. And about the only
thing that says I'm broken is a varlena value with a long value.

As they say, the only thing sure would be to have a backup. the only
thing I can imagine being really useful would be a restore mode where
you feed it the schema so it can reconstruct the pg_class and
pg_attribute just enough for you to dump it to reconstruct
everything...

You know, VACUUM FREEZE BACKUP on pg_catalog, physically copy the
datafiles and offer the option to blat your catalog with an old one...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp7YJB3n07fv.pgp
Description: PGP signature


Re: [HACKERS] -fPIC

2005-09-12 Thread Martijn van Oosterhout
On Sun, Sep 11, 2005 at 10:32:51PM -0400, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  ... mixing -fpic and -fPIC libraries is a problem.
 
 Is it?  I would think having two options would be essentially unworkable
 if so.

The thing is, on i386 it makes no difference, it's only on some
archtechtures where it matters. And it has to do with both the size of
the symbol table and the size of the code.

Given that you don't know what you need to use until you compile it, if
people are compiling all their stuff with -fPIC you can at least be
sure that it won't break on other architectures.

The new gcc visibility stuff gives you way of shrinking the symbol
table and improving performance. There is a performance difference
between -fpic and -fPIC, whether it's big enough to care about...

You can shrink the symbol table with --version-script in LD, you
provide a script like:

 {
  global:
pg_finfo_*
other exported symbols
  local: *
 }

Whether it's enough... For people who want to know the gory details,
read this (by Ulrich Drepper).

http://people.redhat.com/drepper/dsohowto.pdf

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpRLUi4GonQ4.pgp
Description: PGP signature


Re: [HACKERS] Hard drive failure leads to corrupt db

2005-09-13 Thread Martijn van Oosterhout
Your problem is that your database was initialised with locale
'en_US.ISO8859-1' but your system no longer recognises it. You need to
create the locale somehow. On Linux it's /etc/locale.gen but you should
probably search the locale manpage for how to do it on Solaris.

Changing the locale requires a pg_dump and restore...

Hope this helps,

On Tue, Sep 13, 2005 at 08:49:23AM -0400, Brusser, Michael wrote:
 Just occurred to me: perhaps we don't have a database corruption,
 instead after replacement of the boot drive the locale on the host
 changed from
 en_US.ISO8859-1 to 'C'
 
 Still I am not sure what to do. Is changing the locale back to
 en_US.ISO8859-1
 the right thing to do now?
 
 Mike.
 
 -Original Message-
 
 Our customer reported a problem resulting from the hard drive failure.
 Database server would not start, generating this message:
   PANIC: The database cluster was initialized with LC_COLLATE
 'en_US.ISO8859-1',
   which is not recognized by setlocale().
   It looks like you need to initdb.
 
 They are running v.7.3.2 on Solaris, where all locale parameters are set
 to C.
 Is there anything we can do to restore the database without data loss?
 
 Thank you,
 Michael.
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp2pBDrGCfWu.pgp
Description: PGP signature


Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches

2005-09-14 Thread Martijn van Oosterhout
On Tue, Sep 13, 2005 at 08:14:47PM -0400, Stephen Frost wrote:
 I suppose another option would be to provide seperate packages...  Could
 this be done as a shared library so it's more 'plug-and-play' to switch
 between the two?  I dunno, just trying to think about how to deal with
 this without making it suck terribly bad for me (as a Debian user and
 maintainer).

Note that the Linux kernel has played with moving spinlock code out of
line. Due to the effect of having so many, it ended up that the memory
saved by moving the code out of line actually benefitted overall. An
unconditional call to a function can't be that expensive, surely.

However, it would *have* to be in the same compile unit, no shared
libraries. ELF imposes some overhead for calls in different compile
units, using function pointers won't save you (see Ulrich Drepper
paper on it).

However, to make it flexible you would need a pointer to a function
pointer. Fortunatly these variables won't change often so the function
pointer and the function itself should be in the cache if used often
enough.

Finally, the kernel solves the problem by saying, if you compile for
uniprocessor, optimize the code away, since a lot of the issues don't
apply. My main concern is how you even detect the number of processors
in a portable way...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpLWptHgoZDN.pgp
Description: PGP signature


Re: [HACKERS] Constraint Type Coercion issue?

2005-09-14 Thread Martijn van Oosterhout
On Wed, Sep 14, 2005 at 02:23:29PM -0400, Tom Lane wrote:
 I've been thinking about this off and on, and would like to solve it
 in the 8.2 time frame, but it's not happening for 8.1.  At a minimum
 it'll require some significant changes in our concept of what an
 operator class is.  The half-jelled ideas I have involve inventing
[snip]

How much discussion has there been on this? I've been working my way
through COLLATE support and indexes and realised that what I really
want is to allow the comparison functions in operator classes to be
three argument functions. The two things to compare and the collate
order. A descending index is really just another collate order, albeit
one easily imposed from the outside.

Although numbers tend not to have many interesting collate orders,
complex numbers do, as do obviously strings. To some extent, collate
implies a sort of parameterised operator class...

Definitly 8.2 stuff, and it's not simple stuff either...

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpVP4Exb2Tl2.pgp
Description: PGP signature


Re: [HACKERS] Constraint Type Coercion issue?

2005-09-14 Thread Martijn van Oosterhout
On Wed, Sep 14, 2005 at 05:28:42PM -0400, Tom Lane wrote:
  To some extent, collate
  implies a sort of parameterised operator class...
 
 Hmm.  But an index couldn't support more than one collation order
 AFAICS.  It'd probably make more sense to create operators and an
 operator class for each collation you want to support; the mapping
 to a call of a common support function would be embedded inside the
 operator definition.  Otherwise we have to pass around an additional
 parameter through an awful lot of places...

Well yes, but given the number of possible locales, creating one class
for each seems excessive. And each class would have to create 5
operators (with underlying functions) and 1 comparitor function. Unless
you could shortcut something like:

CREATE OPERATOR CLASS ...
   OPERATOR 1 (text,text,'en_US')
...
   FUNCTION 1 mycompare(text,text,'en_US')
...
   COLLATE en_us;

Otherwise you end up with lots of functions which have be created on
the fly as the user decides what collate orders he wants. Invoking SQL
functions in the btree index create cycle doesn't seem efficient.

You would have to come up with new names for the operators each time
because the argument types are going to be the same. Although I guess
you could call it OPERATOR(en_us), it's not like people are going to
use it directly.

Maybe it should be that we allow users to specify three argument
operators, and have an extra entry in the operator class which defines
the extra argument to pass.

It's not easy, like you say, there are a lot of places where an extra
argument would need to be passed...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpMc8fzZLktq.pgp
Description: PGP signature


Re: [HACKERS] Constraint Type Coercion issue?

2005-09-15 Thread Martijn van Oosterhout
On Wed, Sep 14, 2005 at 10:42:36PM -0400, Tom Lane wrote:
 The thing that's still fairly unclear to me is whether the collation
 information is attached to the operators/functions or to the data.
 I recall there's been some discussion of sticking collation IDs into
 individual text Datums, which is a completely different path than what
 you are positing above.  Does the SQL spec mandate one or the other of
 these approaches?  If it does, do we want to believe it?  (The more I
 read of SQL2003, the less impressed I get...)

The standard doesn't care I think. My reading is that the COLLATE
status is determined at parse time. From there you can plan however you
like.

AFAIUI, collate is a parameter/property of fields and domains and
affects operators and function, not the data. It only applies to
comparisons, not the output. You could add it as a property to the
data. I wrote a module, taggedtypes [1], which basically implemented
this. My main issue with it is that for the '' operator, the same
collate property has to be on both arguments or it has to bail. The
only reason why you can attach COLLATE to fields and domains is to give
a default in case the user doesn't specify anything. But if the COLLATE
is given explicitly, it overrides anything.

By way of example:

CREATE TABLE t (a text collate c1, b text collate c2);

select * from t where a  b;  -- ERROR: Indeterminate collate
select * from t where a  b COLLATE c3;   -- Order by c3

My worry about adding the collate to the Datum is that your execution
tree becomes more complex. The two types, with and without COLLATE data
are not interchangable and you would have to add or remove them
continuously.

In the above example, you would have to create an executor node whose
sole purpose is to add the collate bit to a and b before passing it to
the '' operator. An index can only support one collation at a time
also.

I don't think there is an easy way out...

[1] http://svana.org/kleptog/pgsql/taggedtypes.html
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpZxzfFhqYrs.pgp
Description: PGP signature


[HACKERS] Does anybody use ORDER BY x USING y?

2005-09-18 Thread Martijn van Oosterhout
Hi,

PostgreSQL's grammer allows you to specify the operator to sort with in
the ORDER BY clause. Various bits of the backend support this feature,
yet it appears to partially undocumented. I can't find it in the ORDER
BY [1] section but there is a paragraph on it under the SELECT
documentation [2].

I'm asking because SQL COLLATE support is really doing something
similar. I was wondering if instead of adding something in parallel just
replace sortop with collateid. This means all the code relating to
pathkeys won't need to change since we still use OIDs for the pathkeys,
they're just not operator oids anymore.

We can continue to support USING [op] as long as [op] is one of the GT
or LT operators in the OPERATOR CLASS. This restriction may exist
already, I can't tell.

All we lose is the ability to say USING [arbitrary op]. Does anybody
use this. Would people object to requiring the operator after USING to
be part of an operator class?

Have a nice day,

[1] http://www.postgresql.org/docs/8.0/interactive/queries-order.html
[2] http://www.postgresql.org/docs/8.0/interactive/sql-select.html
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp7TMawPFCRG.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-18 Thread Martijn van Oosterhout
On Sun, Sep 18, 2005 at 12:34:10PM -0700, Josh Berkus wrote:
  All we lose is the ability to say USING [arbitrary op]. Does anybody
  use this. Would people object to requiring the operator after USING to
  be part of an operator class?
 
 Hmmm ... would this prevent the hackish workaround for case-insensitive sort?

Err, which hackish workaround would that be? The right solution is
citext which creates it's own operator class. This doesn't have
anything to do with functional indexes either.

I've been using Google to find any interesting use of the USING clause
but havn't found any yet.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpQyeqL55uBv.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-18 Thread Martijn van Oosterhout
On Sun, Sep 18, 2005 at 04:19:06PM -0400, Andrew Dunstan wrote:
 Err, which hackish workaround would that be? The right solution is
 citext which creates it's own operator class. This doesn't have
 anything to do with functional indexes either.
 
 Last time I looked it appeared to have significant limitations, and some 
 considerable inefficiencies (e.g, copying the strings and folding them 
 to canonical case on every comparison). I would certainly be extremely 
 wary of just saying that's the solution.

Ok, so citext has its limitations. Case-insensetive sort is hard [1].
My real question was, what was the solution he was referring to using
the USING clause?

[1] http://lafstern.org/matt/col2_new.pdf

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp8mAO26x7IC.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-18 Thread Martijn van Oosterhout
On Mon, Sep 19, 2005 at 06:26:10AM +1000, John Hansen wrote:
 I was actually of the impression that that was exacty what it was for:
 specifying what op(class) to use for the sort in case you wanted to use
 a non-default opclass for the type, and/or if the less-than operator
 wasn't called ''.

That's my thought. However, the code doesn't seem to restrict you to
that so I was wondering if there was any other use out there that we
should consider supporting...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpZSXRjLfg74.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-19 Thread Martijn van Oosterhout
On Sun, Sep 18, 2005 at 11:23:01PM -0400, Tom Lane wrote:
snip
 class families to relate opclasses for different datatypes.  Basically
 I'd like to solve most of these issues by constructing a new layer atop
 opclasses, not by deciding that an opclass doesn't convey the full story
 about the behavior of an index column.

Where I'm currently going is creating a table of COLLATE orders. These
collate orders would refer to operator classes but tweak them. For
example, things like:

- Sort ascending or descending (descending reverses the bt*cmp test)
- NULLs first or last
- Locale for text types
- etc

They could be declared in the operator class definition, or generated
automatically. You could then do things like:

CREATE INDEX ... (field1 COLLATE ascending, field2 COLLATE descending)

for those queries where you want ascending on one column and descending
on another. Or perhaps:

CREATE INDEX ... (textfield COLLATE ignore_case)
CREATE INDEX ... (textfield COLLATE locale_us)
CREATE INDEX ... (textfield COLLATE optimise_regex)
CREATE INDEX ... (point COLLATE distance)

However, I can't see how this can relate families of operator classes
like you talk about Tom. ISTM that needs to dealt with somewhere else,
given that it's unrelated to order.

This is going way out of spec though...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpZ1XhWSBlJF.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-19 Thread Martijn van Oosterhout
On Mon, Sep 19, 2005 at 11:13:05AM +0300, Hannu Krosing wrote:
  (1) IS NULL is not an indexable operation, so no, not without
  significant overhaul of the index AM API.
 
 But we do store NULLs in indexes, so why is it not indexable?
 
 This is either an interface bug (not making use of stored info) or
 storage bug (wasting space storing unneccessary info)

Err, indexes used to not store NULLs to save space. However, it turns
out that SQL UNIQUE has something to say about NULLs in unique columns
so they had to be included.

However, the machinary to decide if an index is usable assumes that
usable operators have two arguments and IS NULL isn't really an
operator in the PostgreSQL sense and doesn't have two arguments either.

*If* that can be fixed, then we can be more flexible. But if it were
easy it would have been done long ago...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpdwzN3w7ZIG.pgp
Description: PGP signature


Re: [HACKERS] Does anybody use ORDER BY x USING y?

2005-09-20 Thread Martijn van Oosterhout
much discussion on collation and ordering

I'm going to take from this discussion that there is no use for the
USING clause with operators not in an operator class and that if this
changes we won't be seriously inconveniencing anybody.

Have a nice day,

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp0oxevEpZEf.pgp
Description: PGP signature


Re: [HACKERS] passing parameters to CREATE INDEX

2005-09-20 Thread Martijn van Oosterhout
On Tue, Sep 20, 2005 at 11:26:26PM +0400, Oleg Bartunov wrote:
 it's desirable to be able to pass parameters to CREATE INDEX for
 GiST indices. Does SQL standard has something about that so we could
 implement it for 8.2 ?

As has been pointed out, INDEXes arn't in the SQL spec at all, so you
can do just about anything.

 Example from real life project - performance of tsearch2 could be
 greatly improved if decrease signature size in gistidx.h, which is
 currently hardcoded and one should compile and install tsearch2 into
 differnet location and use it for specific database. It's impossible
 to have different signature length for different fts indices because
 we have no possibility to pass parameters to CREATE INDEX command.

What syntax were you envisioning? Does this value just need to be
passed to GiST at the creation of the the index, or does it actually
need to remembered by the backend and passed each call? At the moment
there is some discussion on changes to the index interface so now is
the time to ask for what you want...

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpK0N8FPyZ9w.pgp
Description: PGP signature


Re: [HACKERS] passing parameters to CREATE INDEX

2005-09-21 Thread Martijn van Oosterhout
On Wed, Sep 21, 2005 at 08:47:04AM +0300, Hannu Krosing wrote:
 On K, 2005-09-21 at 09:01 +0400, Oleg Bartunov wrote:
  it'd be nice if parameters could be passed at the creation time only and
  somehow stored, so other functions could retrieve them. It's not 
  convenient but also safe.
 
 If not changing syntax is essential, then these could be passed by some
 GUC variables at index create time, then stored. 
 
 This way one could have as many configurables a one likes .

The only major problem with that is that the parameters won't survive a
dump/restore. I don't know enough about what's it's needed for to know
if that's a problem...

So even if an index can store the parameter itself, there would need to
be a way for pg_dump to extract it.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpp5w9mVJ5or.pgp
Description: PGP signature


Re: [HACKERS] Table Partitioning is in 8.1

2005-09-21 Thread Martijn van Oosterhout
On Wed, Sep 21, 2005 at 06:10:15PM +0100, Simon Riggs wrote:
 Table Partitioning is in 8.1
 
 I've just read Peter Eisentraut's presentation to the Dutch gov (very
 good BTW). On the last page I read that Table Partitioning is a future
 for PostgreSQLwhich is strange because Constraint Exclusion is an
 8.1 feature.

I think it depends on how closely you relate Table Partitioning and
Constraint Exclusion.  While you can use the latter to implement the
former, I think people expect a database that supports Table
Partitioning to have a CREATE PARTITION command (or something
similar).

 Is it possible that the Release Notes do not fully explain the
 Constraint Exclusion feature? Or is it the consensus that it works but
 not quite well enough to make a song and dance about yet?

IMHO, I think Constraint Exclusion is more useful in general and once
it is in and tested future releases will have a partitioning system
based upon it.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpqOBQjmGisQ.pgp
Description: PGP signature


Re: [HACKERS] Releasing memory during External sorting?

2005-09-23 Thread Martijn van Oosterhout
On Fri, Sep 23, 2005 at 06:39:35PM +0200, Pailloncy Jean-Gerard wrote:
 On most platforms it's quite unlikely that any memory would actually
 get released back to the OS before transaction end, because the
 memory blocks belonging to the tuplesort context will be intermixed
 with blocks belonging to other contexts.  So I think this is pretty
 pointless. (If you can't afford to have the sort using all of
 sort_mem, you've set sort_mem too large, anyway.)

 On OpenBSD 3.8 malloc use mmap(2) and no more sbrk.
 So, as soon as the bloc is free, it returns to the OS.
 Access to the freed pointer crashs immediatly.

Interesting point. Glibc also uses mmap() but only for allocations
greater than a few K, otherwise it's a waste of space.

I guess you would have to look into the postgresql allocator to see if
it doesn't divide the mmap()ed space up between multiple contexts.
Large allocations certainly appear to be passed off to malloc() but I
don't think execSort allocates all it's space in one go, it just counts
the space allocated by palloc().

So, unless someone goes and adds changes the tuplesort code to allocate
big blocks and use them only for tuples, I think you're going to run
into issues with data interleaved, meaning not much to give back to the
OS...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpZK896EQRZq.pgp
Description: PGP signature


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 10:34:42AM +0200, Thomas Hallgren wrote:
 Oliver Jowett wrote:
 I assume this means you have a single lock serializing requests to the
 backend?
  
 Yes, of course. I also make sure that the main thread cannot return 
 until another thread that is servicing a backend request has completed. 
 There's absolutely no way two threads can execute backend code 
 simultaniously.

Ok, I have a question. PostgreSQL uses sigsetjmp/siglongjmp to handle
errors in the backend. If you're changing the stack, how do you avoid
the siglongjmp jumping back to a different stack? Or do you somehow
avoid this problem altogether?

 I though about that. The drawback is that each and every call must spawn 
 a new thread, no matter how trivial that call might be. If you do a 
 select from a table with 10,000 records and execute a function for each 
 record, you get 20,000 context switches. Avoiding that kind of overhead 
 is one of the motivating factors for keeping the VM in-process.

Well, on linux at least context switches are quite cheap. However, how
does Java handle the possibility that functions never return. Do you
wrap each call in a PG_TRY/PG_CATCH to propegate errors?

Tricky issues...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpWDeiFxvZex.pgp
Description: PGP signature


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 12:26:58PM +0200, Thomas Hallgren wrote:
 Yes. All backend exceptions are cought in a PG_CATCH and then propagated 
 to Java as a ServerException. If there's no catch in the Java code, they 
 are rethrown by the java_call_handler. This time with jump buffer that 
 was setup by the backend when it invoked the call_handler.
 
 There's also a barrier that will prevent any further calls from the Java 
 code once an exception has been thrown by the backend unless that call 
 was wrapped in a savepoint construct. A savepoint rollback will unlock 
 the barrier (this is not related to the thread issue of course).

Well, you seem to have dealt with the obvious issues I can see. I
imagine you need also to worry about things like signal handling. Is
there no way to reserve a stack just for PostgreSQL and switch to that
stack, rather than switch threads (although, the stack is really the
only thing that differentiates threads anyway...).

Linux has sigaltstack so you can catch the stack overflow signal (and
other signals obviously, but that's its main use), but it's not terribly
portable. What you really need to do is set the stack_base_ptr every
time you execute postgres with a new stack; that preserves existing
semantics.

Signals are the only way the kernel can pass control unexpectedly so if
you handle those, postgres would never know it's threaded. I do wonder
if there are any other assumptions made...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpAoT6QNSsBT.pgp
Description: PGP signature


Re: [HACKERS] stack depth limit exceeded problem.

2005-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 02:38:35PM +0200, Thomas Hallgren wrote:
 Martijn van Oosterhout wrote:
  Linux has sigaltstack so you can catch the stack overflow signal (and
  other signals obviously, but that's its main use), but it's not terribly
  portable.
 
 I rely on the signal handler that the JVM uses for page-faults (which a 
 stack overflow generally amounts to) and fpe exeptions so I know that 
 they will generate java exceptions in a controlled way (which I in turn 
 translate to elog(ERROR) on the main thread).

Well, actually, what I was thinking is if someone sends a -INT or -TERM
to the backend, which thread will catch it? You have to block it in
every thread except the one you want to catch it in if you want to
control it. This means that for any signal handler that PostgreSQL
installs, you need to intercept it with a wrapper function to make sure
it runs in the right stack.

Actually, while running backend code, you're probably fine since the
elog stuff will handle it. But if a signal is received while the JVM is
running, the signal handler will get the stack of the JVM. Now,
PostgreSQLs signal handlers tend not to do much so you may be safe.
They tend not to throws errors, but who knows...

Still, this is all solvable I think...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpLeqyX5gIeM.pgp
Description: PGP signature


[HACKERS] \x output blowing up

2005-09-24 Thread Martijn van Oosterhout
Hi,

On CVS tip, whenever I do \x output, it dies with an internal error in
glibc on free. If you run it under valgrind, it complains about these
lines of code:

700{  
701char *my_cell = pg_local_malloc(cell_w[i] + 1);
702
703 [Inv write 1 byte] strcpy(my_cell, *ptr);
704if (opt_align[i % col_count] == 'r'  
opt_numeric_locale)
705format_numeric_locale(my_cell);
706if (opt_border  2)
707fprintf(fout, %s\n, my_cell);
708else
709 [Inv read 1 byte]  fprintf(fout, %-s%*s |\n, my_cell, dwidth 
- cell_w[i], );
710free(my_cell);
711}

Now, apart from the fact that the cell width != strlen in multibyte
encodings, there must be something else because this is just select *
from pg_proc and there are no multiple characters there AFAIK. I can't
see it though.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpid8LNrp3Xr.pgp
Description: PGP signature


Re: [HACKERS] \x output blowing up

2005-09-24 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 11:45:08PM +0200, Martijn van Oosterhout wrote:
 Hi,
 
 On CVS tip, whenever I do \x output, it dies with an internal error in
 glibc on free. If you run it under valgrind, it complains about these
 lines of code:

snip

Ok, I worked out the direct cause, pg_wcswidth only returns the length
upto the first newline and the line it breaks on is the multiline
definition in _pg_expandarray.

The quick fix should be to only allocate memory if it's going to call
format_numeric_locale(), since then you know it's a number. It makes
the code slightly more convoluated but it should be slightly more
efficient.

I actually have a working psql that handles and displays newlines
properly, but it's too late for 8.1. It fixes all these issues
properly.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpSxRFAxuskf.pgp
Description: PGP signature


Re: [HACKERS] \x output blowing up

2005-09-25 Thread Martijn van Oosterhout
On Sat, Sep 24, 2005 at 07:18:16PM -0400, Bruce Momjian wrote:
 
 Well, it seems we are going to have to fix it somehow for 8.1.  It is
 not crashing here so I can't work up a patch.  Can you submit a minimal
 fix for 8.1?  Thanks.

Ah, it would only happen if your encoding was UTF-8 since that's the
only case psql handles differently. I've attached a patch which fixes
it. With a bit more rearrangement you could probably simplify it a bit
but this works.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.
Index: print.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/psql/print.c,v
retrieving revision 1.74
diff -c -r1.74 print.c
*** print.c 24 Sep 2005 17:53:27 -  1.74
--- print.c 25 Sep 2005 09:16:01 -
***
*** 697,714 
else
fputs( , fout);
  
{
char *my_cell = pg_local_malloc(cell_w[i] + 1);
  
strcpy(my_cell, *ptr);
!   if (opt_align[i % col_count] == 'r'  
opt_numeric_locale)
!   format_numeric_locale(my_cell);
if (opt_border  2)
fprintf(fout, %s\n, my_cell);
else
fprintf(fout, %-s%*s |\n, my_cell, dwidth - 
cell_w[i], );
free(my_cell);
}
}
  
if (opt_border == 2)
--- 697,726 
else
fputs( , fout);
  
+   /* We seperate the cases for numeric_locale because for UTF-8 
(and
+* other) encodings, cell_w = strlen(*ptr). We can't just use
+* strlen() in the malloc because there needs to be enough room 
for
+* format_numeric_locale() to store its result. */
+ 
+   if (opt_align[i % col_count] == 'r'  opt_numeric_locale)
{
char *my_cell = pg_local_malloc(cell_w[i] + 1);
  
strcpy(my_cell, *ptr);
!   format_numeric_locale(my_cell);
if (opt_border  2)
fprintf(fout, %s\n, my_cell);
else
fprintf(fout, %-s%*s |\n, my_cell, dwidth - 
cell_w[i], );
free(my_cell);
}
+   else
+   {
+   if (opt_border  2)
+   fprintf(fout, %s\n, *ptr);
+   else
+   fprintf(fout, %-s%*s |\n, *ptr, dwidth - 
cell_w[i], );
+   }
}
  
if (opt_border == 2)


pgpOenUi8TQtR.pgp
Description: PGP signature


[HACKERS] Questions about proper newline handling in psql output

2005-09-25 Thread Martijn van Oosterhout
Hi,

I basically have a functional version for aligned output, examples at
the bottom of this email. It handles multiline data values and
multiline headers. However, there are some areas where I could use some
input.

1. To be able to control the spacing, psql now has to be very careful
about its output. eg \r is printed as \r, ascii control characters are
output as \x00 style and other control chars as \u. This is a
change from previous behaviour, yet you're pretty much forced to if you
want to control the output.

Is this change acceptable?

2. Currently I've changed the aligned outputs but not the unaligned
ones. Given you're not worrying about alignment there anyway, why do
the work? Also, we recommend unaligned output for script users so I
don't feel right changing it.

Is this distinction acceptable?

3. How to show that a value is continued? As you can see below I use
':' before columns that have data. This obviously doesn't work for
first column if there's no outer border. If your border style is 0
you're totally out of luck.

I remember a discussion on this before but couldn't find it in the
archives. Either a reference or some other hints would be appreciated.

4. Some system output like pg_views has really really long strings,
would it be acceptable to change the output there to add newlines at
various places to make it output nicer with this change?

5. Auto string folding. If a string is really long, fold it so it fits
in a screen width, perhaps with '\' continuation. I havn't done this
but I can imagine some people (including me) would love it.

6. Currently I've implemented support for UTF-8 and all ASCII
compatable single-byte encodings. Given that psql didn't support the
others anyway maybe no-one cares, but I have to ask: does anyone care?
If so, I need info on *how* to support an encoding.

Thanks for your attention. See you tomorrow.

Query is: select oid, prosrc as HdrLine1
HdrLine2, proacl from pg_proc limit 1;

Border style is 1.
  oid  |  HdrLine1   | 
proacl 
   |  HdrLine2   |  
  
---+-+
 17009 | select 1 union all select 2 union all select 3 union all|  
 
   : select 4 union all select 5 union all select 6 union all   
 
   : select 7 union all select 8 union all select 9 union all   
 
   : select 10 union all select 11 union all select 12 union all
 
   : select 13 union all select 14 union all select 15 union all
 
   : select 16 union all select 17 union all select 18 union all
 
   : select 19 union all select 20 union all select 21 union all
 
   : select 22 union all select 23 union all select 24 union all
 
   : select 25 union all select 26 union all select 27 union all
 
   : select 28 union all select 29 union all select 30 union all
 
   : select 31 union all select 32  
 
(1 row)

Expanded display is on.
-[ RECORD 1 ]-
oid  | 17009
HdrLine1 | select 1 union all select 2 union all select 3 union all
HdrLine2 : select 4 union all select 5 union all select 6 union all
 : select 7 union all select 8 union all select 9 union all
 : select 10 union all select 11 union all select 12 union all
 : select 13 union all select 14 union all select 15 union all
 : select 16 union all select 17 union all select 18 union all
 : select 19 union all select 20 union all select 21 union all
 : select 22 union all select 23 union all select 24 union all
 : select 25 union all select 26 union all select 27 union all
 : select 28 union all select 29 union all select 30 union all
 : select 31 union all select 32
proacl   | 

# select chr(8);
 chr  
--
 \x08
(1 row)


-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpMYIVtyDxCp.pgp
Description: PGP signature


Re: [HACKERS] expected authentication request from server, but received...

2005-09-26 Thread Martijn van Oosterhout
On Mon, Sep 26, 2005 at 09:59:53AM +0200, Micha? J?czalik wrote:
 Connection with psql works OK. Connecting from different (older) hosts 
 with DBI/PHP works perfect as well. Something must have changed in recent 
 versions of libpq and now it doesn't work. I looked into source code and 
 found that the server's message at the very start of conversation appears 
 to be something else than the libpq expects. But how to fix it?! Has the 
 protocol change or what?

Umm, I think the protocol version is up to 3 or 4 now. I think libpq
supports all the way back to 7.0, I don't know if anyone tests earlier
than that.

If you really need this to work, I suggest pulling the libpq from that
release (CVS or tarball) or slightly later and installing that on the
new machines.

But you should upgrade, 6.4 hasn't had support for year, who know how
many bugs...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpteoQYstgAk.pgp
Description: PGP signature


Re: [HACKERS] Questions about proper newline handling in psql output

2005-09-27 Thread Martijn van Oosterhout
On Tue, Sep 27, 2005 at 03:02:29PM +0200, Peter Eisentraut wrote:
 I don't think this is necessary.  If you put control characters into your 
 text, then you clearly don't care about aligned output, so you don't get any. 
  
 About \r, I think that needs to be figured into the alignment calculation, to 
 work well on Windows and Mac.

Good point. On UNIX however, outputting a \r will jump back to the
beginning of the line, overwriting whatever was there. But you raise a
good question, should the output of psql be console specific. i.e. if
you insert a value with unix newlines then on a Mac the output won't
have linebreaks.

However, part of this discussion was because we were considering
changing the output of \df to display more info using multiple lines.
And if so we want it to display consistantly on all platforms, right?

My argument for the other control characters is: given the work to make
this work for \r, \t, and \n needs all this anyway, why not just fix it
for *all* control characters in one go and be completely solved of the
problem, for now and forever. If someone embeds the control characters
to change the title of your xterm, change the font, clear the screen,
etc should psql just blat that out? I realise it does it now and that
it's not a strong argument, but since we're here already...

  3. How to show that a value is continued? As you can see below I use
  ':' before columns that have data. This obviously doesn't work for
  first column if there's no outer border. If your border style is 0
  you're totally out of luck.
 
 I think you need to keep the normal delimiter and need some extra mark within 
 the table cells.  (Think about how it would have to look in an HTML table.)

Umm, I wasn't thinking of changing the HTML output at all, it doesn't
need it since whatever displays the HTML will take care of alignment.
Same for troff and CSV. Straight aligned text output is the only one we
care about AFAICS (and the only one that requires work to make it
happen).

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpkWnlydcyP5.pgp
Description: PGP signature


Re: [HACKERS] Questions about proper newline handling in psql output

2005-09-27 Thread Martijn van Oosterhout
On Tue, Sep 27, 2005 at 12:12:15AM -, Greg Sabino Mullane wrote:
  4. Some system output like pg_views has really really long strings,
  would it be acceptable to change the output there to add newlines at
  various places to make it output nicer with this change?
 
 I'd say no, until you propose a more concrete set of rules on how and when.

Well, I was thinking before the keywords SELECT, FROM, WHERE, AND,
ORDER BY, GROUP BY and HAVING. For bonus indent subqueries also. But
I'm not too fussed, it was just a thought. In fact, we already do it
for the output of \d for views, some maybe a hint from there...

  Query is: select oid, prosrc as HdrLine1 HdrLine2, proacl from pg_proc 
  limit 1;
 
 Some smaller samples with a third column (and a large 2nd one) might be nice.

Ok, multiline strings in the pg_catalog are thin on the ground, I was
hoping to use real data rather than stuff I made up. (Real data has a
habit of showing weaknesses far better than things you dream up). But
I'll give it a shot.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpJ5nCy3zGIr.pgp
Description: PGP signature


Re: [HACKERS] effective SELECT from child tables

2005-09-28 Thread Martijn van Oosterhout
On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:
 Your suggestion is essentially the same as mine.. 
 There exists tableoid, pretty much suited to tell between tables in the case
 of inheritance.. I can't see a real need to add a special class
 classifier to each table..
 This solution is a workaround. It will work, just can't make myself love it.

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form tableoid = X to exclude
other tables...

You know, assume each table has a constraint tableoid = OID.

Still, it is a fairly unusual feature.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp13avInTZnX.pgp
Description: PGP signature


Re: [HACKERS] [PATCHES] Proposed patch for sequence-renaming problems

2005-09-29 Thread Martijn van Oosterhout
On Wed, Sep 28, 2005 at 10:41:12PM -0400, Bruce Momjian wrote:
 
 Also, why is the nextval ::text casting output by pg_dump anyway?

AFAICS, pg_dump outputs serial (at least in 7.4.7 which is what I
have to hand) when it should meaning that dumps restored will get the
new syntax anyway. Or am I missing something?

Isn't this what adddepend was for? I can beleive there are version
dependancies here...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpLmwKsmtxrG.pgp
Description: PGP signature


Re: [HACKERS] Socket problem using beta2 on Windows-XP

2005-09-29 Thread Martijn van Oosterhout
On Thu, Sep 29, 2005 at 08:50:30AM +0200, Thomas Hallgren wrote:
 Hi,
 I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box. 
 It runs fine but I get repeated messages like this in the log:
 
   2005-09-29 00:41:09 FATAL:  could not duplicate socket 1880 for use 
 in backend: error code 10038

That's from postmaster.c:write_inheritable_socket(). Error 10038 is
WSAENOTSOCK. Very odd, time to get out the debugger? Get a backtrace at
least.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpH0H2mRTE8a.pgp
Description: PGP signature


Re: [HACKERS] Found small issue with OUT params

2005-09-30 Thread Martijn van Oosterhout
On Fri, Sep 30, 2005 at 10:20:34AM -0500, Tony Caduto wrote:
 Tom,
 I hardly think the overhead would be significant on modern processors, I 
 don't think the majority of users are running on Pentium 90s.( I am 
 assuming you mean a performance overhead)

Um, please read the documention. Returning a tuple is *significantly*
more expensive than returning a single value. You have to get the tuple
descriptor, allocate memory for the tuple, fill in all the fields with
your data... For a single value you just return it.

See here for all the details, you really don't want to do it if you
don't need to.

http://www.postgresql.org/docs/8.0/interactive/xfunc-c.html#AEN30497

Now, you could fudge the parser to automatically alter the name of the
value in the function but I'm have no idea how hard that would be...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp0BOnOk9s4S.pgp
Description: PGP signature


Re: [HACKERS] effective SELECT from child tables

2005-10-01 Thread Martijn van Oosterhout
On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:
 On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:
  Include the Discriminator as a column in A and it will be inherited by
  all A1, A2, A3. 
  e.g. concrete_class char(1) not null
 snip
  This will add 1 byte per row in your superclass... and requires no
 
 I thought char was actually stored variable-length...? I know there's a
 type that actually acts like char does on most databases, but I can't
 remember what it is off-hand (it should be mentioned in docs 8.3...)

IIRC, this is the difference between char and char(1). The latter is
variable length and can store any character per current encoding, hence
the variable length. char on the other hand is a one byte (presumably
ASCII) character. It's used mainly in the system catalogs...
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpNCkP1i0Zwi.pgp
Description: PGP signature


Re: [HACKERS] effective SELECT from child tables

2005-10-01 Thread Martijn van Oosterhout
On Sat, Oct 01, 2005 at 10:05:22AM -0400, [EMAIL PROTECTED] wrote:
 It has the 'side or additional benefit' being requested here. The ability
 to filter the child table by some attribute. For example, if the child
 tables are used for partitioning, and the attribute were to keep a date
 range, the field restriction optimization could be used to automatically
 determine the set of tables to use for the date range specified. With
 such a change, it would even work automatically if the date ranges
 overlapped for some reason. Selecting a table name by date is hacky. This
 sort of solution would be a general solution to the problem.

This is what Constraint Exclusion does. It uses CHECK constraints on
a table to filter out tables that obviously don't apply to a query.
It's just the the specific case of tableoid = XXX is not supported
right now.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpYstb0MItkV.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Martijn van Oosterhout
On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote:
 Assuming we get the abyssmal physical IO performance fixed...
 (because until we do, _nothing_ is going to help us as much)

I'm still not convinced this is the major problem. For example, in my
totally unscientific tests on an oldish machine I have here:

Direct filesystem copy to /dev/null
21MB/s10% user 50% system  (dual cpu, so the system is using a whole CPU)

COPY TO /dev/null WITH binary
13MB/s55% user 45% system  (ergo, CPU bound)

COPY TO /dev/null
4.4MB/s   60% user 40% system

\copy to /dev/null in psql
6.5MB/s   60% user 40% system

This machine is a bit strange setup, not sure why fs copy is so slow.
As to why \copy is faster than COPY, I have no idea, but it is
repeatable. And actually turning the tuples into a printable format is
the most expensive. But it does point out that the whole process is
probably CPU bound more than anything else.

So, I don't think physical I/O is the problem. It's something further
up the call tree. I wouldn't be surprised at all it it had to do with
the creation and destruction of tuples. The cost of comparing tuples
should not be underestimated.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpEek2beyKVc.pgp
Description: PGP signature


Re: [HACKERS] effective SELECT from child tables

2005-10-01 Thread Martijn van Oosterhout
Hmm, I'm trying to understand here. If every row in log_8 should have
the same project_id, couldn't this be acheived by having each row in log_other
contain the tableoid of the table it refers to. Then a join will return
the info you're looking for.

Or am I missing something?

On Sat, Oct 01, 2005 at 10:57:27AM -0500, Jim C. Nasby wrote:
 To clarify, this is a hard-coded implementation of what I'm asking for:
 http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:
 
 CREATE TABLE log_other (
 project_id  smallint NOT NULL
 ...
 )
 
 CREATE TABLE log_8 (
 -- No project_id
 ...
 )
 CREATE TABLE log_24, log_25, log_5...
 CREATE VIEW log AS
 SELECT * FROM log_other
 UNION ALL SELECT 8 AS project_id, * FROM log_8
 ...
 
 So the end result is that for cases where project_id is 5, 8, 24, or 25,
 the data will be stored in tables that don't have the project_id.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp2qncxxNGXp.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-01 Thread Martijn van Oosterhout
[removed -performance, not subscribed]

On Sat, Oct 01, 2005 at 01:42:32PM -0400, Ron Peacetree wrote:
 You have not said anything about what HW, OS version, and pg version
 used here, but even at that can't you see that something Smells Wrong?

Somewhat old machine running 7.3 on Linux 2.4. Not exactly speed
daemons but it's still true that the whole process would be CPU bound
*even* if the O/S could idle while it's waiting. PostgreSQL used a
*whole CPU* which is its limit. My point is that trying to reduce I/O
by increasing CPU usage is not going to be benficial, we need CPU usage
down also.

Anyway, to bring some real info I just profiled PostgreSQL 8.1beta
doing an index create on a 2960296 row table (3 columns, table size
317MB).

The number 1 bottleneck with 41% of user time is comparetup_index. It
was called 95,369,361 times (about 2*ln(N)*N). It used 3 tapes. Another
15% of time went to tuplesort_heap_siftup.

The thing is, I can't see anything in comparetup_index() that could
take much time. The actual comparisons are accounted elsewhere
(inlineApplySortFunction) which amounted to 10% of total time. Since
nocache_index_getattr doesn't feature I can't imagine index_getattr
being a big bottleneck. Any ideas what's going on here?

Other interesting features:
- ~4 memory allocations per tuple, nearly all of which were explicitly
freed
- Things I though would be expensive, like: heapgettup and
myFunctionCall2 didn't really count for much.

Have a nice weekend,

  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 43.63277.81   277.81 95370055 0.00 0.00  comparetup_index
 16.24381.24   103.43  5920592 0.00 0.00  tuplesort_heap_siftup
  3.76405.1723.93 95370055 0.00 0.00  inlineApplySortFunction
  3.18425.4220.26 95370056 0.00 0.00  btint4cmp
  2.82443.3717.95 11856219 0.00 0.00  AllocSetAlloc
  2.52459.4416.07 95370055 0.00 0.00  myFunctionCall2
  1.71470.3510.91  2960305 0.00 0.00  heapgettup
  1.26478.38 8.03 11841204 0.00 0.00  GetMemoryChunkSpace
  1.14485.67 7.29  5920592 0.00 0.00  tuplesort_heap_insert
  1.11492.71 7.04  2960310 0.00 0.00  index_form_tuple
  1.09499.67 6.96 11855105 0.00 0.00  AllocSetFree
  0.97505.83 6.17 23711355 0.00 0.00  AllocSetFreeIndex
  0.84511.19 5.36  5920596 0.00 0.00  LogicalTapeWrite
  0.84516.51 5.33  2960314 0.00 0.00  slot_deform_tuple
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpfhYaBi5xFJ.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-02 Thread Martijn van Oosterhout
On Sat, Oct 01, 2005 at 11:26:07PM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  Anyway, to bring some real info I just profiled PostgreSQL 8.1beta
  doing an index create on a 2960296 row table (3 columns, table size
  317MB).
 
 3 columns in the index you mean?  What were the column datatypes?
 Any null values?

Nope, three columns in the table, one column in the index, no nulls.
The indexed column was integer. I did it once with around 6500 values
repeated over and over, lots of duplicate kays. And once on a serial
column but it made no descernable difference either way. Although the
comparison function was called less (only 76 million times), presumably
because it was mostly sorted already.

  The number 1 bottleneck with 41% of user time is comparetup_index.
  ...
  The thing is, I can't see anything in comparetup_index() that could
  take much time.
 
 The index_getattr and heap_getattr macros can be annoyingly expensive.

And yet they are optimised for the common case. nocache_index_getattr
was only called 7 times, which is about what you expect. I'm getting
annotated output now, to determine which line takes the time...
Actually, my previous profile overstated stuff a bit. Profiling turned
off optimisation so I put it back and you get better results but the
order doesn't change much. By line results are below.

The top two are the index_getattr calls in comparetup_index. Third and
fourth are the HEAPCOMPARES in tuplesort_heap_siftup. Then comes the
inlineApplySortFunction call (which isn't being inlined, despite
suggesting it should be, -Winline warns about this).

Looks to me that there are no real gains to be made in this function.
What is needed is an algorithmic change to call this function less
often...

Have a nice weekend,

  %   cumulative   self  self total   
 time   seconds   secondscalls  ms/call  ms/call  name
  9.40 22.5622.56 comparetup_index 
(tuplesort.c:2042 @ 8251060)
  5.07 34.7312.17 comparetup_index 
(tuplesort.c:2043 @ 82510c0)
  4.73 46.0911.36 tuplesort_heap_siftup 
(tuplesort.c:1648 @ 825074d)
  3.48 54.45 8.36 tuplesort_heap_siftup 
(tuplesort.c:1661 @ 82507a9)
  2.80 61.18 6.73 comparetup_index 
(tuplesort.c:2102 @ 8251201)
  2.68 67.62 6.44 comparetup_index 
(tuplesort.c:2048 @ 8251120)
  2.16 72.82 5.20 tuplesort_heap_siftup 
(tuplesort.c:1652 @ 825076d)
  1.88 77.34 4.52 76025782 0.00 0.00  comparetup_index 
(tuplesort.c:2016 @ 8251010)
  1.82 81.70 4.36 76025782 0.00 0.00  inlineApplySortFunction 
(tuplesort.c:1833 @ 8251800)
  1.73 85.85 4.15 readtup_heap 
(tuplesort.c:2000 @ 8250fd8)
  1.67 89.86 4.01 AllocSetAlloc (aset.c:568 
@ 824bec0)
  1.61 93.72 3.86 comparetup_index 
(tuplesort.c:2025 @ 825102f)
  1.47 97.25 3.53 76025785 0.00 0.00  btint4cmp 
(nbtcompare.c:74 @ 80924a0)
  1.11 99.92 2.67 readtup_datum 
(tuplesort.c:2224 @ 82517c4)
  1.10102.55 2.64 comparetup_index 
(tuplesort.c:2103 @ 82511e7)

  %   cumulative   self  self total   
 time   seconds   secondscalls   s/call   s/call  name
 28.34 68.0168.01 76025782 0.00 0.00  comparetup_index
 13.56100.5432.53  7148934 0.00 0.00  tuplesort_heap_siftup
  8.66121.3320.79 76025782 0.00 0.00  inlineApplySortFunction
  4.43131.9610.63 13084567 0.00 0.00  AllocSetAlloc
  3.73140.90 8.94 76025785 0.00 0.00  btint4cmp
  2.15146.07 5.17  6095625 0.00 0.00  LWLockAcquire
  2.02150.92 4.85  2960305 0.00 0.00  heapgettup
  1.98155.66 4.74  7148934 0.00 0.00  tuplesort_heap_insert
  1.78159.94 4.28  2960312 0.00 0.00  slot_deform_tuple
  1.73164.09 4.15 readtup_heap
  1.67168.09 4.00  6095642 0.00 0.00  LWLockRelease
  1.53171.76 3.68  2960308 0.00 0.00  index_form_tuple
  1.44175.21 3.45 13083442 0.00 0.00  AllocSetFree
  1.28178.28 3.07  8377285 0.00 0.00  LogicalTapeWrite
  1.25181.29 3.01  8377285 0.00 0.00  LogicalTapeRead
  1.11183.96 2.67 readtup_datum
  1.06186.51 2.551 2.55   123.54  IndexBuildHeapScan

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do

Re: [HACKERS] effective SELECT from child tables

2005-10-02 Thread Martijn van Oosterhout
On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote:
 On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote:
  Of course that only works if the reason they want to set fill the rows with
  the default value isn't precisely because NULL is a perfectly reasonable 
  thing
  for the column to have (but not what they want for the existing rows).
 
 Sure.
 
 What would be needed for adding new colums with default filling would be
 some end-of-tuple marker or stored column count or tuple version nr, and
 then a rule (or just default behaviour) of showing default value for
 *missing* columns (colno  nr of stored columns).

Actually, PostgreSQL does know the number of columns in a tuple. It
would be possible get change heap_getattr to return the default value.

However, from a semantic point of view, it would be a bit strange. If
you added a column, updated some rows then set a default, that default
might end up applying to every row, except the ones you already
modified. With careful coding you may be able to get around this.

However, a good argument can be made that setting the DEFAULT for a
column shouldn't change data anywhere.  What about if I want to change
the default for new values but not for old ones. That wouldn't work if
the database starts adding values randomly, depending on when they are
read...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpxc3tWm4D4W.pgp
Description: PGP signature


Re: [HACKERS] Socket problem using beta2 on Windows-XP

2005-10-02 Thread Martijn van Oosterhout
On Sun, Oct 02, 2005 at 12:20:05PM +0200, Thomas Hallgren wrote:
 I added some traces to the code. I know that the following happens when 
 I start a postmaster.

snip

 In the second iteration of ServerLoop, pgstat_forkexec will again call  
 will call internal_fork_exec. This time it fails.
 According to the log it fails on line:
 
write_inheritable_socket(param-pgStatSock, pgStatSock, childPid);

Well, pgStatSock is the only SOCK_DGRAM socket, all the others are
SOCK_STREAM, maybe that's the difference? It's also connected to
itself, although for DGRAM sockets that's not that special.

The documentation isn't totally clear about this. Yet the error thrown
should terminate the process, yet it obviously isn't. Very odd.

Any Windows programmers with ideas?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpKbZv1lHAcM.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-02 Thread Martijn van Oosterhout
Ok, I tried two optimisations:

1. By creating a special version of comparetup_index for single key
integer indexes. Create an index_get_attr with byval and len args. By
using fetch_att and specifying the values at compile time, gcc
optimises the whole call to about 12 instructions of assembly rather
than the usual mess.

2. By specifying: -Winline -finline-limit-1500 (only on tuplesort.c).
This causes inlineApplySortFunction() to be inlined, like the code
obviously expects it to be.

default build (baseline)235 seconds
-finline only   217 seconds (7% better)
comparetup_index_fastbyval4 only221 seconds (6% better)
comparetup_index_fastbyval4 and -finline203 seconds (13.5% better)

This is indexing the integer sequence column on a 2.7 million row
table. The times are as given by gprof and so exclude system call time.

Basically, I recommend adding -Winline -finline-limit-1500 to the
default build while we discuss other options.

comparetup_index_fastbyval4 patch attached per example.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.
--- pgsql-clean/src/include/access/itup.h   2005-10-02 21:30:20.327464320 
+0200
+++ pgsql-sort/src/include/access/itup.h2005-10-02 16:04:00.0 
+0200
@@ -126,6 +126,34 @@
) \
 )
 
+#define index_get_attr(tup, attnum, tupleDesc, attbyval, attlen, isnull) \
+( \
+   AssertMacro(PointerIsValid(isnull)  (attnum)  0), \
+   *(isnull) = false, \
+   !IndexTupleHasNulls(tup) ? \
+   ( \
+   (tupleDesc)-attrs[(attnum)-1]-attcacheoff = 0 ? \
+   ( \
+   fetch_att((char *) (tup) + 
IndexInfoFindDataOffset((tup)-t_info) \
+   + (tupleDesc)-attrs[(attnum)-1]-attcacheoff, 
attbyval, attlen) \
+   ) \
+   : \
+   nocache_index_getattr((tup), (attnum), (tupleDesc), 
(isnull)) \
+   ) \
+   : \
+   ( \
+   (att_isnull((attnum)-1, (char *)(tup) + 
sizeof(IndexTupleData))) ? \
+   ( \
+   *(isnull) = true, \
+   (Datum)NULL \
+   ) \
+   : \
+   ( \
+   nocache_index_getattr((tup), (attnum), (tupleDesc), 
(isnull)) \
+   ) \
+   ) \
+)
+
 
 /* routines in indextuple.c */
 extern IndexTuple index_form_tuple(TupleDesc tupleDescriptor,
--- pgsql-clean/src/backend/utils/sort/tuplesort.c  2005-09-24 
23:23:39.0 +0200
+++ pgsql-sort/src/backend/utils/sort/tuplesort.c   2005-10-02 
21:29:39.349086302 +0200
@@ -375,6 +375,8 @@
 unsigned int len);
 static int comparetup_index(Tuplesortstate *state,
 const void *a, const void *b);
+static int comparetup_index_fastbyval4(Tuplesortstate *state,
+const void *a, const void *b);
 static void *copytup_index(Tuplesortstate *state, void *tup);
 static void writetup_index(Tuplesortstate *state, int tapenum, void *tup);
 static void *readtup_index(Tuplesortstate *state, int tapenum,
@@ -498,8 +500,12 @@
  int workMem, bool randomAccess)
 {
Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess);
+   TupleDesc   tupDes = RelationGetDescr(indexRel);
 
-   state-comparetup = comparetup_index;
+   if( tupDes-natts == 1  tupDes-attrs[0]-attbyval == 1  
tupDes-attrs[0]-attlen == 4 )
+   state-comparetup = comparetup_index_fastbyval4;
+   else
+   state-comparetup = comparetup_index;
state-copytup = copytup_index;
state-writetup = writetup_index;
state-readtup = readtup_index;
@@ -2102,6 +2108,92 @@
return 0;
 }
 
+static int
+comparetup_index_fastbyval4(Tuplesortstate *state, const void *a, const void 
*b)
+{
+   /*
+* This is almost the same as _bt_tuplecompare(), but we need to keep
+* track of whether any null fields are present.  Also see the special
+* treatment for equal keys at the end.
+*/
+   IndexTuple  tuple1 = (IndexTuple) a;
+   IndexTuple  tuple2 = (IndexTuple) b;
+   Relationrel = state-indexRel;
+   ScanKey scankey = state-indexScanKey;
+   TupleDesc   tupDes;
+   boolequal_hasnull = false;
+
+   tupDes = RelationGetDescr(rel);
+
+   ScanKey entry = scankey[0];
+   Datum   datum1,
+   datum2;
+   boolisnull1,
+   isnull2;
+   int32   compare;
+
+   datum1 = index_get_attr(tuple1, 1, tupDes, 1, 4, isnull1

Re: [HACKERS] effective SELECT from child tables

2005-10-02 Thread Martijn van Oosterhout
On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote:
 Right. Actually the default value returned for missing columns should
 be different from the default for new values and should be settable only
 once, when adding the column, else issues would become really really
 weird.

Right, the only way I could really imagine it working is have a flag,
attautodefault which if set would return the default instead of NULL.
It would only ever be set if the ADD COLUMN SET DEFAULT happened
together. But does this mean you have one magic default fixed for the
life of the column and the actual default which can be changed anytime?
Seems messy, though possible...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpf0P5jSQf7Z.pgp
Description: PGP signature


Re: [HACKERS] PG Killed by OOM Condition

2005-10-03 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 11:03:06PM +1000, John Hansen wrote:
 Might it be worth while protecting the postmaster from an OOM Kill on
 Linux by setting /proc/{pid}/oom_adj to -17 ?
 (Described vaguely in mm/oom_kill.c)

Has it actually happened to you? PostgreSQL is pretty good about its
memory usage. Besides, seems to me it should be an system admisitrator
descision.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpZ0ko2iHAwg.pgp
Description: PGP signature


Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-03 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 09:35:30PM +0100, Simon Riggs wrote:
 Based upon profiling of the initial stage of external sorting, it seems
 that this stage is overall CPU bound, with hotspots in comparetup_*
 accounting for around 50% of CPU time; lets just call that too much,
 since your exact experience may vary.

Indeed, however as I pointed out, if you arrange for
inlineApplySortFunction() actually be inlined, you can cut costs,
especially in the index creation case.

snip
 values from that could be cached ready for the next call. Caching would
 reduce number of *_getattr calls from 2N to N+1, where N is likely to go

My profiling indicates that the second getattr is half the cost of the
first, gcc optimisation at work. Note that setting CFLAGS=-pg for
configure disables optimisation, I missed that the first time.
Ofcourse, every call saved is time saved.

 2. In comparetup_ the second attr value is always fetched, even when the
 first attr is null. When the first attr is null the value of the second
 need never be checked, just whether the second attr is null or not, so
 the full cost of the *_getattr need not actually be paid at all. The
 relevance of this is not reduced as a result of the caching suggested in
 (1).

Actually, attribute is null is the cheap case because you only need to
check the bitmap. But you could optimise stuff by expanding the
*_getattr calls and optimising directly. Possible problem with caching:
if you're called by the system qsort, can you assume anything about the
order of the comparisons?

Please note: if inlineApplySortFunction() is actually inlined (it isn't
by default), gcc does get very smart about this and sometimes optimises
out the Datum fetches depending on the isNull flags. So we need to
check we're actually making an improvement over the compiler.

snip

 is a subset of the PK (a typical one-many relationship) and groupings
 also. In the majority of cases, these attrs are at the start of a tuple.
 The *_getattr macros are particularly poor at handling NULLs. When
 *_getattr sees *any* NULL is present for a tuple it checks the
 nullability of all attrs up to the current attrnum before returning
 using the cached offsets. The macro could be altered so that if the
 current attrnum  firstNullableAttrnum (which we can set once for the

Maybe easier, in the macro use: bitmap  ((1attnum)-1) to quickly
check that no nulls precede the value we're looking for and hence we
can use the fast path anyway. Along the lines of:

#define index_getattr(tup, attnum, tupleDesc, isnull) \
( \
AssertMacro(PointerIsValid(isnull)  (attnum)  0), \
*(isnull) = false, \
!IndexTupleHasNulls(tup) || (attnum  32  (NullBitmap(tup)  
((1attnum)-1)) == 0 ) ? \
( \
(tupleDesc)-attrs[(attnum)-1]-attcacheoff = 0 ? \
 

Nice ideas though, a seperate run just for NULL keys is interesting. If
you only have one sort key it becomes a whole tape which doesn't need
to be sorted anymore, just emit it at the beginning or end. Could be
helpful.

Mind you, if you start creating seperate routines for different cases
you can go a long way. Elsewhere on this list I created a special case
for single-key integer index columns and got an 8% speed increase. Not
exactly a viable solution though.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpWTkP5cRja6.pgp
Description: PGP signature


Re: [HACKERS] PG Killed by OOM Condition

2005-10-04 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 11:47:57PM -0700, Jeff Davis wrote:
 I think that I've run into the OOM killer without a fork() being
 involved, but I could be wrong. Is it possible to be hit by the OOM
 killer if no applications use fork()?

fork() is the obvious overcomitter. If Netscape wants to spawn a new
process, it first has to fork 50MB of memory, then free probably most
of it because it execs some little plugin. If processes mmap() a large block
and then doesn't use it until later. Similar idea with brk(). If you
run out of swap at the wrong moment... Recent versions are more clever
about who to kill. Sometimes you just get unlucky...

It's always killed the right process for me (Mozilla derivative leaked
masses of memory over long period).
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgppqV21TxiXq.pgp
Description: PGP signature


Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 12:37:51AM +0100, Simon Riggs wrote:
 On Mon, 2005-10-03 at 23:25 +0200, Martijn van Oosterhout wrote:
  Please note: if inlineApplySortFunction() is actually inlined (it isn't
  by default)
 
 Can you explain your last post some more. Thats not what I get.

The inline keyword is just a flag that you would like the compiler to
inline it. GCC will decide itself. It has a limit on the size of
functions that it will consider for inlining.

Quote the gcc manual:

-finline-limit-N
 By default, gcc limits the size of functions that can be inlined.
 This flag allows the control of this limit for functions that are
 explicitly marked as inline (ie marked with the inline keyword or
 defined within the class definition in c++).  N is the size of
 functions that can be inlined in number of pseudo instructions
 (not counting parameter handling).  

It goes in to say that the limit is 1 for gcc 2.95, but if you
examine the manual for gcc 3.3 it has the limit at 600. So it's
entirely possible that at the time the person wrote that code, it *was*
being inlined, but it sure isn't on some versions of some compilers. I
experimented and found that -finline-limit-1500 causes it to start
inlining.

The -Winline flag causes gcc to print a warning if you specified
inline but gcc didn't inline it, for whatever reason.

Hopefully this clears that up.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpSmmXmACi5l.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-04 Thread Martijn van Oosterhout
On Mon, Oct 03, 2005 at 10:51:32PM +0100, Simon Riggs wrote:
  Basically, I recommend adding -Winline -finline-limit-1500 to the
  default build while we discuss other options.
 
 I add -Winline but get no warnings. Why would I use -finline-limit-1500?
 
 I'm interested, but uncertain as to what difference this makes. Surely
 using -O3 works fine?

Different versions of gcc have different ideas of when a function can
be inlined. From my reading of the documentation, this decision is
independant of optimisation level. Maybe your gcc version has a limit
higher than 1500 by default.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgppgbguCvZK1.pgp
Description: PGP signature


Re: [HACKERS] Tuning current tuplesort external sort code for 8.2

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 11:55:58AM +0200, Martijn van Oosterhout wrote:
 It goes in to say that the limit is 1 for gcc 2.95, but if you
 examine the manual for gcc 3.3 it has the limit at 600. So it's
 entirely possible that at the time the person wrote that code, it *was*
 being inlined, but it sure isn't on some versions of some compilers. I
 experimented and found that -finline-limit-1500 causes it to start
 inlining.

From searching the web, it appears the inline limit was dropped from
1 to 600 between gcc 3.0.0 and 3.0.1 in response to complaints
about gcc memory usage. Any function that could be inlined needed to be
kept in memory in semicompiled form and in C++ where lots of inlinable
functions call eachother, the memory usage blew up completely.

The difference between -O2 and -O3 is that the latter will consider any
function for inlining, even if you didn't ask. For C programs that
basically means any function declared static. Also, the number is
pseudo-instructions and their meaning can change from version to
version.

Since we're pretty much relying on gcc to inline for performance, I
still think we should add -Winline by default so we can tell when it's
not doing what we want.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpkTvFoZ4PXA.pgp
Description: PGP signature


Re: [HACKERS] memory bug debugging

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 01:11:41PM +0200, Markus Schiltknecht wrote:
 Hello hackers,
 
 I'm fiddling around with the backend and have created a memory bug. I
 guess I'm overriding a palloced chunk, but can't figure out where.

There are some defines (MEMORY_CONTEXT_CHECKING and
CLOBBER_FREED_MEMORY) which can help find leaks. Valgrind works too,
with a bit of attention. I was going to add some directives to allow
Valgrind to handle PostgreSQL's memory allocator, but have got there
yet.

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpM1s4YvBPVo.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 12:24:54PM +0100, Simon Riggs wrote:
 How did you determine the 1500 figure? Can you give some more info to
 surround that recommendation to allow everybody to evaluate it?

[EMAIL PROTECTED]:~/dl/cvs/pgsql-local/src/backend/utils/sort$ gcc 
-finline-limit-1000 -Winline -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wendif-labels -fno-strict-aliasing -g -I../../../../src/include -D_GNU_SOURCE  
 -c -o tuplesort.o tuplesort.c
tuplesort.c: In function 'applySortFunction':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:1906: warning: called from here
tuplesort.c: In function 'comparetup_heap':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:1937: warning: called from here
tuplesort.c: In function 'comparetup_index':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:2048: warning: called from here
tuplesort.c: In function 'comparetup_datum':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:2167: warning: called from here
[EMAIL PROTECTED]:~/dl/cvs/pgsql-local/src/backend/utils/sort$ gcc 
-finline-limit-1500 -Winline -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wendif-labels -fno-strict-aliasing -g -I../../../../src/include -D_GNU_SOURCE  
 -c -o tuplesort.o tuplesort.c
no warnings

A quick binary search puts the cutoff between 1200 and 1300. Given
version variation I picked a nice round number, 1500.

Ugh, that's for -O2, for -O3 and above it needs to be 4100 to work.
Maybe we should go for 5000 or so.

I'm using: gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpadXPF53tUp.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 10:06:24AM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  I'm using: gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
 
 I don't know what the units of this number are, but it's apparently far
 too gcc-version-dependent to consider putting into our build scripts.
 Using gcc version 4.0.1 20050727 (current Fedora Core 4 compiler) on
 i386, and compiling tuplesort.c as you did, I find:
   -O2: warning goes away between 800 and 900
   -O3: warning is always there (tried values up to 1000)
 (the latter behavior may indicate a bug, not sure).

Facsinating. The fact that the warning goes away if you don't specify
-finline-limit seems to indicate they've gotten smarter. Or a bug.
We'd have to check the asm code to see if it's actually inlined or
not.

Two options:
1. Add -Winline so we can at least be aware of when it's (not) happening.
2. If we can't get gcc to reliably inline, maybe we need to consider
other options?

In particular, move the isNull test statements out since they are ones
the optimiser can use to best effect.

Add if we put in -Winline, it would be visible to users while
compiling so they can tweak their own build options (if they care).
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgppKVdbv5luV.pgp
Description: PGP signature


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-04 Thread Martijn van Oosterhout
On Tue, Oct 04, 2005 at 03:56:53PM +0100, Simon Riggs wrote:
 I've been using gcc 3.4 and saw no warning when using either -Winline
 or -O3 -Winline.

Ok, I've just installed 3.4 and verified that. I examined the asm code
and gcc is inlining it. I concede, at this point just throw in -Winline
and monitor the situation.

As an aside, the *_getattr calls end up a bit suboptimal though. It's
producing code like:

  cmp attlen, 4
  je $elsewhere1
  cmp attlen, 2
  je $elsewhere2
  ld byte
here:

--- much later ---
elsewhere1:
  ld integer
  jmp $here
elsewhere2:
  ld short
  jmp $here

No idea whether we want to go down the path of hinting to gcc which
size will be the most common.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp4B0wVNkzsI.pgp
Description: PGP signature


  1   2   3   4   5   6   7   8   9   10   >