Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-28 Thread Lincoln Yeoh

At 11:48 PM 8/27/2007, Trevor Talbot wrote:

On 8/27/07, Jonah H. Harris [EMAIL PROTECTED] wrote:
 On 8/27/07, Tom Lane [EMAIL PROTECTED] wrote:
  that and the lack of evidence that they'd actually gain anything

 I find it somewhat ironic that PostgreSQL strives to be fairly
 non-corruptable, yet has no way to detect a corrupted page.  The only
 reason for not having CRCs is because it will slow down performance...
 which is exactly opposite of conventional PostgreSQL wisdom (no
 performance trade-off for durability).

But how does detecting a corrupted data page gain you any durability?
All it means is that the platform underneath screwed up, and you've
already *lost* durability.  What do you do then?


The benefit I see is that you get to change the platform underneath 
earlier rather than later.


Whether that's worth it or not I don't know - real world stats/info 
would be good.


Even my home PATA drives tend to grumble about stuff first before 
they fail, so it might not be worthwhile doing the extra work.


Regards,
Link.






Re: [GENERAL] [HACKERS] 'a' == 'a '

2005-10-21 Thread Lincoln Yeoh

At 05:33 PM 10/19/2005 -0700, Dann Corbit wrote:

If there is a significant performance benefit to not expanding text 
columns in comparison operations, then it seems it should be OK.


I probably read the standard wrong, but it seems to me that varchar, char, 
and bpchar columns should all behave the same (e.g. if you do not expand 
with blank or the PAD character (whatever that is) then all char type 
columns should behave the same.  I guess that there could be different 
default collations for different column


I am not a DB guru. BUT IMO they should NOT behave the same.

Varchars should NOT be padded.

For the very reason that when you select text out of varchar fields, the result 
is not padded.


If I insert a string with a single trailing space into a varchar, I _want_ 
that single trailing space to still be there when I retrieve it, and not 
followed by more spaces. Otherwise I will have to pick a different database ;).


So similarly, I would expect that varchars 'a ' and 'a' when compared 
should be different.


However, in the case of _chars_ which are padded, then 'a ' should be 
padded so that it can be compared with 'a '.


Otherwise there will be no reason to do equality comparisons of char(5) 
fields with char(8) fields - they can NEVER be the same :).
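
For example (not tested against any particular version - just a sketch of the 
behaviour I'm arguing for, with made-up casts):

SELECT 'a '::varchar = 'a'::varchar;    -- false: the trailing space is kept and is significant
SELECT 'a '::char(5) = 'a'::char(8);    -- true: trailing blanks don't count when comparing chars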


But would that mean that when one does equality comparisons of varchars 
with chars, one would probably want padding? Or only varchars of the same 
length as the char would have a chance of matching?


Hmm.. I think I better leave this one to the DB gurus :). But I really 
don't ever want 'a ' to be the same as 'a   ' for varchars.


Link.




Re: [HACKERS] [GENERAL] Solution to UPDATE...INSERT problem

2003-04-03 Thread Lincoln Yeoh
At 05:28 PM 3/27/03 +0800, Christopher Kings-Lynne wrote:
 There's no select * from table where pkey=x for insert; which would block
 on uncommitted inserts/updates of pkey=x and other selects for insert/update.

How about user locks?  Isn't there something in contrib/ for that???  I
could do a userlock on the primary key, whether it existed or not?
Depends on your case, whether you can correctly convert your potential 
primary keys into integers to be locked on.

It still requires full cooperation by all relevant apps/clients.

Actually select ... for update also requires cooperation, but it's a 
standard way of doing things, so apps that don't cooperate can be said to 
be broken :).

Is there a standard for select ... for insert? Or lock table for insert 
where pkey=x?

Regards,
Link.


Re: [HACKERS] [GENERAL] Solution to UPDATE...INSERT problem

2003-04-03 Thread Lincoln Yeoh
AFAIK the except select won't see other inserts in uncommitted 
transactions. If those transactions are committed you will end up with the 
same problem. You can try it yourself, by manually doing two separate 
transactions in psql.
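
e.g. using the INSERT ... EXCEPT form quoted below (table/column names made 
up, assuming a unique index or primary key on pkcol):

-- session 1:
BEGIN;
INSERT INTO mytable SELECT 1, 'foo'
  EXCEPT SELECT 1, 'foo' FROM mytable WHERE pkcol = 1;
-- leave this transaction open (uncommitted)

-- session 2:
BEGIN;
INSERT INTO mytable SELECT 1, 'foo'
  EXCEPT SELECT 1, 'foo' FROM mytable WHERE pkcol = 1;
-- session 2's EXCEPT can't see session 1's uncommitted row, so it attempts the
-- insert anyway and blocks on the unique index; once session 1 commits,
-- session 2 fails with a unique constraint violation.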

You either have to lock the whole table, or lock at the application layer. 
Some time back I suggested a lock on arbitrary string feature for 
postgresql for this and various other purposes, but that feature probably 
wouldn't scale in terms of management (it requires 100% cooperation amongst 
all apps/clients involved).

There's no select * from table where pkey=x for insert; which would block 
on uncommitted inserts/updates of pkey=x and other selects for insert/update.

In contrast select ... for update blocks on committed stuff.

Regards,
Link.
At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote:

Hi Guys,

I just thought I'd share with you guys a very clever solution to the old
'update row.  if no rows affected, then insert the row' race condition
problem.  A guy at my work came up with it.
We were discussing this earlier on -hackers, but no-one could find a
solution that didn't involve locking the entire table around the
update...insert commands.
The problem is that sometimes the row will be inserted by another process
between your update and insert, causing your insert to fail with a unique
constraint violation.
So, say this is the insert:

INSERT INTO table VALUES (1, 'foo');  // 1 is in the primary key column

Rewrite it like this:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;
See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does.  You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.
Hope that's handy for people,

Chris





Re: [HACKERS] ADTs and embedded sql

2002-06-20 Thread Lincoln Yeoh

At 01:49 PM 6/20/02 +0100, Tony Griffiths(RA) wrote:

a) The client-side programmer has to be responsible for parsing the 
returned string, which could cause problems if the output format of the 
ADT is changed, and

b) The impedance mismatch is much greater than that of the built-in types.

One man's impedance mismatch is another man's layer of abstraction / 
interface :).

Sorry - couldn't resist ;).

Cheerio,
Link.








Re: [HACKERS] non-standard escapes in string literals

2002-06-06 Thread Lincoln Yeoh

Yes it's speculation. The implementation at the DB isn't there, neither are 
the associated DBD/JDBC/ODBC drivers for it.

Basically, as long as those fallacies don't end up in postgresql _if_ the 
decision is to implement it, I'd be happy.

I was just noting (perhaps superfluously) that backspaces and friends 
(nulls) have been useful for exploiting databases (and other programs). 
Recently at least one multibyte character (0x81a2) allowed potential 
security problems with certain configurations/installations of Postgresql. 
Would switching to the standard cause such problems to be less or more 
likely? Would making it an option make such problems more likely?

Cheerio,
Link.

p.s. Even +++AT[H]cr(remove square brackets and cr = carriage return) 
as data can cause problems sometimes - esp with crappy modems. Once there 
was a site whose EDI metadata had lots of +++ and they were experiencing 
bad connections grin...


At 07:10 PM 6/6/02 +0200, Peter Eisentraut wrote:
Lincoln Yeoh writes:

  However raw control characters can still cause problems in the various
  stages from the source to the DB.

I still don't see why.  You are merely speculating about implementation
fallacies that aren't there.

--
Peter Eisentraut   [EMAIL PROTECTED]






Re: [HACKERS] non-standard escapes in string literals

2002-06-04 Thread Lincoln Yeoh

At 09:58 PM 6/4/02 +0200, Peter Eisentraut wrote:
Lincoln Yeoh writes:

  But for the ANSI standard how does one stuff \r\n\t and other control
  characters into the database?
 
  If there's no way other than actually sending the control characters then
  that is a bad idea especially from a security viewpoint.

Why??

Quoting is to help separate data from commands. Though '' is sufficient for 
quoting ', it seems to me it is not sufficient for control characters.

There could be control characters that cause problems with the DB, and 
people may not be sufficiently aware of potential problems. If you just 
remove the problematic characters, it means you can't store them in the 
database - the db can become less useful.

Whereas with the current way of quoting control characters, if you are 
unsure what to quote, you could safely quote every untrusted character. 
Less chance of things going wrong. Also being able to quote allows you to 
store control characters in the database.
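
For instance (a made-up table, just to illustrate the current escape syntax):

INSERT INTO notes (body) VALUES ('line one\nline two\ttabbed\010with a backspace');

Every byte of untrusted input can be written out as a visible escape like 
that, rather than being sent raw.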

An example of what could go wrong: an RDBMS may treat raw backspaces as part 
of the command stream and not the data, and thus

insert into pics (data) values ('$CGIPARAM')
could become -
insert into pics (data) values('JFIF^H^H^H^H^H^H...^H^H^HUPDATE row 
from IMPORTANT where (rowid='1')
Which is treated as
UPDATE row from IMPORTANT where (rowid='1')

And so a file upload becomes an insidious alteration of important data.

Hope that helps,
Link.






Re: [HACKERS] non-standard escapes in string literals

2002-06-03 Thread Lincoln Yeoh

At 01:20 PM 6/3/02 +0200, Zeugswetter Andreas SB SD wrote:
  for two things, one for escaping single quotes and for escaping standard
  C characters, like \n.  While we can use the standard-supported '' to
  insert single quotes, what should we do with \n?  The problem is
  switching to standard ANSI solution reduces our functionality.

The problem imho is, that this (no doubt in many cases valuable)
feature reduces the functionality from the ANSI SQL perspective.
Consider a field that is supposed to store Windows filenames,
nam_file='C:\node1\resend\b.dat' :-)

Thus I think a GUC to turn off all escaping except '' would be valuable.

With current behaviour 'C:\node1\resend\b.dat' can be quoted as 
'C:\\node1\\resend\\b.dat'

But for the ANSI standard how does one stuff \r\n\t and other control 
characters into the database?

If there's no way other than actually sending the control characters then 
that is a bad idea especially from a security viewpoint.
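
(Aside: one can also build such strings without any escapes at all, e.g.

SELECT 'line one' || chr(10) || 'line two';

but that doesn't help with plain literals embedded in application-generated 
statements, and chr() isn't part of the standard either.)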

Cheerio,
Link.





Re: [HACKERS] SASL, compression?

2002-05-20 Thread Lincoln Yeoh

What are the benefits of SASL+Postgresql compared to Postgresql over plain SSL?

Coz Postgresql already supports SSL right?

Cheerio,
Link.

At 03:11 PM 5/18/02 -0600, Bear Giles wrote:
If it's being used in Sendmail, Cyrus IMAP and OpenLDAP, with preliminary
work (sponsored by Carnegie Mellon University) in supporting it for CVS
and LPRng and possibly SSH I think it's safe to say it's beyond vaporware
at this point.


I'm aware of the various tricks you can do - setting the shell to
/bin/false, requiring RSA authentication and setting the no-tty flag
in the 'known_keys' file, etc., but at the end of the day there are
still extra shell accounts on that system.

SSH tunnels are a good stopgap measure while you add true TLS/SSL
support, but they can't be considered a replacement for that support.

Bear






Re: [HACKERS] SASL, compression?

2002-05-20 Thread Lincoln Yeoh

At 01:11 AM 5/20/02 -0600, Bear Giles wrote:
  What are the benefits of SASL+Postgresql compared to Postgresql over 
 plain SSL?

The anticipated benefit of SASL is that it would replace all of the
current authentication code with a set of standard plugins.  The
authority problem would be reduced to a simple text mapping.

[I'm not a pgsql hacker, so feel free to ignore me :) ]

I can see the benefit of SASL as a standard in publicly exposed network 
services like email servers (SMTP, POP, IMAP), where you can support 
different email clients which themselves may or may not support SASL and 
may use different SASL libraries.

But for Postgresql - communication is mainly between internal db clients 
(which use the pgsql libraries) and the postmaster.

Would the SASL code allow JDBC and Perl DBI+DBD postgresql clients to support 
SASL (and encryption) seamlessly? If it would, then that's great. If it's 
just psql, then not so great.

Because replacing current authentication code doesn't seem as obvious a 
benefit to me. The plugin thing sounds useful tho - modular. But would the 
simple text mapping for authorisation be as simple when UserX is only 
supposed to have SELECT access to certain tables?

To me there may be more bang for the buck by improving support for network 
layer tunnels- like SSL (SASL has more application layer stuff). Maybe even 
support plugins for network layer tunnels, rather than plugins for 
authentication.  Because Postgresql already provides authentication and 
authorisation, we may just need compression/encryption/other tunneling in 
various forms.

Would something like this be possible:
For postgresql clients - standardise on two handles for input and output 
(ala djb's tcpserver), set environment variables, exec/fork a tunnelapp 
with argument string. The tunnelapp will read from output handle, write to 
input handle, and make connection to the tunnelserver (which is where 
things get difficult - postmaster)..

Then you could have an SASL tunnelapp, an SSL tunnelapp, an SSH tunnelapp.

This would be bad for O/Ses with not-so-good fork support, like Solaris and 
Windows. But the point is - isn't there some other way to abstract the 
network/IO layer stuff so that even recompiles aren't necessary?

So if there's a bug in the tunnel app it's not a Postgresql problem - only 
the tunnel app needs to be fixed.

  Coz Postgresql already supports SSL right?

Postgresql minimally supports SSL.  It contains some significant
coding errors, poor initialization, and no support for client
certificates.  My recent patches should go a long way towards
fixing that.

Cool. WRT the patch which requires strict matches on server hostnames - are 
wildcards allowed or is there an option for the client to ignore/loosen 
things a bit?

Cheerio,
Link.





Re: [HACKERS] Poster(s) needed

2002-05-18 Thread Lincoln Yeoh

How about the postgresql logo - is there a source vector/postscript of it 
so that he can blow it up without res loss and print it? The logo designer 
may still have the source files.

Cheerio,
Link.

At 02:56 AM 5/18/02 -0300, Marc G. Fournier wrote:

Not that I'm aware of anyone making ...

On Fri, 17 May 2002, Michael Meskes wrote:
  month, I'd like to get some PostgreSQL posters for the booth. But I have
  no idea where to find some.
 
  Do we have that kind of stuff? Or where could I get it? Preferable of 
 course as file so I can print it myself.






Re: [HACKERS] pg_dump DROP commands and implicit search paths

2002-05-14 Thread Lincoln Yeoh

At 06:58 AM 5/14/02 +0100, Oliver Elphick wrote:
  retarget a dump script to be reloaded in some other schema.  If the
  dump is cluttered with umpteen thousand copies of the schema name
  that's going to be difficult.

sed -e 's/ old_schema\./ new_schema./g'

I don't think you should allow the dump to be ambiguous for the sake of
making rarely used actions slightly more convenient.

Erm, from what I see on this list, people regularly dump and reload, often 
for performance reasons. There's also dev|backup-production|live.

So I don't think dumping and reloading into another schema would be that 
rare nor should it be difficult.

sed can screw up the data. I suppose we could do schema and data dumps 
separately but :(. Would that actually work tho? Might come in handy one 
not so fine day ;)...

Regards,
Link.





Re: [HACKERS] a vulnerability in PostgreSQL

2002-05-02 Thread Lincoln Yeoh

Not tested: but how about the string being
foo'; DROP TABLE T1; foo

Would the last ' be eaten up then resulting in no error?

Also normally a \ would be quoted by \\ right? Would a foo\ result in an 
unquoted \ ? An unquoted backslash may allow some possibilities.

There could be other ways to get rid of the last ', comments etc, so it may 
not be just 6.5.x.

Regards,
Link.

At 05:18 PM 5/2/02 +0900, Tatsuo Ishii wrote:
There is a report from a debian user about a vulnerability in
PostgreSQL pre 7.2. Here is a possible attack scenario which allows to
execute ANY SQL in PostgreSQL.

A web application accepts an input as a part of SELECT qualification
clause. With the user input, the web server program would build a
query for example:

SELECT * FROM t1 WHERE foo = 'input_string_from_user'

Of course above method is too simple, since a user could input a
string such as:

foo'; DROP TABLE t1

To prevent the unwanted SQL statement being executed, the usual method
most applications are taking is quoting ' by \. With this, above
string would be turned into:

foo\'; DROP TABLE t1

which would make it impossible to execute the DROP TABLE statement.
For example in PHP, addslashes() function does the job.

Now, suppose the database encoding is set to SQL_ASCII and the client
encoding is, say, LATIN1 and foo in above string is a latin
character which cannot be converted to ASCII. In this case, PostgreSQL
would produce something like:

(0x81a2)\'; DROP TABLE t1

Unfortunately there was a bug in pre 7.2's multibyte support that
would eat the next character after the
impossible-to-convert-character, and would produce:

(0x81a2)'; DROP TABLE t1

(notice that \ before ' is disappeared)






Re: [HACKERS] Search from newer tuples first, vs older tuples first?

2002-05-02 Thread Lincoln Yeoh

At 12:49 AM 5/2/02 -0400, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
  But does Postgresql visit the older tuples first moving to the newer ones,
  or the newer ones first?

It's going to visit them *all*.  Reordering won't improve the
performance.

Ack! I thought it went through them till the first valid tuple and was just 
going the wrong way.

FWIW I think that with the present implementation of btree, the newer
tuples actually will be visited first --- when inserting a duplicate
key, the new entry will be inserted to the left of the equal key(s)
already present.  But it doesn't matter.  The only way to speed this
up is to eliminate some of the visitings, which requires keeping more
info in the index than we presently do.

OK I'm starting to get it :). Will the index behaviour be changed soon?

Hmm, then what are the row tuple forward links for? Why forward?

Regards,
Link.





Re: [HACKERS] a vulnerability in PostgreSQL

2002-05-02 Thread Lincoln Yeoh

Oops. How about:

foo'; DROP TABLE t1; -- foo

The last ' gets removed, leaving -- (81a2).

So you get:
select ... '(0x81a2)'; DROP TABLE t1; -- (0x81a2)

Would that work? Or do you need to put a semicolon after the --?

Alternatively would select (0x81a2) be a syntax error? If it isn't then 
that's another way to terminate it properly.

As for the backslash, how does postgresql treat \000 and other naughty 
codes? Too bad there are too many characters to backspace over - that is if 
backspacing (\b) over commands works in the first place ;)...

I'll let you know if I think of other ways (I'm sure there are - I probably 
have to go through the postgresql syntax and commands more closely). Got to 
go :).

Cheerio,
Link.

At 05:50 PM 5/2/02 +0900, Tatsuo Ishii wrote:
  Not tested: but how about the string being
  foo'; DROP TABLE T1; foo
 
  Would the last ' be eaten up then resulting in no error?

Even the last ' is eaten up, the remaining string is (81a2), which
would cause parser errors since they are not valid SQL, I think.

  Also normally a \ would be quoted by \\ right? Would a foo\ result in an
  unquoted \ ? An unquoted backslash may allow some possibilities.
 
  There could be other ways to get rid of the last ', comments etc, so it 
 may
  not be just 6.5.x.

Please provide concrete examples. I could not find such that case.
--
Tatsuo Ishii







Re: [HACKERS] a vulnerability in PostgreSQL

2002-05-02 Thread Lincoln Yeoh

I hope you won't make this standard practice. Because there are quite 
significant differences that make upgrading from 7.1.x to 7.2 troublesome. 
I can't name them offhand but they've appeared on the list from time to time.

For 6.5.x to 7.1.x I believe there are smaller differences, even so there 
might be people who would patch for security/bug issues but not upgrade. 
I'm still on Windows 95 for instance (Microsoft has stopped supporting it 
tho :( ). I think there are still lots of people on Oracle 7.

Yes, support of older software is a pain. But the silver lining is: it's 
open source - they can feasibly patch it themselves if they are really hard 
pressed. If the bug report is descriptive enough DIY might not be so bad. 
And just think of it as people really liking your work :).

Any idea which versions of Postgresql have been bundled with O/S CDs?

Regards,
Link.

At 10:23 AM 5/2/02 -0400, Tom Lane wrote:
Tatsuo Ishii [EMAIL PROTECTED] writes:
  Here are the precise conditions to trigger the scenario:

  (1) the backend is PostgreSQL 6.5.x
  (2) multibyte support is enabled (--enable-multibyte)
  (3) the database encoding is SQL_ASCII (other encodings are not
  affected by the bug).
  (4) the client encoding is set to other than SQL_ASCII

  I think I am responsible for this since I originally wrote the
  code. Sorry for this. I'm going to make back port patches to fix the
  problem for pre 7.2 versions.

It doesn't really seem worth the trouble to make patches for 6.5.x.
If someone hasn't upgraded yet, they aren't likely to install patches
either.  (ISTR there are other known security risks in 6.5, anyway.)
If the problem is fixed in 7.0 and later, why not just tell people to
upgrade?

 regards, tom lane







Re: [HACKERS] Analyze on large changes...

2002-05-01 Thread Lincoln Yeoh

Hi Tom,

(Please correct me where I'm wrong)

Is it possible to reduce the performance impact of dead tuples esp when the 
index is used? Right now performance goes down gradually till we vacuum 
(something like a 1/x curve).

My limited understanding of current behaviour is the search for a valid 
row's tuple goes from older tuples to newer ones via forward links (based 
on some old docs[1]).

How about searching from newer tuples to older tuples instead, using 
backward links?

My assumption is newer tuples are more likely to be wanted than older ones 
- and so the number of tuples to search through will be less this way.

**If index update is ok.
If a tuple is inserted, the index record is updated to point to inserted 
tuple, and the inserted tuple is made to point to a previous tuple.
e.g.

Index -> old tuple -> older tuple -> oldest tuple
Index -> New tuple -> old tuple -> older tuple -> oldest tuple

**if index update not desirable
Index points to first tuple (valid or not).

If a tuple is inserted, the first tuple is updated to point to inserted 
tuple, and the inserted tuple is made to point to a previous tuple.
e.g.

Index -> first tuple -> old tuple -> older tuple -> oldest tuple
Index -> first tuple -> New tuple -> old tuple -> older tuple -> oldest tuple

If this is done performance might not deteriorate as much when using index 
scans right? I'm not sure if a backward links would help for sequential 
scans, which are usually best done forward.

Regards,
Link.

[1] http://developer.postgresql.org/pdf/transactions.pdf
Tuple headers contain:
• xmin: transaction ID of inserting transaction
• xmax: transaction ID of replacing/ deleting transaction (initially NULL)
• forward link: link to newer version of same logical row, if any
Basic idea: tuple is visible if xmin is valid and xmax is not. Valid
means
either committed or the current transaction.
If we plan to update rather than delete, we first add new version of row
to table, then set xmax and forward link in old tuple. Forward link will
be needed by concurrent updaters (but not by readers).

At 10:53 AM 5/1/02 -0400, Tom Lane wrote:
estimates.  [ thinks... ]  Actually I think we might just be
double-counting if we did.  The dead tuples surely should not count as
part of the number of returned rows.  We already do account for the
I/O effort to read them (because I/O is estimated based on the total







[HACKERS] Search from newer tuples first, vs older tuples first?

2002-05-01 Thread Lincoln Yeoh

At 02:10 PM 5/1/02 -0400, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
  My limited understanding of current behaviour is the search for a valid
  row's tuple goes from older tuples to newer ones via forward links

No.  Each tuple is independently indexed and independently visited.
Given the semantics of MVCC I think that's correct --- after all, what's
dead to you is not necessarily dead to someone else.

But does Postgresql visit the older tuples first moving to the newer ones, 
or the newer ones first? From observation it seems to be starting from the 
older ones. I'm thinking visiting the newer ones first would be better. 
Would that reduce the slowing down effect?

Anyway, are you saying:
Index row X entry #1 - oldest tuple
...
Index row X entry #2 - older tuple
...
Index row X entry #3 - old tuple
...
Index row X entry #4 - just inserted tuple

And a search for a valid tuple goes through each index entry and visits 
each tuple to see if it is visible.

That seems like a lot of work to do, any docs/urls which explain this? Are 
the index tuples for the same row generally in the same physical location?

Whereas the following still looks like less work and still compatible with 
MVCC:
index tuple -> new tuple -> rolled back tuple -> old tuple -> older tuple.

Just one index tuple per row. The tuples are checked from newer to older 
for visibility via backward links.

The docs I mentioned say updates use the forward links. Repeated updates 
definitely slow down, so backward links might help?

Regards,
Link.







Re: [HACKERS] Vote totals for SET in aborted transaction

2002-04-26 Thread Lincoln Yeoh

At 10:34 AM 4/26/02 -0400, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
  Coz some things should not be rolled back. So you guys might come up 
 with a
  different keyword for it.

  CONFIG: for non transactional stuff that can appear as SQL statements.
  SET: for stuff that can be transactional.

People keep suggesting this, and I keep asking for a concrete example
where non-rollback is needed, and I keep not getting one.  I can't see

Sorry, I wasn't clear enough. I'm not asking for non-rollback behaviour.

I was trying to say that _IF_ one ever needs to SET stuff that can't be 
rolled back then it may be better to use some other keyword for that feature.

I'm actually for #1 SET being rolled back and to not have any Oracle 
behaviour settings at all. Anything that can't be rolled back shouldn't 
use SET.

  Practical example: Does doing an enable seqscan affect OTHER db 
 connections
  and transactions as well?

There are no SET commands that affect other backends.  (There are
GUC variables with system-wide effects, but we don't allow them to be
changed by SET; rollback or not won't affect that.)

OK.

Cheerio,
Link






Re: [HACKERS] Vote totals for SET in aborted transaction

2002-04-26 Thread Lincoln Yeoh

At 11:49 AM 4/26/02 -0400, Tom Lane wrote:
I'm still looking for an example of something that is (a) reasonable
to set on a per-backend basis, and (b) not reasonable to roll back
if it's set in a transaction that fails.

The way I see it is if (a) and you don't want it rolled back, you could put 
it in a transaction of its own.
BEGIN;
SET backend pref;
COMMIT;

And if that transaction fails, maybe it should :).

So other than for performance, the example should also have a reason to 
belong with other statements in a transaction.

Have a nice weekend,
Link.





Re: [HACKERS] Sequential Scan Read-Ahead

2002-04-25 Thread Lincoln Yeoh

At 12:19 PM 4/25/02 +0900, Curt Sampson wrote:
Grabbing bigger chunks is always optimal, AFICT, if they're not
*too* big and you use the data. A single 64K read takes very little
longer than a single 8K read.

Yes I agree that if sequential scans are done reading ahead helps.

And it often doesn't cost much more - whilst waiting for the first block you 
asked for, the other blocks are sometimes going to spin past first, and often 
the subsystems will read and cache them anyway. At least that was what a 
disk caching program I wrote years ago did (it had a track cache and an O/S 
metadata cache[1]); I'm sure most modern HDDs do track caching 
amongst even more advanced stuff.


 3. My observations of OS performance tuning over the past six
 or eight years contradict the statement, There's a considerable
 cost in complexity and code in using raw storage too, and
 it's not a one off cost: as the technologies change, the fast
 way to do things will change and the code will have to be
 updated to match. While optimizations have been removed over
 the years the basic optimizations (order reads by block number,
 do larger reads rather than smaller, cache the data) have
 remained unchanged for a long, long time.

BTW, please don't take me as saying that all control over physical
IO should be done by Postgres. I just think that Posgres could do
a better job of managing data transfer between disk and memory than
the OS can. The rest of the things (using raw paritions, read-ahead,
free-behind, etc.) just drop out of that one idea.

I think the raw partitions will be more trouble than they are worth. 
Reading larger chunks in appropriate circumstances seems to be the low-hanging 
fruit.

If postgresql prefers sequential scans so much it should do them better ;) 
(just being naughty!).

Cheerio,
Link.

[1] The theory was the drive typically has to jump around a lot more for 
metadata than for files. In practice it worked pretty well, if I do say so 
myself :). Not sure if modern HDDs do specialized O/S metadata caching 
(wonder how many megabytes would typically be needed for 18GB drives :) ).





Re: [HACKERS] Vote totals for SET in aborted transaction

2002-04-25 Thread Lincoln Yeoh

At 04:01 PM 4/25/02 -0300, Marc G. Fournier wrote:
  My guess is that we should implement #1 and see what feedback we get in
  7.3.

IMHO, it hasn't been thought out well enough to be implemented yet ... the
options have been, but which to implement haven't ... right now, #1 is
proposing to implement something that goes against what *at least* one of
DBMS does ... so now you have programmers coming from that environment
expecting one thing to happen, when a totally different thing results ...

I don't know about those programmers, but AFAIK when I shift from one DBMS 
to another I expect weird things to happen, because the whole DBMS world is 
filled with all sorts of non-standard behaviour.

SET XXX doesn't even directly map to Oracle's stuff in the first place. 
Since it looks different, I think the migrator shouldn't be surprised if it 
works differently. They might expect it to work the same, but if it doesn't 
they'll just go OK yet another one of those.

What would be good are RDBMS X to Postgresql migration docs. I believe 
there's already an Oracle to Postgresql migration document. So putting all 
these things there and linking to them would be helpful.
---

I'm sorry if this has been discussed already:

There may be some SETs which operate on a different level of the 
application. We may wish to clearly differentiate them from those that are 
transactional and can operate in the domain of other SQL statements. Or put 
those in config files and they never appear in SETs?

Coz some things should not be rolled back. So you guys might come up with a 
different keyword for it.

e.g.
CONFIG: for non transactional stuff that can appear as SQL statements.
SET: for stuff that can be transactional.

Practical example: Does doing an enable seqscan affect OTHER db connections 
and transactions as well? If it doesn't then yes it should be 
transactional, whereas if it does then it shouldn't bother being 
transactional. And there could well be two cases operating in different 
domains. e.g. CONFIG globalseqscan=0 and SET seqscan=0.

Regards,
Link.





Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE

2002-04-19 Thread Lincoln Yeoh

At 10:48 AM 4/18/02 -0400, mlw wrote:
Bruce Momjian wrote:
 
  Have you tried reducing 'random_page_cost' in postgresql.conf.  That
  should solve most of your problems if you would like more index scans.

My random page cost is 1 :-)

What happens when you set random page cost to 1? Between an index scan of 
50% of a table and a full table scan which would the optimizer pick? With 
it at 1, what percentage would be the switchover point?

Because I'm thinking that for _repeated_ queries when there is caching the 
random page cost for small selections may be very low after the first 
very costly select (may not be that costly for smart SCSI drives). So 
selecting 10% of a table randomly may not be that costly after the first 
select. Whereas for sequential scans 100% of the table must fit in the 
cache. If the cache is big enough then whichever results in selecting less 
should be faster ( noting that typically sequential RAM reads are faster 
than random RAM reads ). If the cache is not big enough then selecting less 
may be better up till the point where the total amount repeatedly selected 
cannot be cached, in which case sequential scans should be better. This is 
of course for queries in serial, not queries in parallel. How would one 
take these issues into account in an optimizer?

Mark's problems with the optimizer seem to be something else tho: 
statistics off.

I had a database where I had to have enable_seqscan=false in the config 
file.
The nature of the data always makes the statistics bogus, and it always 
refused
to use the index.
My one most important experience (I've had more than one) with this whole 
topic
is DMN's music database, when PostgreSQL uses the index, the query executes in
a fraction of a second. When enable_seqscan=true PostgreSQL refuses to use
the index, and the query takes a about a minute. No matter how much I analyze,
I have to disable sequential scan for the system to work correctly.

I'm just wondering why not just use enable_seqscan=false for those 
problematic queries as a hint? Unless your query does need some seq scans 
as well?
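
i.e. something along these lines (made-up query, just to illustrate):

SET enable_seqscan = false;
SELECT * FROM tracks WHERE artist_id = 42;   -- the query that wants the index
SET enable_seqscan = true;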

By the way, are updates treated the same as selects by the optimizer?

Regards,
Link.





Re: [HACKERS] Is this a better MVCC.

2002-04-16 Thread Lincoln Yeoh

On 7.1.x it definitely gets slower even for indexscans. e.g. 60 updates/sec 
dropping to 30 then to 20 over time.

Is this fixed for 7.2?

If not, is it possible to make the pointer point to the latest row instead 
of the most obsolete one, and having the newer rows point to the older 
ones, instead of the other way round (which seems to be happening with 
7.1)? I suppose this could make updates slower - have to update indexes? 
But selects would be faster (other than cases where there are a lot of 
uncommitted updates outstanding).

If that is not possible (or updating the index too painful), how about 
having the first pointer point to first row which then points to latest 
row, which then points to subsequent older rows. That way the miss penalty 
is reduced.

It seems reasonable to me that the newer rows should be more visible- 
unless more people update rows and then rollback rather than update and 
then commit.

I'm missing something out right? :)

Regards,
Link.

At 09:15 AM 4/16/02 -0400, Tom Lane wrote:
mlw [EMAIL PROTECTED] writes:
  Now, what if we did it another way, copy the old version of the row 
 into the
  new row and update the tuple in place?

I don't think we can get away with moving the extant tuple.  If we did,
a concurrent scan that should have found the old tuple might miss it.
(This is why VACUUM FULL needs exclusive lock to move tuples.)

It's fairly unclear whether this would actually buy any performance
gain, anyway.  In the case of a seqscan I don't see that it makes any
difference on average, and in the case of an indexscan what matters is
the index ordering not the physical location.  (In this connection,
btree indexes already do the right thing, cf comments for
_bt_insertonpg.)






[HACKERS] Sorting. Re: Re : Solaris Performance - Profiling (Solved)

2002-04-03 Thread Lincoln Yeoh

Just curious - why is solaris qsort that way? Any good reasons? I saw a 
very old post by a solaris guy, but it didn't seem very convincing.

By the way are there faster sorts which Postgresql can use for its sorting 
other than quick sort? e.g. BSD 4.4 radixsort (which DJB seems to keep 
going on about :)).

Would it make a significant improvement in performance?

Cheerio,
Link.

p.s. We have postgresql on solaris too ;).

At 05:59 PM 4/3/02 +1000, you wrote:
Hi Tom,

How about we include this and have configure somehow ensure the Solaris
users get it automatically?

There are a *bunch* of Solaris users out there.

:-)






Re: [HACKERS] Binding PostgreSQL to a specific ip address

2002-03-26 Thread Lincoln Yeoh

Note if you are trying to run more than one postgresql you also have to 
prevent the unix socket files from clashing.

  On Wed, 27 Mar 2002, Alastair D'Silva wrote:
 
   Is there any way to force PostgreSQL to bind to a specific
  IP address?
  
   There doesn't seem to be anything about this in the docs,
  and if its
   not implemented, it would be a useful feature to have (and
  an easy one
   to implement).
  (from runtime-config.html)
 
 VIRTUAL_HOST (string)
Specifies the TCP/IP hostname or address on which the
postmaster is to listen for connections from client
applications. Defaults to listening on all
  configured addresses
(including localhost).
 
  Gavin
 








Re: [HACKERS] Client/Server compression?

2002-03-17 Thread Lincoln Yeoh

You can also use stunnel for SSL. Preferable to having SSL in postgresql 
I'd think.

Cheerio,
Link.

At 03:38 PM 3/16/02 -0500, Tom Lane wrote:

FWIW, I was not in favor of the SSL addition either, since (just as you
say) it does nothing that couldn't be done with an SSH tunnel.  If I had






Re: [HACKERS] storing binary data

2001-10-25 Thread Lincoln Yeoh


I'll take a shot at improving the documentation for bytea. I'm hoping 
documentation patches are accepted during beta though ;-)

Also, FWIW, 7.2 includes bytea support for LIKE, NOT LIKE, LIKE ESCAPE, 
||, trim(), substring(), position(), length(), indexing, and various 
comparators.


Cool!

Would it be practical to use substring for retrieving chunks of binary data
in manageable sizes? Or would the overheads be too high?
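
e.g. something like (made-up table, fetching 64KB at a time):

SELECT substring(data from 1 for 65536) FROM blobs WHERE id = 42;
SELECT substring(data from 65537 for 65536) FROM blobs WHERE id = 42;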

Cheerio,
Link.





Re: [HACKERS] Pre-forking backend

2001-10-16 Thread Lincoln Yeoh

At 10:18 AM 15-10-2001 -0400, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
 Create a small program that makes a few connections to postgresql, does
 some initialization, preconnects to various DBs (or maybe limited to one DB
 specified on startup), and listens on one port/socket. It might not even
 prefork, just cache connections so first connection is slow, subsequent
 ones are cached along with the user-pass for faster authentication. 

 Then your apps can connect to that small program, authenticate, and get the
 relevant connection. Call it a Listener if you want ;).

Couple of problems...

(a) where is this outside program going to get authentication
information from?

Various options:
1) No authentication required by client - authentication supplied on
startup/config.
2) Local authentication - runs as postgres user, reads from postgres files.
3) Local authentication - from config file, mapped to actual remote
authentication
4) Authentication from remote server, then cached in memory.

(b) it seems that not only the authentication exchange, but also all
subsequent data exchange of each connection would have to go through
this additional program.  That middleman is going to become a
bottleneck.

The authentication exchange doesn't happen that often, since the DB
connections are reused - no reconnection.

True it might be a bottleneck. But in certain setups the middleman is not
running on the DB server and thus not using the DB server resources.

---
Are there really compelling reasons for having a preforking backend? What
would the benefits be? Faster connection setup times? Connecting and
disconnecting quickly is important for a webserver because of the HTTP
protocol, but for a DB server? Would it really be fast in cases where
there's authentication and access control to various databases? 

Perhaps it's undesirable for people to roll their own DB connection
pooling. But my worry is that there's such a great diversity that most
people may still have to roll their own DB connection pooling, then a
preforking backend just adds complexity and sucks up a bit more resources
for little gain. 

For example in my case if connection setup times are a problem, I'd just
preconnect and reuse the connections for many transactions. Wouldn't that
still be much faster than a preforking backend? How fast would a preforking
backend be?

Regards,
Link.





Re: [HACKERS] Pre-forking backend

2001-10-14 Thread Lincoln Yeoh

How would authentication and access control be done with a preforking
backend? I personally find a preforking backend desirable, but that's just me.

But if people really want preforking how about not doing it in the backend.

Create a small program that makes a few connections to postgresql, does
some initialization, preconnects to various DBs (or maybe limited to one DB
specified on startup), and listens on one port/socket. It might not even
prefork, just cache connections so first connection is slow, subsequent
ones are cached along with the user-pass for faster authentication. 

Then your apps can connect to that small program, authenticate, and get the
relevant connection. Call it a Listener if you want ;).

It does mean double the number of processes. But if done decently it is
likely to mean two less complex and less buggy processes, compared to one
more complex process. 

Would the performance be that much lower using this method? There are other
configurations possible with this approach e.g.:

app--unixsocket--listener--SSL--backend on another host.

This configuration should reduce the TCP and SSL connection set up times
over a network.

Could have different types of preforkers. Then if a certain mode gets very
popular and performance is insufficient then it could be appropriate to
move that mode to the backend.

Cheerio,
Link.

At 03:55 PM 13-10-2001 -0400, Bruce Momjian wrote:

I realize this is all pie-in-the-sky but I think we need some connection
pooling capability in the backend someday.  We are fine with Apache and
PHP becuase they can pool themselves but at some point we have too many
clients reinventing the wheel rather than having our backend do it.

Also, this relates to pre-forking backends and does not related to
re-using backends, which is another nice feature we should have someday.

  Added to TODO:
 
 I haven't seen a consensus yet.

True.  I can remove it or improve it.  It is actually:

* Have pre-forked backend pre-connect to last requested database or pass
  file descriptor to backend pre-forked for matching database

which mentions passing file descriptors to backends, which we have
discussed and should be recorded for posterity.






Re: [HACKERS] Feature suggestion: Postgresql binding to one

2001-10-04 Thread Lincoln Yeoh

At 11:16 PM 03-10-2001 -0400, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
 Is it possible for Postgresql to bind to one IP address? 

See 'virtual_host' GUC parameter.

   regards, tom lane

Thanks!

I'm using a redhat style postgresql init and somehow postgresql seems to
ignore the postgresql.conf file. What's the postmaster.opts file for?

Cheerio,
Link.





[HACKERS] Feature suggestion: Postgresql binding to one IP?

2001-10-03 Thread Lincoln Yeoh

Hi people,

Is it possible for Postgresql to bind to one IP address? 

I'm trying to run multiple postgresql installations on one server.

The unix socket could be named accordingly:

Postgresql config bound to a particular port and all IPs.
.s.PGSQL.portnumber 

Postgresql config bound to a particular port and IP.
.s.PGSQL.portnumber.ipaddress

Any other suggestions/comments on running multiple instances of postgresql
are welcomed.

A less desirable alternative is to keep binding to all IPs, use different
ports and name the ports, but specifying the port by name in -p doesn't work. 

Cheerio,
Link.





Re: [HACKERS] Pre-forking backend

2001-09-30 Thread Lincoln Yeoh

At 08:16 PM 30-09-2001 -0600, Steve Wolfe wrote:
 
  How hard would it be to pre-fork an extra backend for the database a
  user just requested so if they next user asks for the same database, the
  backend would already be started?

  Perhaps I'm missing something, but it seems to me that the cost of forking
a new backend would be pretty trivial compared to the expense of processing
anything but the most simple query.  Am I wrong in that?

I think forking costs a lot on Solaris. That's why Sun promotes threads :).

I still don't see many advantages of doing the preforking in postgresql.
What would the benefits be? Able to open and close db connections many
times a second? Any other advantages?

Can't the apps do their own preforking? All they do is preopen their own db
connections. Then they can take care of whatever initialization and details
they want.

It seems that opening and closing db connections over the network will
always be slower than just leaving a prepared connection open, looking at
just the network connection setup time alone.

I suppose it is helpful for plain cgi scripts, but those don't scale do they?

Cheerio,
Link.





Re: [HACKERS] Pre-forking backend

2001-09-29 Thread Lincoln Yeoh

At 04:50 PM 9/29/01 -0400, Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
 On some operating systems, only one child at a time can accept() on the
 socket.  On these, you have to lock around the call to accept().

 But how do you know the client wants the database you have forked?  They
 could want a different one.

This approach would only work as far as saving the fork() call itself,
not the backend setup time.  Not sure it's worth the trouble.  I doubt
that the fork itself is a huge component of our start time; it's setting
up all the catalog caches and so forth that's expensive.

I don't think there's much benefit as well.

For most cases where preforking would help, you could just simply not
disconnect. Get the app to connect to the correct DB on startup and then
just wait; do stuff, and then instead of disconnecting, either rollback or
commit. Or have a DB connection pool.

What would be good is a DB that can handle lots of connections well. That
would help almost any case.

Preforking is good for web servers but for DB servers it doesn't seem as
useful.

Cheerio,
Link.





Re: [HACKERS] Spinlock performance improvement proposal

2001-09-28 Thread Lincoln Yeoh

At 10:02 AM 9/27/01 -0400, mlw wrote:
D. Hageman wrote:
 I agree with everything you wrote above except for the first line.  My
 only comment is that process boundaries are only *truely* a powerful
 barrier if the processes are different pieces of code and are not
 dependent on each other in crippling ways.  Forking the same code with the
 bug in it - and only 1 in 5 die - is still 4 copies of buggy code running
 on your system ;-)

This is simply not true. All software has bugs, it is an undeniable fact.
Some
bugs are more likely to be hit than others. 5 processes , when one process
hits a
bug, that does not mean the other 4 will hit the same bug. Obscure bugs kill
software all the time, the trick is to minimize the impact. Software is not
perfect, assuming it can be is a mistake.

A bit off topic, but that really reminded me of how Microsoft does their
forking in hardware.

Basically they fork (cluster) FIVE windows machines to run the same buggy
code all on the same IP. That way if one process (machine) goes down, the
other 4 stay running, thus minimizing the impact ;).

They have many of these clusters put together.

See: http://www.microsoft.com/backstage/column_T2_1.htm
From Microsoft.com Backstage [1]

OK so it's old (1998), but from their recent articles I believe they're
still using the same method of achieving 100% availability. And they brag
about it like it's a good thing...

When I first read it I didn't know whether to laugh or get disgusted or
whatever.

Cheerio,
Link.

[1]
http://www.microsoft.com/backstage/
http://www.microsoft.com/backstage/archives.htm






[HACKERS] Anyone tried compiling postgresql with the Intel compilers?

2001-09-18 Thread Lincoln Yeoh

Hi has anyone tried Intel's compiler yet?

http://developer.intel.com/software/products/eval/

Just wondering what would happen.

Cheerio,
Link.





Re: [HACKERS] Abort state on duplicated PKey in transactions

2001-09-09 Thread Lincoln Yeoh

I had a similar issue.

I needed to make sure I had a unique row- insert if not there, update if
there. 

So I resorted to locking the whole table, then select, then insert/update.

What Tom told me to do was to use lock table tablename in exclusive mode
for my case.

This blocks select for updates, but doesn't block selects.

So you must check with a select for update, then only do the insert if it's
ok.

If you don't check with a select for update it will not block, and bad
things could happen :).
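
Roughly (table/column names made up):

BEGIN;
LOCK TABLE mytable IN EXCLUSIVE MODE;
SELECT id FROM mytable WHERE id = 1 FOR UPDATE;
-- no row returned:  INSERT INTO mytable (id, val) VALUES (1, 'foo');
-- a row returned:   UPDATE mytable SET val = 'foo' WHERE id = 1;
COMMIT;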

However I couldn't do a for update with an aggregate, so in my
generalised putrow routine I can't use in exclusive mode.

I basically wanted to do a select count(*) from datable where whereclause
for update.

If the count was 0 then only insert, else if 1 update, else make some noise
:).

The alternative is to actually fetch the rows which can be slower.

Regards,
Link.


At 12:20 PM 08-09-2001 -0500, Haroldo Stenger wrote:
transaction should have to be redone if the insertion failed. A
solution, could be to query for the existance of the PK, just before the
insertion. But there is a little span between the test and the
insertion, where another insertion from another transaction could void
the existance test. Any clever ideas on how to solve this? Using
triggers maybe? Other solutions?






[HACKERS] Re: Toast,bytea, Text -blob all confusing

2001-08-28 Thread Lincoln Yeoh

At 03:05 PM 27-08-2001 -0400, Alex Pilosov wrote:
On Thu, 23 Aug 2001 [EMAIL PROTECTED] wrote:

 THIS IS WHAT I CANT SEEM TO FIGURE OUT IN POSTGRESQL
 1. I cant get a clear answer on what kind of data type to use for my large
 text string?  TEXT, ???, ??? or something about TOAST
 I have seen in the e-mail archive but cant find any documentaion?
I would suggest bytea or blob. Blobs are well-documented in normal
documentation and in documentation of your favorite interface, so I'll
just talk about bytea.

 2. I've written my own escape method ,(cant find one for Pgsql) , BUT i
 don't know what 
 to escape and not to escape. So it keeps failing. I cand find any docs. on
 what to escape either?
For bytea, follow this rule: to escape a null character, use this:
'\\0'. To escape a backslash, use this: '\\\\'.

Same idea to unescape data.

Are there other characters that need to be escaped? I suspect there are
more characters that need to be escaped - ctrl chars? single quotes?. Why
four backslashes for one? Is there a definitive documentation anywhere for
what bytea is _supposed_ (not what it might actually be) to be and how it
is to be handled?

Also why wouldn't escaping stuff like this work with TEXT then? If a null
is going to be backslash backslash zero, and come out the same way, it sure
looks like TEXT to me :). OK so there's this thing about storage. So maybe
I could save a byte by just converting nulls to backslash zero and real
backslashes to backslash backslash. Tada.  

OK it's probably not the same, but having to put four backslashes when two
should be enough to quote one makes me rather puzzled and uneasy. 
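
Presumably the four backslashes are there because the value is parsed twice: 
the string-literal parser reduces \\\\ to \\, and the bytea input routine then 
reduces that to a single backslash octet. A made-up example:

INSERT INTO blobs (data) VALUES ('\\000\\\\');  -- one null octet followed by one backslash octet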

Cheerio,
Link.





[HACKERS] Re: Bytea/Base64 encoders for libpq - interested?

2001-08-28 Thread Lincoln Yeoh

At 11:55 AM 28-08-2001 +0200, Karel Zak wrote:

 What about implementing a base64 PostgreSQL datatype that uses base64
externally and internally stores the same thing as bytea? It prevents FE and
parser problems with bad chars, and internally it needs less storage than
text holding base64. Of course it doesn't solve the problem of
encoding/decoding data in your application to/from base64. Maybe also
implement casts to/from bytea for this datatype.

 SELECT my_bytea::base64 FROM foo;

 INSERT INTO foo (my_bytea) VALUES ('some_base64_string'::bytea);

 And you can still fetch all data directly in bytea by binary cursor. 

 Comments?

Sounds good to me. Even better if the base64 parser is bulletproof and
tolerant of junk. That way base64 email attachments may not even need to be
processed much - just filter a bit and shove it in :).

But shouldn't there be a ::base64 somewhere in the insert statement?
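I.e. something along these lines (purely hypothetical syntax, since the
proposed type doesn't exist):

   INSERT INTO foo (my_bytea) VALUES ('c29tZSBkYXRh'::base64::bytea);  -- hypothetical
   SELECT my_bytea::base64 FROM foo;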

Cheerio,
Link.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] Re: Bytea/Base64 encoders for libpq - interested?

2001-08-28 Thread Lincoln Yeoh

At 11:55 AM 28-08-2001 +0200, Karel Zak wrote:
 What about implementing a base64 PostgreSQL datatype that uses base64
externally and internally stores the same thing as bytea? It prevents FE and
parser problems with

Another point:

I have no problems with base64[1]. However I was thinking that it might be
far easier for the C/C++/Java (and other low level languages) bunch to do
hexadecimal. e.g. zero zero for null, zero A for line feed. 

It expands things in the input/output stream, but it might be worth some
consideration. Simplicity, cpu usage etc.

Cheerio,
Link.

[1] OK, I can't convert base64 to ASCII mentally yet. But I don't think
that should really be a factor.



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[HACKERS] RE: User locks code

2001-08-21 Thread Lincoln Yeoh

At 09:39 AM 20-08-2001 -0700, Mikheev, Vadim wrote:
 If it does then one of the things I'd use it for is to insert
 unique data without having to lock the table or rollback on
 failed insert (unique index still kept as a guarantee).

(Classic example how could be used SAVEPOINTs -:))

I guess so. But this could be faster.

So, in your application you would first lock a key in excl mode
(for duration of transaction), then try to select and insert unless
found? (Note that this will not work with serializable isolevel.)

yep:
lock tablename.colname.val=1
select count(*) from tablename where colname=1
If no rows, insert, else update.
(dunno if the locks would scale to a scenario with hundreds of concurrent
inserts - how many user locks max?).

Why wouldn't it work with serializable isolevel?

Anyway, I believe that isolevel doesn't really serialise things in this
case (inserting a unique row) so it wouldn't matter to me.
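For what it's worth, with the transaction-scoped advisory locks that later
releases grew, the same sequence can be sketched like this (table and
column names made up):

   BEGIN;
   SELECT pg_advisory_xact_lock(1);  -- lock "key value 1" until commit/rollback
   SELECT count(*) FROM tablename WHERE colname = 1;
   -- count 0 -> INSERT the row; count 1 -> UPDATE it instead
   COMMIT;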

Regards,
Link.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[HACKERS] RE: User locks code

2001-08-20 Thread Lincoln Yeoh

At 11:20 AM 8/19/01 -0700, Vadim Mikheev wrote:
Well, ability to lock only unlocked rows in select for update is useful,
of course. But uniq features of user'locks are:

1. They don't interfere with normal locks hold by session/transaction.
2. Share lock is available.
3. User can lock *and unlock objects* inside transaction, which is not
(and will not be) available with locks held by transactions.

Would your suggested implementation allow locking on an arbitrary string?

If it does then one of the things I'd use it for is to insert unique data
without having to lock the table or rollback on failed insert (unique index
still kept as a guarantee).

Cheerio,
Link.





---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] Re: Re: Notes about int8 sequences

2001-08-06 Thread Lincoln Yeoh

At 07:02 PM 06-08-2001 -0400, Tom Lane wrote:
pseudo-type should generate an int8 instead of int4 column.  On
compatibility grounds, it might be better to leave it generating int4,
and invent a second pseudo-type SERIAL8 that is just the same except
for making an int8 column.  I'm more worried about changing the datatype
of a user column than I am about changing the output type of nextval(),
so I'd be sort of inclined to have two SERIAL types even if we change
nextval() to int8.  Thoughts?

serial8 sounds ok to me.

I use currval.
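A minimal sketch of how that would read (assuming the sequence name
PostgreSQL generates by default):

   CREATE TABLE t (id serial8 PRIMARY KEY, payload text);
   INSERT INTO t (payload) VALUES ('x');
   SELECT currval('t_id_seq');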

Cheerio,
Link.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] Re: OID wraparound (was Re: pg_depend)

2001-07-18 Thread Lincoln Yeoh

At 06:10 PM 18-07-2001 -0400, Lamar Owen wrote:
applications :-) I guess I'll just need to switch to proper SERIALs and 
PRIMARY KEYs.  Of course, if I wanted to be stubborn, I'd just use the GUC 
option to enable OIDs system-wide by default

The default 32-bit serial primary key isn't immune to rollovers either.

I doubt it'll affect my stuff, but it'll affect others.

Once you talk about storing petabytes or terabytes of data, 32 bits might
not be enough.

Cheerio,
Link.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[HACKERS] Re: [GENERAL] Vacuum and Transactions

2001-07-07 Thread Lincoln Yeoh

At 05:59 PM 7/6/01 -0400, Bruce Momjian wrote:

OK, I just talked to Tom on the phone and here is his idea for 7.2.  He
says he already posted this, but I missed it.

His idea is that in 7.2 VACUUM will only move rows within pages.  It
will also store unused space locations into shared memory to be used by
backends needing to add rows to tables.  Actual disk space compaction
will be performed by new a VACUUM FULL(?) command.

The default VACUUM will not lock the table but only prevent the table
from being dropped.

Would 7.2 maintain performance when updating a row repeatedly (update,
commit)? Right now performance goes down in a somewhat 1/x manner. It still
performs ok, but it's nice to have things stay blazingly fast.

If not, will the new vacuum restore the performance?

Or will we have to use VACUUM FULL?

Thanks,
Link.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[HACKERS] Re: New Linux xfs/reiser file systems

2001-05-05 Thread Lincoln Yeoh

At 02:09 AM 5/4/01 -0500, Thomas Swan wrote:
 I think it's worth noting that Oracle has been petitioning the
 kernel developers for better raw device support: in other words,
 the ability to write directly to the hard disk and bypassing the
 filesystem all together.   

But there could be other reasons why Oracle would want to do raw stuff.

1) They have more things to sell - management modules/software. More
training courses. Certified blahblahblah. More features in the brochure.
2) It just helps make things more proprietary. Think lock-in.

All that for maybe 10% performance increase?

I think it's more advantageous for Postgresql to keep the filesystem layer
of abstraction, than to do away with it, and later reinvent certain parts
of it along with new bugs.

What would be useful is if one could specify where the tables, indexes, WAL
and other files go. That feature would probably improve performance far
more.

For example, you could then stick the WAL on a battery-backed RAM disk.
How much total space does the WAL need?

A battery-backed RAM disk might even be cheaper than Brand X RDBMS
Proprietary Feature #5.
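Later releases did grow exactly this knob; a rough sketch with illustrative
paths (the first line is a shell command, the rest is SQL):

   initdb -D /srv/pgdata --waldir=/mnt/nvram/pg_wal
   CREATE TABLESPACE fastdisk LOCATION '/mnt/ssd/pgspace';
   CREATE TABLE big_table (id int, payload text) TABLESPACE fastdisk;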

Cheerio,
Link.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



[HACKERS] Re: scaling multiple connections

2001-04-27 Thread Lincoln Yeoh

At 08:39 AM 26-04-2001 -0400, mlw wrote:
I am getting a bit concerned about Postgres 7.1 performance with multiple
connections. Postgres does not seem to be scaling very well. Below there is a
list
of outputs from pgbench with different number of clients, you will see that

My postmaster start line looks like:
/usr/local/pgsql/bin/postmaster -A0 -N 24 -B 4096 -i -S -D/sqlvol/pgdev -o -F
-fs -S 2048

Maybe it's the -fs in your start up line.

I tried a similar start line as yours but without -fs and I get consistent
tps values for pgbench.

./pgbench -v  -c 1  -t 30 test
starting vacuum...end.
starting full vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 1
number of transactions per client: 30
number of transactions actually processed: 30/30
tps = 161.938949(including connections establishing)
tps = 180.060140(excluding connections establishing)
[lylyeoh@nimbus pgbench]$ ./pgbench -v  -c 3  -t 30 test
starting vacuum...end.
starting full vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 3
number of transactions per client: 30
number of transactions actually processed: 90/90
tps = 172.909666(including connections establishing)
tps = 189.845782(excluding connections establishing)
[lylyeoh@nimbus pgbench]$ ./pgbench -v  -c 4  -t 30 test
starting vacuum...end.
starting full vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 4
number of transactions per client: 30
number of transactions actually processed: 120/120
tps = 172.909417(including connections establishing)
tps = 189.319538(excluding connections establishing)

Tested machine is a Dell Poweredge 1300 uniprocessor PIII 500MHz with 128MB
RAM, and a single 9GB HDD.

With -fs there's a decrease, but not as marked as in your case. So I'm not
sure if it's really the problem.

Try that out.

Cheerio,
Link.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] Re: The new, the improved ... FTS Searching of Mailing List Archives

2001-04-27 Thread Lincoln Yeoh

At 03:44 PM 27-04-2001 -0300, The Hermit Hacker wrote:
On Fri, 27 Apr 2001, Bruce Momjian wrote:

  On Fri, 27 Apr 2001, Bruce Momjian wrote:
 
  Huh? *raised eyebrow*  This is a standalone application that they've
  donated to the project ... nothing that can be added to any of our
  distributions ...

 Isn't the text indexing something that can go into the distribution?

to the best of my knowledge, everything they had for public consumption
was added to v7.1, but Oleg would be better for that ... to get
fts.postgresql.org, there was nothing special I had to do as far as the
backend was concerned *shrug*

featurerequest
Well, if stuff like that ends up in Postgresql, would it be possible to
index LIKE '%xxx%' searches? That way all people would have to do is create
the relevant index with an fts_ops or something, and voila, LIKE '%xxx%'
searches become faster, with maybe some performance and disk-space hit for
inserts (see the sketch right after this feature request).

Would something like that be difficult to implement? I'm not sure how
function+fts index would work either.

I hope FTS for postgresql doesn't start looking like Oracle's
Context/Intermedia... Proprietary interfaces == lock in == ick.
/featurerequest
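Something very close to that wish did eventually appear via the pg_trgm
contrib module; a sketch, with made-up table and column names:

   CREATE EXTENSION pg_trgm;
   CREATE INDEX archive_body_trgm ON archive USING gin (body gin_trgm_ops);
   SELECT * FROM archive WHERE body LIKE '%deadlock%';  -- can now use the index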

Cheerio,
Link.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[HACKERS] Re: Re: Re: refusing connections based on load ...

2001-04-24 Thread Lincoln Yeoh

At 11:28 PM 24-04-2001 -0300, The Hermit Hacker wrote:

I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and
disks are pretty much sitting idle ...

It turns out, in this case, that vacuum was in order (i vacuum 12x per day
now instead of 6), so that now it will run with 300 simultaneous
connections, but with a loadavg of 68 or so, 300 connections are just
building on each other to slow the rest down :(


Hmm then maybe we should refuse connections based on need to vacuum... :).

Seriously though does the _total_ work throughput go down significantly
when you have high loads? 

I got a load of 13 with 25 concurrent connections (not much), and yeah,
things took longer, but the hits per second weren't much different from the
peak possible with fewer connections. Basically, in my case almost the same
amount of work is being done per second.

So maybe higher loads might be fine on your more powerful system?

Cheerio,
Link.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[HACKERS] Re: refusing connections based on load ...

2001-04-23 Thread Lincoln Yeoh

At 03:09 PM 23-04-2001 -0300, you wrote:

Anyone thought of implementing this, similar to how sendmail does it?  If
load  n, refuse connections?

Basically, it's great to set max clients to 256, but if load hits 50 as a
result, the database is nearly useless ... if you set it to 256, and 254
idle connections are going, load won't rise much, so it's safe, but if half
of those processes are active, it hurts ...

Sorry, but I still don't understand the reasons why one would want to do
this. Could someone explain?

I'm thinking that if I allow 256 clients, and my hardware/OS bogs down when
60 users are doing lots of queries, I either accept that, or figure that my
hardware/OS actually can't cope with that many clients and reduce the max
clients or upgrade the hardware (or maybe do a little tweaking here and
there).

Why not be more deterministic about refusing connections and stick to
reducing max clients? If not it seems like a case where you're promised
something but when you need it, you can't have it. 

Cheerio,
Link.




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



[HACKERS] Re: Re: Hey guys, check this out.

2001-04-16 Thread Lincoln Yeoh

At 08:38 PM 15-04-2001 -0700, you wrote:
On Sun, Apr 15, 2001 at 10:05:46PM -0400, Vince Vielhaber wrote:
 On Mon, 16 Apr 2001, Lincoln Yeoh wrote:
 
  Maybe you guys should get some Great Bridge marketing/PR person to handle
  stuff like this.
 
 After reading Ned's comments I figured that's how it got that way in
 the first place.  But that's just speculation.

You probably figured wrong.  

All those publications have editors who generally feel they're not 
doing their job if they don't introduce errors, usually without even 
talking to the reporter.  That's probably how the "FreeBSD" reference 
got in there: somebody saw "Berkeley" and decided "FreeBSD" would look 
more "techie".  It's stupid, but nothng to excoriate the reporter about.

Some time back we were announcing a product and practically wrote
everything for the journalists and gave it to them so that they could just
print it, and one newspaper still got LOTS of things wrong. In contrast,
another newspaper was much better though - it got the facts right.

The standards haven't changed much, so I don't really bother reading the
first newspaper for a lot of things. Whereas the 2nd one still seems to do
ok for tech stuff. 

They're very rarely 100% correct. But they're primarily journalists; if
they were even 99.99% correct about things, they'd probably be releasing
Postgresql 8 instead of you guys ;).

I believe you should choose your battles. Sometimes it's just not worth
fighting, not even worth commenting. Other times it's almost compulsory
even though there's no obvious/direct _personal_ gain in it.

Also look at the various stories and commentary floating about in the media
about the recent US-China plane incident. And what really happened? I
figure at least one of the planes should have a video recording of the
incident. But we have everyone guessing what happened instead. Doh.

Cheerio,
Link.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[HACKERS] Re: Hey guys, check this out.

2001-04-15 Thread Lincoln Yeoh

At 10:59 PM 14-04-2001 -0400, Lamar Owen wrote:
http://www.crn.com/Sections/Fast_Forward/fast_forward.asp?ArticleID=25670

Marc will be pleased to note that the PostgreSQL project came out of the
FreeBSD project, and is Great Bridge's database.  Gotta love
journalistic license.

Reporter must have missed the April 1st deadline.

;)

Still, I must point out that the article isn't negative. And if it's read
by PHBs it doesn't make a difference anyway (grin) - the general gist is:
it's decent, cheap, and virtually as good as proprietary databases.

So what if there are factual errors? This is mass media we're talking
about. Just point it out if it's really negative AND it's strategically
appropriate to do so.

By saying there's "nothing factual" does that apply to the following as well?

"Great Bridge is on the forefront of the open-source movement in providing
tools that are enterprisewide and capable, and what's compelling is they
provide the 24x7 support"

I think some of you guys are overreacting. It's almost like those FreeBSD
advocates slamming Tucows or something. 

Maybe you guys should get some Great Bridge marketing/PR person to handle
stuff like this.

Cheerio,
Link.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[HACKERS] Re: SIGTERM/FATAL error

2001-03-12 Thread Lincoln Yeoh

At 08:59 PM 11-03-2001 -0500, Bruce Momjian wrote:
How about "Connection terminated by administrator", or something like
that.

I prefer something closer to the truth.

e.g.
"Received SIGTERM, cancelling query and exiting"
(assuming it actually cancels the query).

But maybe I'm weird.

Cheerio,
Link.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



[HACKERS] Re: offset and limit in update and subselect

2001-02-25 Thread Lincoln Yeoh

At 05:07 PM 2/24/01 -0500, Tom Lane wrote:
is not a defined concept according to SQL.  Even if we allowed queries
such as you've described, the results would not be well-defined, but
would change at the slightest provocation.  The implementation feels
itself entitled to rearrange tuple order whenever the whim strikes it.

As the documentation tries hard to make plain, LIMIT/OFFSET are only
guaranteed to produce reproducible results if there's also an ORDER BY
that constrains the tuples into a unique ordering.

Hi,

Would it then be fine to use update ... limit in the following scenario?

I have a todo queue:

create table todo ( task text, pid int default 0);

The tasks are inserted into the todo table.

Then the various worker processes do the following update to grab tasks
without duplication.

update todo set pid=$mypid where pid=0 limit 1;

For me it doesn't matter which row each worker gets, as long as they only
get one each and they are not the same.

What would the performance impact of "order by" be in a LIMIT X case? Would
it require a full table scan?
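For what it's worth, much later releases allow FOR UPDATE in a subselect
and add SKIP LOCKED, so a sketch of how this queue-grab tends to be written
nowadays looks like:

   UPDATE todo SET pid = $mypid
    WHERE ctid IN (SELECT ctid FROM todo
                    WHERE pid = 0
                    LIMIT 1
                    FOR UPDATE SKIP LOCKED);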

Thanks,
Link.




[HACKERS] Re: Re: offset and limit in update and subselect

2001-02-25 Thread Lincoln Yeoh

At 04:58 PM 25-02-2001 -0500, Tom Lane wrote:

There's no LIMIT clause in UPDATE.  You could do something like

Oh. I thought 7.1 had that.

   BEGIN
   SELECT taskid FROM todo WHERE pid = 0 FOR UPDATE LIMIT 1;
   UPDATE todo SET pid = $mypid WHERE taskid = $selectedid;
   COMMIT

This is very similar to what I'm testing out in 7.0.3 - except I'm
currently trying  "order by random" to prevent blocking. This is because
all worker processes will tend to select stuff in the same order (in the
absence of inserts or updates on that table), and thus they will hit the
same first row (this is what I encountered last week - and I got the wrong
impression that all rows were locked).

What would happen if I rewrite that query to:

update todo set pid = $mypid where exists ( select taskid from todo where
pid = 0 for update limit 1);

This is pushing it, but I'm curious on what would happen :). 

I'll stick to doing it in two queries, and leave out the "order by random" -
a faster select versus less blocking.

Cheerio,
Link.




[HACKERS] Re: offset and limit in update and subselect

2001-02-25 Thread Lincoln Yeoh

At 11:16 PM 25-02-2001 -0500, Tom Lane wrote:

Right.  Only the first row is locked, but that doesn't help any.  "order
by random" sounds like it might be a good answer, if there aren't many
rows that need to be sorted.

Yep. I'll just see what happens in the testing stages.

 What would happen if I rewrite that query to:

 update todo set pid = $mypid where exists ( select task id from todo where
 pid = 0 for update limit 1);

Right now you get 

ERROR:  SELECT FOR UPDATE is not allowed in subselects

This is something that could be fixed if FOR UPDATE were a plan node
instead of a function done at the executor top level.

OK. Sounds like it won't be worth the trouble to do, plus deadlocks would
be real fun ;).

Cheerio,
Link.




[HACKERS] RE: Re: [ADMIN] v7.1b4 bad performance

2001-02-21 Thread Lincoln Yeoh

Oops.

I rechecked the startup script, and the 7.0.3 one doesn't have fsync turned
off or whatever. Dunno why I thought it did (heh, maybe because it was a lot
faster than 6.5.3!).

Hmm, this means 7.0.3 is quite fast...

Cheerio,
Link.





[HACKERS] RE: Re: [ADMIN] v7.1b4 bad performance

2001-02-21 Thread Lincoln Yeoh

Just another data point.

I downloaded a snapshot yesterday - Changelogs dated Feb 20 17:02

It's significantly slower than "7.0.3 with fsync off" for one of my webapps.

7.0.3 with fsync off gets me about 55 hits per sec max (however it's
interesting that the speed keeps dropping with continued tests).
( PostgreSQL 7.0.3 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66)

For 7.1b4 snapshot I get about 23 hits per second (drops gradually too).
I'm using Pg::DBD compiled using the 7.1 libraries for both tests.
(PostgreSQL 7.1beta4 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66)

For a simple "select only" webapp I'm getting 112 hits per sec for 7.0.3.
and 109 hits a sec for the 7.1 beta4 snapshot. These results remain quite
stable over many repeated tests.

The first webapp does a rollback, begin, select, update, commit, begin, a
bunch of selects in sequence and rollback. 

So my guess is that the 7.1 updates (with default fsync) are significantly
slower than 7.0.3 fsync=off now. 

But it's interesting that the updates slow things down significantly. Going
from 50 to 30 hits per second after a few thousand hits for 7.0.3, and 23
to 17 after about a thousand hits for 7.1beta4.


For postgresql 7.0.3 to speed things back up from 30 to 60 hits per sec I
had to do:

lylyeoh=# delete from session;
DELETE 1
lylyeoh=# vacuum; vacuum analyze;
VACUUM
NOTICE:  RegisterSharedInvalid: SI buffer overflow
NOTICE:  InvalidateSharedInvalid: cache state reset
VACUUM
(Not sure why the above happened, but I repeated the vacuum again for good
measure)

lylyeoh=# vacuum; vacuum analyze;
VACUUM
VACUUM

Then I ran the apachebench again (after visiting the webpage once to create
the session).

Note that even with only one row in the session table it kept getting
slower and slower as it kept getting updated, even when I kept trying to
vacuum and vacuum analyze it. I had to delete the row and vacuum; only then
was there a difference.

I didn't try this on 7.1beta4.

Cheerio,
Link.





[HACKERS] Re: beta5 ...

2001-02-16 Thread Lincoln Yeoh

At 04:17 PM 2/16/01 -0500, Tom Lane wrote:

Vadim says (and I agree) that we really ought to implement a new
lightweight lock manager that would fall between spinlocks and regular
locks in terms of overhead and functionality.  But it's not reasonable

Will there be an arbitrary user locking feature? E.g. lock on arbitrary
text string. That would be great :). 

BTW, is 7.1 going to be a bit slower than 7.0? Or just Beta 5? Just
curious. Don't mind waiting for 7.2 for the speed-up if necessary.

Cheerio,
Link.





[HACKERS] Re: Re: MySQL has transactions

2001-01-26 Thread Lincoln Yeoh

At 10:02 AM 1/25/01 -0500, you wrote:
 When Postgresql 6.5 came out it, it was VERY MUCH better ( many many
thanks
 to the developers and all involved). And I'm waiting for a solid 7.1 to
fix
 that 8KB issue.

Technically..

= BLCKSZ (can be up to  32k)

I've been using PostgreSQL with a 32k BLCKSZ since 7.0 (on a production
server) and haven't had a single problem..

Yep but doesn't quite help my webmail app :). 

I'm wondering if TOAST is going to be efficient enough for me to plonk
multimegabyte email attachments into the database. 

However, I also have a suspicion that there might be problems doing

INSERT INTO mytable (a) values ( 'aa...');

where aa... is a few megabytes long :). There's probably a query size limit
somewhere between my app and TOAST.

Cheerio,
Link.




RE: [HACKERS] Lock on arbitrary string feature

2001-01-11 Thread Lincoln Yeoh

At 09:20 AM 11-01-2001 -0800, Mikheev, Vadim wrote:
 In contrast the current alternatives appear to be either LOCK 
 the entire table (preventing ALL inserts and selects),

SHARE ROW EXCLUSIVE mode doesn't prevent selects...

Sorry, I meant all inserts and selects on the locked table. At least so far
it seems to block those selects in 7.0.3 (I hope it does in all cases! If
not uhoh!).

 or to create a UNIQUE constraint (forcing complete rollbacks
 and restarts in event of a collision :( ).

Hopefully, savepoints will be in 7.2

Yep that'll solve some things. Still think the getlock feature will be very
handy in many other cases.

BTW would there be a significant performance/resource hit with savepoints?

 Any comments, suggestions or tips would be welcome. It looks 
 like quite a complex thing to do - I've only just started
 looking at the postgresql internals and the lock manager.

It's very easy to do (from my PoV -:)) We need yet another
pseudo table like the one we use in XactLockTableInsert/XactLockTableWait
- try to look there...

Thanks!

I think by the time I succeed Postgresql will be version 7.2 or even 8 :).

Cheerio,
Link.




Re: [HACKERS] Lock on arbitrary string feature

2001-01-11 Thread Lincoln Yeoh

At 01:26 PM 11-01-2001 -0500, Tom Lane wrote:
Lincoln Yeoh [EMAIL PROTECTED] writes:
 GETLOCK "string" will lock on "string", the lock being only released at the
 end of a transaction.
However, the whole thing strikes me as more of an ugly kluge than a
clean solution to the real problem.  If you're not using a UNIQUE

But doesn't that go well with SQL :). The joys of INSERT vs UPDATE.

And "select .. for update" too! So far I haven't left out any "for
updates", at least I think so ;). 

I did consider using select for update to simulate it but it doesn't work
when the values are very variable.

application-level lock.  So, as Vadim remarked, doing the insert and
rolling back to a savepoint on failure would be a much better answer.

Yep, savepoints will allow better consistency. But a getlock feature can be
very handy in lots of other scenarios.

BTW, you should consider whether you couldn't use the existing USERLOCK
feature as a short-term alternative.  If you can squeeze the key value
you need to insert into a user lock tag, that will do as well as your
proposed general-string-tag locks.

Looks interesting. Probably what it does is similar enough to what I'm
trying to do. Copy from the best :).

But meantime, back to lock table...

Cheerio,
Link.






[HACKERS] Re: Lock on arbitrary string feature

2001-01-11 Thread Lincoln Yeoh

At 09:38 AM 11-01-2001 -0800, Adam Haberlach wrote:
   We do something like this with listen/notify pairs.  To synchronize
two clients, we have them each listen for the other's token string,
send a notify, and then block on select(), checking for incoming
notifications.  When they get the notification, they send a notify back
to the other side to un-block it.

   If anything, it would be nice if there were a way to make a LISTEN
block the connection on a specific event tag, which is essentially what
we are doing in our interface library.

Actually, what you are talking about is almost the inverse of this locking
thing. One is "stop until it's ok to go"; the other is "stop if it's not ok
to go".

You're looking for a WAIT for "notification" feature :). 

I actually was looking for this too, and I thought I was the only one
interested in this. Wow a 100% increase in interest ;). 

I'm also trying to see how this can be done. It looks a lot easier to do
than the getlock feature. But I can't figure out what to select/wait/snooze
on when the routine is on the inside looking out (async.c:
Async_Wait(char *relname) - yeah, oxymoronic, I know), rather than outside
looking in (in which case it's select() on PQsocket or something like that).
I would like to use as little CPU as possible when waiting - think of
postgresql on battery-powered wearable "servers" + wireless LAN.
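The "outside looking in" version, for comparison, is just a pair like this
(a sketch; the channel name is made up):

   -- session A: declare interest, then block on select() against PQsocket()
   LISTEN peer_ready;
   -- session B: wake A up
   NOTIFY peer_ready;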

Cheerio,
Link.




[HACKERS] Lock on arbitrary string feature

2001-01-10 Thread Lincoln Yeoh

Hi,

Has anyone any input to offer on adding an arbitrary locking feature?

Where
GETLOCK "string" will lock on "string", the lock being only released at the
end of a transaction.

While the lock is held, other processes trying to do GETLOCK "string" will
block until the lock is released.

This feature can allow applications to better serialize things. For
example: inserting unique records. Cooperating applications could just do
something like:

GETLOCK "mytable.key2=1234";
SELECT count(*) from mytable where key2=1234 for update;
if count==0, insert the stuff.
 elsif count==1 update the stuff instead
 else something is wrong!

The lock will thus only affect applications interested in mytable where
key2=1234

In contrast the current alternatives appear to be either LOCK the entire
table (preventing ALL inserts and selects), or to create a UNIQUE
constraint (forcing complete rollbacks and restarts in event of a collision
:( ).

Any comments, suggestions or tips would be welcome. It looks like quite a
complex thing to do - I've only just started looking at the postgresql
internals and the lock manager.
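With the advisory locks that later releases grew, GETLOCK on a string can
be approximated by folding the string to an integer key - a sketch (note
that count(*) ... for update isn't allowed, so the check here is a plain
count):

   BEGIN;
   SELECT pg_advisory_xact_lock(hashtext('mytable.key2=1234'));
   SELECT count(*) FROM mytable WHERE key2 = 1234;
   -- count 0 -> insert the stuff; count 1 -> update it; else complain
   COMMIT;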

Cheerio,
Link.