Re: [HACKERS] Unicode support
>>>>> "Gregory" == Gregory Stark writes:

 >>> I don't believe that the standard forbids the use of combining
 >>> chars at all. RFC 3629 says:
 >>>
 >>> ... This issue is amenable to solutions based on Unicode
 >>> Normalization Forms, see [UAX15].

 Gregory> This is the relevant part. Tom was claiming that the UTF8
 Gregory> encoding required normalizing the string of unicode
 Gregory> codepoints before encoding. I'm not sure that's true though,
 Gregory> is it?

FWIW, the SQL spec puts the onus of normalization squarely on the
application; the database is allowed to assume that Unicode strings are
already normalized, is allowed to behave in implementation-defined ways
when presented with strings that aren't normalized, and provision of
normalization functions and predicates is just another optional feature.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
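[Editorial aside, not part of the thread: the precomposed vs. combining-sequence distinction being argued about here can be seen directly with Python's standard `unicodedata` module. A minimal sketch — both strings below are valid UTF-8 for "é", yet compare unequal until normalized per UAX #15:]

```python
import unicodedata

# The "same" character as two distinct, equally valid code point sequences:
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301"    # U+0065 + U+0301 COMBINING ACUTE ACCENT

# They render identically but compare unequal and differ in length.
print(precomposed == combining)          # False
print(len(precomposed), len(combining))  # 1 2

# Normalizing (NFC here) makes them comparable, as UAX #15 suggests.
nfc = unicodedata.normalize("NFC", combining)
print(nfc == precomposed)                # True
```

This is exactly the string-matching hazard the RFC 3629 excerpt quoted earlier in the thread warns about, and why the SQL spec's "assume the application normalized it" stance matters.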
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
2009/4/14 Josh Berkus :
>
>>> - what if I need to know about operators, operator classes, schemas,
>>> etc etc
>>
>> Fine, let's log this info for those too (or else decide they're too
>> obscure and don't - pg_class and pg_proc are certainly the most
>> interesting cases).
>
> I would suggest putting this info in a separate table, pg_change. It would
> have oid, catalog, user_changed, changed_on. That way we could simply keep
> the data for all objects which have an OID.
>
> This would also supposedly allow us to track drops if we wanted.
>
> We'd have to check on overhead of this, though, and maybe make it a GUC to
> track it.
>
> This would also be tremendously useful to supplement replication systems.

I thought about it too. But I am not sure this isn't too complicated a
solution for a simple task. Thinking about it a little more - the most
important piece is the timestamp of the last change.

regards
Pavel Stehule

> --
> Josh Berkus
> PostgreSQL Experts Inc.
> www.pgexperts.com
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
On Mon, Apr 13, 2009 at 7:06 PM, Tom Lane wrote:
> Josh Berkus writes:
>> I would suggest putting this info in a separate table, pg_change. It
>> would have oid, catalog, user_changed, changed_on. That way we could
>> simply keep the data for all objects which have an OID.
>
> That makes more sense to me --- it would easily extend to all cases
> and would not impose any overhead (in the form of useless columns)
> for catalogs that you didn't want to track in a particular case.
>
> The main problem that would have to be considered is how to flush
> no-longer-useful entries (which of course entails deciding which
> those are).

I kinda think that the only thing that's going to make sense here is
to drop the pg_change entries when the object is dropped. Now,
admittedly, that means you can't track drops. But otherwise, you have
the potential for pg_change to get really big and full of cruft, and I
don't think there's going to be an easy way to garbage collect it.

I really like the basic design, though.

...Robert
Re: [HACKERS] join ordering
On Mon, Apr 13, 2009 at 7:17 PM, Tom Lane wrote:
> Robert Haas writes:
>> This isn't a very good plan. What we should do is first join the
>> values expression against bar, and then join the resulting rows
>> against foo. The optimizer doesn't want to do that, and I think the
>> reason is because it knows that the left join might introduce null
>> values into the result of (VALUES (...) LEFT JOIN bar) which would
>> then cause the join against foo to produce different results.
>
> Exactly. Inner and outer joins don't commute in general.
>
>> But in
>> practice, since foo.id is not null and = is strict, it's equivalent to
>> the following, which the planner handles much better.
>
> Nonsense; those conditions are not sufficient to prove what you wish.
> I think it is actually true given that the foreign key relationship
> together with the not null on foo_id (NOT foo.id) implies that every row
> of bar must have a join partner in foo; but not without that.

Yeah, good point.

> If we had any FK analysis in the optimizer (which we don't at present)
> I think the deduction you'd really want is that foo can be removed from
> the query altogether, because actually every row of bar must have
> *exactly* one join partner in foo, and we don't care about the values of
> foo otherwise.

The way I set up this particular example, that's true, but suppose foo
had another column which the SELECT pulled into the output. In that
case, the FK analysis wouldn't permit removing the join altogether, but
it would permit reordering it. I think that:

A inner join B on Pab = A left join B on Pab

...given that Pab is a set of equality constraints setting columns of A
equal to the columns of B to which they are mapped by a foreign key
constraint, and given further that at least one of these columns is NOT
NULL in A.

In some cases this can be a big win, because it means that this join can
commute with either inner joins or left joins (but once we commute it
with a left join it turns into a plain left join, and we can't go back
to handling it as an inner join).

...Robert
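[Editorial aside, not part of the thread: the identity Robert claims — that an inner join equals a left join when the join columns are covered by a foreign key and are NOT NULL on the referencing side — can be checked concretely. A small Python simulation (table contents are made up for illustration):]

```python
# foo(id) and bar(id, foo_id), where bar.foo_id is NOT NULL and is an
# FK into foo(id): every bar row has exactly one join partner in foo.
foo = [{"id": i} for i in range(1, 4)]
bar = [{"id": 10 + i, "foo_id": (i % 3) + 1} for i in range(6)]

foo_ids = {f["id"] for f in foo}

# Inner join: only bar rows with a matching foo row survive.
inner = [(b["id"], b["foo_id"])
         for b in bar for f in foo if b["foo_id"] == f["id"]]

# Left join: every bar row survives, null-extended if unmatched.
left = [(b["id"], b["foo_id"] if b["foo_id"] in foo_ids else None)
        for b in bar]

# With the FK + NOT NULL guarantees, no row is ever null-extended,
# so the two joins produce identical results.
print(sorted(inner) == sorted(left))  # True
```

Drop the FK guarantee (give some bar row a foo_id not present in foo) and the two results diverge, which is why the planner can't assume this equivalence without FK analysis.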
Re: [HACKERS] Unicode support
- - writes:
>>> The original post seemed to be a contrived attempt to say "you should
>>> use ICU".
>>
>> Indeed. The OP should go read all the previous arguments about ICU
>> in our archives.
>
> Not at all. I just was making a suggestion. You may use any other
> library or implement it yourself (I even said that in my original
> post). www.unicode.org - the official website of the Unicode
> consortium - has a complete database of all Unicode characters which
> can be used as a basis.
>
> But if you want to ignore the normalization/multiple code point issue,
> point 2--the collation problem--still remains. And given that even a
> crappy database like MySQL supports Unicode collation, this isn't
> something to be ignored, IMHO.

Sure, supporting multiple collations in a database is definitely a
known missing feature. There is a lot of work required to do it; a
patch to do so was too late to make it into 8.4 and required more work,
so hopefully the issues will be worked out for 8.5. I suggest you read
the old threads and make any contributions you can suggesting how to
solve the problems that arose.

>> I don't believe that the standard forbids the use of combining chars at all.
>> RFC 3629 says:
>>
>> ... This issue is amenable to solutions based on Unicode Normalization
>> Forms, see [UAX15].

This is the relevant part. Tom was claiming that the UTF8 encoding
required normalizing the string of unicode codepoints before encoding.
I'm not sure that's true though, is it?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!
Re: [HACKERS] psql with "Function Type" in \df
On Mon, Apr 13, 2009 at 07:24:31PM -0400, Tom Lane wrote:
> David Fetter writes:
> > Here's a patch that adds a "Function Type" column to \df while
> > removing the now-redundant \da.
>
> Removing \da altogether was nowhere in the consensus, or even in the
> discussion AFAIR.

It's back.

> Also, what is the point of using single-letter type codes when
> you've made the column header about as verbose as it could get?  I'd
> go for something like
>
>         Type
>
>         window
>         agg
>         trigger
>         normal
>
> Or we could spell out "aggregate", but that makes the column a
> couple of characters wider ...

Done. I've also added \df[antw], which lets people narrow their search.

Cheers,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10d42ca..272f19b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1043,14 +1043,15 @@ testdb=>
 Lists available functions, together with their argument and
-return types. If pattern
-is specified, only functions whose names match the pattern are shown.
-If the form \df+ is used, additional information about
-each function, including volatility, language, source code and description, is shown.
-By default, only user-created objects are shown; supply a
-pattern or the S modifier to include system
-objects.
+return types and their function type: 'normal', 'agg',
+'trigger', and 'window'. If pattern is specified, only
+functions whose names match the pattern are shown. If the
+form \df+ is used, additional information
+about each function, including volatility, language, source
+code and description, is shown. By default, only user-created
+objects are shown; supply a pattern or the
+S modifier to include system objects.
@@ -1064,6 +1065,81 @@ testdb=>
+
+\dfa[S+] [ pattern ]
+
+
+
+Lists available aggregate functions, together with their argument and
+return types. If pattern is specified, only
+aggregate functions whose names match the pattern are shown.
+If the form \dfa+ is used, additional
+information about each function, including volatility,
+language, source code and description, is shown. By default,
+only user-created objects are shown; supply a pattern or the
+S modifier to include system objects.
+
+
+
+
+
+\dfn[S+] [ pattern ]
+
+
+
+Lists available normal functions, together with their argument
+and return types. If pattern is specified, only
+normal functions whose names match the pattern are shown.
+If the form \dfn+ is used, additional
+information about each function, including volatility,
+language, source code and description, is shown. By default,
+only user-created objects are shown; supply a pattern or the
+S modifier to include system objects.
+
+
+
+
+
+\dft[S+] [ pattern ]
+
+
+
+Lists available trigger functions, together with their argument and
+return types. If pattern is specified, only
+trigger functions whose names match the pattern are shown.
+If the form \dft+ is used, additional
+information about each function, including volatility,
+language, source code and description, is shown. By default,
+only user-created objects are shown; supply a pattern or the
+S modifier to include system objects.
+
+
+
+
+
+\dfw[S+] [ pattern ]
+
+
+
+Lists available windowing functions, together with
+their argument and return types. If pattern is specified, only
+windowing functions whose names match the pattern are shown.
+If the form \dfw+ is used, additional
+information about each function, including volatility,
+language, source code and description, is shown. By default,
+only user-created objects are shown; supply a pattern or the
+S modifier to include system objects.
+
+
+
+
+
 \dF[+] [ pattern ]
diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml
index fc56c3d..6802590 100644
--- a/doc/src/sgml/release.sgml
+++ b/doc/src/sgml/release.sgml
@@ -183,6 +183,16 @@ do it for earlier branch release files.
+
+
+ In psql, \df now shows which type of function it is: 'n' for
+ normal, 'a' for aggregate, 't' for trigg
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
Headline generation uses hlCover to get fragments in text with *all*
query items. In case there is no such fragment, it does not return
anything.

What you are asking will either require returning *maximally* matching
covers or handling it as a separate case.

-Sushant.

On Mon, 2009-04-13 at 20:57 -0400, Tom Lane wrote:
> Sushant Sinha writes:
> > Sorry for the delay. Here is the patch with FragmentDelimiter option.
> > It requires an extra option in HeadlineParsedText and uses that option
> > during generateHeadline.
>
> I did some editing of the documentation for this patch and noticed that
> the explanation of the fragment-based headline method says
>
>    If not all query words are found in the
>    document, then a single fragment of the first MinWords
>    in the document will be displayed.
>
> (That's what it says now, that is, based on my editing and testing of
> the original.)  This seems like a pretty dumb fallback approach ---
> if you have only a partial match, the headline generation suddenly
> becomes about as stupid as it could possibly be.  I could understand
> doing the above if the text actually contains *none* of the query
> words, but surely if it contains some of them we should still select
> fragments centered on those words.
>
>                        regards, tom lane
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
Sushant Sinha writes:
> Headline generation uses hlCover to get fragments in text with *all*
> query items. In case there is no such fragment, it does not return
> anything.

> What you are asking will either require returning *maximally* matching
> covers or handling it as a separate case.

Efficiently being useless is still useless --- a headline selection
function needs to be robust, not fragile, and not doing anything useful
for a partial match sounds pretty fragile to me.

                        regards, tom lane
Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline
Sushant Sinha writes:
> Sorry for the delay. Here is the patch with FragmentDelimiter option.
> It requires an extra option in HeadlineParsedText and uses that option
> during generateHeadline.

I did some editing of the documentation for this patch and noticed that
the explanation of the fragment-based headline method says

   If not all query words are found in the
   document, then a single fragment of the first MinWords
   in the document will be displayed.

(That's what it says now, that is, based on my editing and testing of
the original.)  This seems like a pretty dumb fallback approach ---
if you have only a partial match, the headline generation suddenly
becomes about as stupid as it could possibly be.  I could understand
doing the above if the text actually contains *none* of the query
words, but surely if it contains some of them we should still select
fragments centered on those words.

                        regards, tom lane
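[Editorial aside, not part of the thread: the partial-match behavior Tom is asking for — select fragments centered on whichever query words *do* appear, instead of falling back to the first MinWords — can be sketched in a few lines. This is a hypothetical illustration, not tsearch2's actual hlCover algorithm; the function name, `radius` parameter, and tokenization are all invented for the example:]

```python
def fragments(words, query, radius=3):
    """Pick snippets centered on any query word present (hypothetical sketch)."""
    hits = [i for i, w in enumerate(words) if w.lower() in query]
    spans = []
    for i in hits:
        lo, hi = max(0, i - radius), min(len(words), i + radius + 1)
        if spans and lo <= spans[-1][1]:   # merge overlapping windows
            spans[-1] = (spans[-1][0], hi)
        else:
            spans.append((lo, hi))
    return [" ".join(words[a:b]) for a, b in spans]

doc = "the quick brown fox jumps over the lazy dog near the river bank".split()
# Only two of three query words occur; we still center fragments on them
# rather than blindly showing the document's opening words.
print(fragments(doc, {"fox", "bank", "missing"}))
```

Even with "missing" absent from the document, the output is two fragments centered on "fox" and "bank" — the robust degradation Tom argues for, as opposed to the all-or-nothing cover matching Sushant describes.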
Re: [HACKERS] Solution of the file name problem of copy on windows.
Tom Lane wrote:
> Itagaki Takahiro writes:
>> Here is a patch to implement GetPlatformEncoding() and convert absolute
>> file paths from database encoding to platform encoding.
>
> This seems like a fairly significant overhead added to solve a really
> minor problem (if it's not minor why has it never come up before?).

It's not always a minor problem in Japan. It has been discussed in the
users group in Japan several times. However, surely I should pay
attention to the performance. One of the solutions might be to cache
the encoding in GetPlatformEncoding(). There will be no overhead when
the database encoding and platform encoding are the same, which would
be the typical case.

> It should not be necessary to repeat all
> this for every file access within the database directory.

That's why I added checking with is_absolute_path() there. We can avoid
conversion in normal file access under PGDATA because relative paths
are used for it. But I should have checked all file access, not only in
the backends but also in the client programs. I'll research them...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Re: [HACKERS] psql with "Function Type" in \df
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> Here's a patch that adds a "Function Type" column to \df while
> removing the now-redundant \da.

1. How does it make it redundant - is there a way to view all
aggregates with \df now?

2. Even if the above is satisfied, I think we need a little more
discussion before completely removing a now-functioning backslash
command.

Other than that, +1 ;)

- --
Greg Sabino Mullane g...@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200904131945
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAknjzrgACgkQvJuQZxSWSsjoWQCfSNxVbmL85Z6FDMQOu8rmsHxh
wuwAn1t0pmQ8cqI/e3m+3eADi7cMTGOm
=wKRy
-----END PGP SIGNATURE-----
Re: [HACKERS] psql with "Function Type" in \df
David Fetter writes:
> Here's a patch that adds a "Function Type" column to \df while
> removing the now-redundant \da.

Removing \da altogether was nowhere in the consensus, or even in the
discussion AFAIR.

Also, what is the point of using single-letter type codes when you've
made the column header about as verbose as it could get?  I'd go for
something like

        Type

        window
        agg
        trigger
        normal

Or we could spell out "aggregate", but that makes the column a couple
of characters wider ...

                        regards, tom lane
Re: [HACKERS] join ordering
Robert Haas writes:
> This isn't a very good plan. What we should do is first join the
> values expression against bar, and then join the resulting rows
> against foo. The optimizer doesn't want to do that, and I think the
> reason is because it knows that the left join might introduce null
> values into the result of (VALUES (...) LEFT JOIN bar) which would
> then cause the join against foo to produce different results.

Exactly. Inner and outer joins don't commute in general.

> But in
> practice, since foo.id is not null and = is strict, it's equivalent to
> the following, which the planner handles much better.

Nonsense; those conditions are not sufficient to prove what you wish.
I think it is actually true given that the foreign key relationship
together with the not null on foo_id (NOT foo.id) implies that every
row of bar must have a join partner in foo; but not without that.

If we had any FK analysis in the optimizer (which we don't at present)
I think the deduction you'd really want is that foo can be removed from
the query altogether, because actually every row of bar must have
*exactly* one join partner in foo, and we don't care about the values
of foo otherwise.

                        regards, tom lane
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Josh Berkus writes:
> I would suggest putting this info in a separate table, pg_change. It
> would have oid, catalog, user_changed, changed_on. That way we could
> simply keep the data for all objects which have an OID.

That makes more sense to me --- it would easily extend to all cases
and would not impose any overhead (in the form of useless columns)
for catalogs that you didn't want to track in a particular case.

The main problem that would have to be considered is how to flush
no-longer-useful entries (which of course entails deciding which
those are).

                        regards, tom lane
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
>> - what if I need to know about operators, operator classes, schemas,
>> etc etc
>
> Fine, let's log this info for those too (or else decide they're too
> obscure and don't - pg_class and pg_proc are certainly the most
> interesting cases).

I would suggest putting this info in a separate table, pg_change. It
would have oid, catalog, user_changed, changed_on. That way we could
simply keep the data for all objects which have an OID.

This would also supposedly allow us to track drops if we wanted.

We'd have to check on the overhead of this, though, and maybe make it a
GUC to track it.

This would also be tremendously useful to supplement replication
systems.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
[HACKERS] join ordering
I have a query that performs poorly which can be simplified to the
following test case (v8.3.6).

CREATE TABLE foo (id integer, primary key (id));
INSERT INTO foo SELECT generate_series(1,10);
CREATE TABLE bar (id integer,
    foo_id integer not null references foo (id),
    PRIMARY KEY (id));
INSERT INTO bar SELECT g, g % 10 + 1 FROM generate_series(1,1) g;
ANALYZE;

EXPLAIN ANALYZE
SELECT v.id FROM (VALUES (1, 1)) v (id, bar_id)
LEFT JOIN (bar JOIN foo ON bar.foo_id = foo.id) ON v.bar_id = bar.id;

                                   QUERY PLAN
 Nested Loop Left Join  (cost=1.23..408.74 rows=1 width=4) (actual time=0.405..63.585 rows=1 loops=1)
   Join Filter: ("*VALUES*".column2 = bar.id)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1)
   ->  Hash Join  (cost=1.23..283.73 rows=1 width=4) (actual time=0.367..49.029 rows=1 loops=1)
         Hash Cond: (bar.foo_id = foo.id)
         ->  Seq Scan on bar  (cost=0.00..145.00 rows=1 width=8) (actual time=0.042..15.562 rows=1 loops=1)
         ->  Hash  (cost=1.10..1.10 rows=10 width=4) (actual time=0.143..0.143 rows=10 loops=1)
               ->  Seq Scan on foo  (cost=0.00..1.10 rows=10 width=4) (actual time=0.086..0.105 rows=10 loops=1)
 Total runtime: 63.893 ms
(9 rows)

This isn't a very good plan. What we should do is first join the values
expression against bar, and then join the resulting rows against foo.
The optimizer doesn't want to do that, and I think the reason is
because it knows that the left join might introduce null values into
the result of (VALUES (...) LEFT JOIN bar) which would then cause the
join against foo to produce different results. But in practice, since
foo.id is not null and = is strict, it's equivalent to the following,
which the planner handles much better.

EXPLAIN ANALYZE
SELECT v.id FROM (VALUES (1, 1)) v (id, bar_id)
LEFT JOIN (bar LEFT JOIN foo ON bar.foo_id = foo.id) ON v.bar_id = bar.id;

                                   QUERY PLAN
--------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..8.57 rows=1 width=4) (actual time=0.079..0.150 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..8.29 rows=1 width=8) (actual time=0.058..0.120 rows=1 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.01 rows=1 width=8) (actual time=0.006..0.008 rows=1 loops=1)
         ->  Index Scan using bar_pkey on bar  (cost=0.00..8.27 rows=1 width=8) (actual time=0.039..0.044 rows=1 loops=1)
               Index Cond: ("*VALUES*".column2 = bar.id)
   ->  Index Scan using foo_pkey on foo  (cost=0.00..0.27 rows=1 width=4) (actual time=0.012..0.015 rows=1 loops=1)
         Index Cond: (bar.foo_id = foo.id)
 Total runtime: 0.312 ms
(8 rows)

...Robert
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Pavel Stehule writes:
> this my proposal is very simple. It help to people who have to manage
> large or complex database system. Important data are date of creating
> and date of altering tables and stored procedures. These data cannot
> be modified by user, so implementation doesn't need any new
> statements.

ISTM anyone who thinks they need this actually needs a full DDL log; or
at least, if we give them this, they will be back next week asking for
a full log. So it'd save a lot of work to tell them to just log their
DDL to start with.

Some obvious objections to the simple approach:

- what if I want to know *who* made the change
- what if I need to know about the change before last
- what if I need to know about a DROP
- what if I need to know about operators, operator classes, schemas,
  etc etc

                        regards, tom lane
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
2009/4/13 Kevin Grittner :
> Pavel Stehule wrote:
>> Important data are date of creating and date of altering tables
>> and stored procedures. These data cannot be modified by user, so
>> implementation doesn't need any new statements.
>>
>> Notes, objections?
>
> This feature has been present in other database products I've used,
> and I occasionally miss it.
>
> PostgreSQL doesn't have stored procedures, but has functions.

sure, I know :)

> This doesn't sound like something to propose past feature freeze, so
> I assume you're asking about a potential 8.5 feature.

yes, it's for 8.5

Pavel

> -Kevin
[HACKERS] Regression failure on RHEL 4 w/ PostgreSQL 8.4 beta1
Hi,

I'm getting the following failure on RHEL 4:

http://www.gunduz.org/temp/regression.out
http://www.gunduz.org/temp/regression.diffs

Here is the Makefile.regress that I use while building RPMs on 8.4:

https://projects.commandprompt.com/public/pgcore/repo/rpm/redhat/8.4/postgresql/EL-4/Makefile.regress

Is that an issue with the Makefile, or with the OS?

Regards,
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Pavel Stehule wrote:
> Important data are date of creating and date of altering tables
> and stored procedures. These data cannot be modified by user, so
> implementation doesn't need any new statements.
>
> Notes, objections?

This feature has been present in other database products I've used,
and I occasionally miss it.

PostgreSQL doesn't have stored procedures, but has functions.

This doesn't sound like something to propose past feature freeze, so
I assume you're asking about a potential 8.5 feature.

-Kevin
Re: [HACKERS] Unicode support
Tom Lane wrote:
> Greg Stark writes:
>> Is it really true that canonical encodings never contain any composed
>> characters in them? I thought there were some glyphs which could only
>> be represented by composed characters.
>
> AFAIK that's not true. However, in my original comment I was thinking
> about UTF16 surrogates, which are something else entirely --- so I
> withdraw that. I'm still dubious that it is our job to deal with
> non-normalized characters, though.

Like it or not, they are part of Unicode and they are very much valid
Unicode. They are not in violation of the standard. This has nothing to
do with the encoding.

There are also code points which specify the direction of text (e.g.
needed if you want to embed a Hebrew quote in English text). To count
that as a character seems wrong.

>> The original post seemed to be a contrived attempt to say "you should
>> use ICU".
>
> Indeed. The OP should go read all the previous arguments about ICU
> in our archives.

Not at all. I just was making a suggestion. You may use any other
library or implement it yourself (I even said that in my original
post). www.unicode.org - the official website of the Unicode
consortium - has a complete database of all Unicode characters which
can be used as a basis.

But if you want to ignore the normalization/multiple code point issue,
point 2--the collation problem--still remains. And given that even a
crappy database like MySQL supports Unicode collation, this isn't
something to be ignored, IMHO.

Andrew Dunstan wrote:
> Tom Lane wrote:
>> Andrew Dunstan writes:
>>> This isn't about the number of bytes, but about whether or not we
>>> should count characters encoded as two or more combined code points
>>> as a single char or not.
>>
>> It's really about whether we should support non-canonical encodings.
>> AFAIK that's a hack to cope with implementations that are restricted
>> to UTF-16, and we should Just Say No. Clients that are sending these
>> things converted to UTF-8 are in violation of the standard.
>
> I don't believe that the standard forbids the use of combining chars
> at all. RFC 3629 says:
>
>    Security may also be impacted by a characteristic of several
>    character encodings, including UTF-8: the "same thing" (as far as a
>    user can tell) can be represented by several distinct character
>    sequences. For instance, an e with acute accent can be represented
>    by the precomposed U+00E9 E ACUTE character or by the canonically
>    equivalent sequence U+0065 U+0301 (E + COMBINING ACUTE). Even though
>    UTF-8 provides a single byte sequence for each character sequence,
>    the existence of multiple character sequences for "the same thing"
>    may have security consequences whenever string matching, indexing,
>    searching, sorting, regular expression matching and selection are
>    involved. An example would be string matching of an identifier
>    appearing in a credential and in access control list entries. This
>    issue is amenable to solutions based on Unicode Normalization Forms,
>    see [UAX15].

Exactly my point.

Best Regards.
Re: [HACKERS] Unicode support
Tom Lane wrote:
> Andrew Dunstan writes:
>> This isn't about the number of bytes, but about whether or not we
>> should count characters encoded as two or more combined code points
>> as a single char or not.
>
> It's really about whether we should support non-canonical encodings.
> AFAIK that's a hack to cope with implementations that are restricted
> to UTF-16, and we should Just Say No. Clients that are sending these
> things converted to UTF-8 are in violation of the standard.

I don't believe that the standard forbids the use of combining chars at
all. RFC 3629 says:

   Security may also be impacted by a characteristic of several
   character encodings, including UTF-8: the "same thing" (as far as a
   user can tell) can be represented by several distinct character
   sequences. For instance, an e with acute accent can be represented
   by the precomposed U+00E9 E ACUTE character or by the canonically
   equivalent sequence U+0065 U+0301 (E + COMBINING ACUTE). Even though
   UTF-8 provides a single byte sequence for each character sequence,
   the existence of multiple character sequences for "the same thing"
   may have security consequences whenever string matching, indexing,
   searching, sorting, regular expression matching and selection are
   involved. An example would be string matching of an identifier
   appearing in a credential and in access control list entries. This
   issue is amenable to solutions based on Unicode Normalization Forms,
   see [UAX15].

cheers

andrew
Re: [HACKERS] Unicode support
Greg Stark writes: > Is it really true that canonical encodings never contain any composed > characters in them? I thought there were some glyphs which could only > be represented by composed characters. AFAIK that's not true. However, in my original comment I was thinking about UTF16 surrogates, which are something else entirely --- so I withdraw that. I'm still dubious that it is our job to deal with non-normalized characters, though. > The original post seemed to be a contrived attempt to say "you should > use ICU". Indeed. The OP should go read all the previous arguments about ICU in our archives. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
On Mon, Apr 13, 2009 at 9:15 PM, Tom Lane wrote: > Andrew Dunstan writes: >> This isn't about the number of bytes, but about whether or not we should >> count characters encoded as two or more combined code points as a single >> char or not. > > It's really about whether we should support non-canonical encodings. > AFAIK that's a hack to cope with implementations that are restricted > to UTF-16, and we should Just Say No. Clients that are sending these > things converted to UTF-8 are in violation of the standard. Is it really true that canonical encodings never contain any composed characters in them? I thought there were some glyphs which could only be represented by composed characters. Also, users can construct strings of unicode code points themselves in SQL using || or other text operators. That said, my impression is that composed character support is pretty thin on the ground elsewhere as well, but I don't have much first-hand experience. The original post seemed to be a contrived attempt to say "you should use ICU". If composed character support were a show-stopper and there was no other way to get it then it might be convincing, but I don't know that it is and I don't know that ICU is the only place to get it. And I'm sure it's not the only way to handle multiple encodings in a database. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Andrew Dunstan writes: > This isn't about the number of bytes, but about whether or not we should > count characters encoded as two or more combined code points as a single > char or not. It's really about whether we should support non-canonical encodings. AFAIK that's a hack to cope with implementations that are restricted to UTF-16, and we should Just Say No. Clients that are sending these things converted to UTF-8 are in violation of the standard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
On Mon, Apr 13, 2009 at 1:32 PM, Tom Lane wrote: > Pavel Stehule writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; While I agree with that, what I actually think would be useful is to have the DDL log in a separate file, because it is a security log, and I will probably want different info in it than in a log used to measure average performance. -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
On Mon, Apr 13, 2009 at 2:32 PM, Tom Lane wrote: > Pavel Stehule writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. DDL logs are good, but you generally can't keep them around forever, so it's helpful to have some basic information that occupies O(1) space. So based on that I'd respond to these objections as follows: > Some obvious objections to the simple approach: > - what if I want to know *who* made the change Fine, let's log the OID of the creator and of the person who made the last change, too (or else decide that the dependency problems are too thorny - we'd need to set this to NULL if a role is dropped - and don't). > - what if I need to know about the change before last Tough, you should have a DDL log. > - what if I need to know about a DROP Tough, you should have a DDL log. > - what if I need to know about operators, operator classes, schemas, etc > etc Fine, let's log this info for those too (or else decide they're too obscure and don't - pg_class and pg_proc are certainly the most interesting cases). In my applications, these requirements come up frequently for user data and I've handled it by adding creation_time, creator_id, last_updated_time, last_updater_id columns to nearly every table that users can modify. It satisfies 90% of the auditing requirements for 10% of the work, and there's nothing to say that a more elaborate mechanism can't be built and used where necessary (which I have also done - but only for particularly critical data). 
...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Alvaro Herrera wrote: > - - wrote: >> 1) Functions like char_length() or length() do NOT return the number of >> characters (the manual says they do), instead they return the number of >> code points. > > I think you have client_encoding misconfigured. > > alvherre=# select length('á'::text); > length > 1 > (1 fila) Umm, but isn't that because your encoding is using one code point? See the OP's explanation w.r.t. canonical equivalence. This isn't about the number of bytes, but about whether or not we should count characters encoded as two or more combined code points as a single char or not. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
Alvaro Herrera wrote: >> 1) Functions like char_length() or length() do NOT return the number >> of characters (the manual says they do), instead they return the >> number of code points. > > I think you have client_encoding misconfigured. > > alvherre=# select length('á'::text); > length > > 1 > (1 fila) The OP didn't say it returned the number of bytes. Since you found that this character was stored in only two bytes, it must have been one two-byte code point. I think storing it as two code points would have taken at least three bytes (one for the letter and two for the accent), no? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
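Kevin's byte arithmetic checks out: in UTF-8, the precomposed U+00E9 occupies two bytes, while the decomposed pair U+0065 U+0301 occupies three (one for the ASCII letter plus two for the combining accent). A quick Python check, purely for illustration:

```python
# UTF-8 encodings of the two canonically equivalent spellings.
precomposed = "\u00e9".encode("utf-8")   # é as one code point: 0xC3 0xA9
decomposed = "e\u0301".encode("utf-8")   # 'e' (1 byte) + combining acute (2 bytes)

# Two bytes precomposed - consistent with pg_column_size('á') = 3
# minus one byte of varlena overhead - versus three bytes decomposed.
assert len(precomposed) == 2
assert len(decomposed) == 3
```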
Re: [HACKERS] Regression failure on RHEL 4 w/ PostgreSQL 8.4 beta1
Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= writes: > I'm getting the following failure on RHEL 4: > http://www.gunduz.org/temp/regression.out > http://www.gunduz.org/temp/regression.diffs This test is checking whether you have working 64-bit tzdata support. It seems you don't. If you built with --with-system-tzdata, and RHEL4 doesn't include 64-bit tzdata files, then this failure would be expected. (I'm not totally sure about the second premise, but some quick digging in the specfile's changelog suggests that Red Hat only started to support 64-bit tzdata in RHEL5.) I'm not sure whether to recommend not using --with-system-tzdata. If you don't then routine platform updates of tzdata won't help Postgres; that might or might not be worse than not having working post-2038 DST calculations. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode support
- - wrote: > 1) Functions like char_length() or length() do NOT return the number > of characters (the manual says they do), instead they return the > number of code points. I think you have client_encoding misconfigured. alvherre=# select length('á'::text); length 1 (1 fila) alvherre=# select pg_column_size('á'); pg_column_size 3 (1 fila) (there's one byte of overhead here, so á is two bytes) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Tom Lane wrote: > Pavel Stehule writes: >> this my proposal is very simple. It help to people who have to >> manage large or complex database system. Important data are date of >> creating and date of altering tables and stored procedures. These >> data cannot be modified by user, so implementation doesn't need any >> new statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week > asking for a full log. So it'd save a lot of work to tell them to > just log their DDL to start with. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change > - what if I need to know about the change before last > - what if I need to know about a DROP > - what if I need to know about operators, operator classes, schemas, > etc etc Well, in a situation where you've got 80-some production databases and dozens of development databases (the number changes from day to day as new projects create code forks and others merge back in) it is occasionally useful to get simple information such as Pavel proposes from the system tables. I don't think that anyone would expect the system tables to track the complete history -- just to save someone time tracking down the complete record when such simple information would suffice. In terms of value -- I was wishing I had it just last week -- it would have saved me a few minutes. It was probably two or three months before that when I last wished for it. Definitely not huge from my perspective -- just an occasional convenience which some other DBMS products provide. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] psql with "Function Type" in \df
Folks, Here's a patch that adds a "Function Type" column to \df while removing the now-redundant \da. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 10d42ca..4cd1d27 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -873,22 +873,6 @@ testdb=> - -\da[S] [ pattern ] - - - -Lists all available aggregate functions, together with their -return type and the data types they operate on. If pattern -is specified, only aggregates whose names match the pattern are shown. -By default, only user-created objects are shown; supply a -pattern or the S modifier to include system -objects. - - - - \db[+] [ pattern ] @@ -1043,11 +1027,13 @@ testdb=> Lists available functions, together with their argument and -return types. If pattern -is specified, only functions whose names match the pattern are shown. -If the form \df+ is used, additional information about -each function, including volatility, language, source code and description, is shown. +return types and their function type: 'n' for normal, 'a' for +aggregates, 't' for trigger, and 'w' for windowing. If +pattern is +specified, only functions whose names match the pattern are +shown. If the form \df+ is used, +additional information about each function, including +volatility, language, source code and description, is shown. By default, only user-created objects are shown; supply a pattern or the S modifier to include system objects. diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml index fc56c3d..469397a 100644 --- a/doc/src/sgml/release.sgml +++ b/doc/src/sgml/release.sgml @@ -183,6 +183,14 @@ do it for earlier branch release files. + + + The \da command is no longer in psql. 
Instead, \df now shows + which type of function it is: 'n' for normal, 'a' for aggregate, + 't' for trigger, and 'w' for windowing. + + + diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index b39466d..3a1c8a4 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -347,9 +347,6 @@ exec_command(const char *cmd, /* standard listing of interesting things */ success = listTables("tvs", NULL, show_verbose, show_system); break; - case 'a': - success = describeAggregates(pattern, show_verbose, show_system); - break; case 'b': success = describeTablespaces(pattern, show_verbose); break; diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 731baf8..1b9ae49 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -48,77 +48,6 @@ static void printACLColumn(PQExpBuffer buf, const char *colname); * */ - -/* \da - * Takes an optional regexp to select particular aggregates - */ -bool -describeAggregates(const char *pattern, bool verbose, bool showSystem) -{ - PQExpBufferData buf; - PGresult *res; - printQueryOpt myopt = pset.popt; - - initPQExpBuffer(&buf); - - printfPQExpBuffer(&buf, - "SELECT n.nspname as \"%s\",\n" - " p.proname AS \"%s\",\n" - " pg_catalog.format_type(p.prorettype, NULL) AS \"%s\",\n", - gettext_noop("Schema"), - gettext_noop("Name"), - gettext_noop("Result data type")); - - if (pset.sversion >= 80200) - appendPQExpBuffer(&buf, - " CASE WHEN p.pronargs = 0\n" - "THEN CAST('*' AS pg_catalog.text)\n" - "ELSE\n" - " pg_catalog.array_to_string(ARRAY(\n" - " SELECT\n" - " pg_catalog.format_type(p.proargtypes[s.i], NULL)\n" - " FROM\n" - " pg_catalog.generate_series(0, pg_catalog.array_upper(p.proargtypes, 1)) AS s(i)\n" -
Re: [HACKERS] proposal: add columns created and altered to pg_proc and pg_class
2009/4/13 Tom Lane : > Pavel Stehule writes: >> this my proposal is very simple. It help to people who have to manage >> large or complex database system. Important data are date of creating >> and date of altering tables and stored procedures. These data cannot >> be modified by user, so implementation doesn't need any new >> statements. > > ISTM anyone who thinks they need this actually need a full DDL log; > or at least, if we give them this, they will be back next week asking > for a full log. So it'd save a lot of work to tell them to just log > their DDL to start with. Yes, that is a solution, but this method isn't very practical. You have to play games with grep and regular expressions to get the info. It's similar with the info about autovacuum. > > Some obvious objections to the simple approach: > - what if I want to know *who* made the change That would be interesting, but I see two problems: a) users can be dropped, b) lots of firms use a special no-login user for creating objects. But it makes sense. > - what if I need to know about the change before last That is outside the scope of a relational database. It should be solved via triggers on DDL statements. Your hypothetical request goes much further - you would have to store the state before each DDL statement. At this moment I don't want to create a complete system like CVS, and I would not do it in the future either. When PostgreSQL has DDL statement triggers or something similar, people will build their own complete systems. > - what if I need to know about a DROP It's similar to creation and modification times in file systems. When you need something special, you need special techniques - watchers, ... > - what if I need to know about operators, operator classes, schemas, etc > etc + databases; all of it is important, so why not? All your objections are reasonable, but I propose what I think is most important, and the rest could be added later (step by step). 
I am sure all this information could be taken from outside sources, and the reason why I need it could be removed via some procedural or organisational rules. One sample from my current practice: I am working as a database architect in a large firm. I have full access to postgres on the develop, deploy and preprod environments, but I haven't any special access on production. I am able to see the logs on production. But when I want to check whether some patch was applied, I have to search the logs - and logs are split and rotated - so for me the important information is the last change on tables or functions. Of course, there are other kinds of objects, but I have never needed this info for them. regards Pavel Stehule > regards, tom lane > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #4027: backslash escapingnotdisabledinplpgsql
Tom Lane wrote: > "Kevin Grittner" writes: >> a change to CREATE FUNCTION such that there is an implied SET >> standard_compliant_strings FROM CURRENT Hopefully obvious, I meant standard_conforming_strings. > it seems like a really bad idea. Then perhaps a note in the PL/pgSQL docs about the importance of specifying that clause if the function contains any character string literals which include a backslash? Such a note should probably point out that without this clause, the runtime value of any such literal will be dependent on the value of standard_conforming_strings when the plan is generated. I think that many will find that behavior surprising; so if it's not feasible to change it, we should at least document it. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
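To see why the plan-time value of the GUC matters, here is a rough Python model of how the body of an ordinary '...' literal is read under the two settings. This is an illustration only, not PostgreSQL's actual lexer: the same source text produces different runtime strings depending on standard_conforming_strings.

```python
def read_literal(body, standard_conforming_strings):
    """Toy model: how backslashes in a '...' literal body are interpreted."""
    if standard_conforming_strings:
        return body  # backslash is just an ordinary character
    # Non-conforming mode: process C-style escapes (a simplified subset);
    # an unrecognized escape just drops the backslash.
    escapes = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(escapes.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

src = r"C:\new\table"  # literal text as written inside a function body
assert read_literal(src, True) == r"C:\new\table"    # backslashes survive intact
assert read_literal(src, False) == "C:\new\table"    # \n and \t become control chars
```

A plpgsql function whose plan was built while one setting was in effect keeps that interpretation until the plan is rebuilt, which is the surprise such a documentation note would warn about.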
Re: [HACKERS] Affected rows count by rule as condtition
This will deny inserting a value that already exists, which is ok. But the second scenario in which a unique constraint refuses an operation is when you try to update more rows to the same value in a column with a unique constraint. So I need to use the count of affected rows to deny the operation if there is more than one. I am using rules as a layer to save every version of a row in a shadow table, so I can't use a unique constraint on the column, because many versions may have the same value. Robert Haas wrote: On Mon, Apr 13, 2009 at 12:12 PM, mito wrote: Hi, is there any way how to count affected rows by on update rule and use it as part of conditions. Example: CREATE OR REPLACE RULE "_UPDATE" AS ON UPDATE TO "users" DO INSTEAD ( UPDATE "s_users" SET id = new.id, login = new.login, WHERE id IN (SELECT id FROM "s_users" ) AND 2 > (SELECT count(new.id)) ; Error: aggregate functions not allowed in WHERE statement It need to simulate unique constraint on field s_users.new_id, so it should deny to update multiple rows with same value. Any suggestions are welcome. Well, you could probably make this compile by rewriting the broken part as "SELECT SUM(1) FROM s_users WHERE id = NEW.id", but it won't guarantee uniqueness in the face of concurrent transactions, even if you use SERIALIZABLE mode. There's a reason that unique constraints are built into the database; you should use them. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] proposal: add columns created and altered to pg_proc and pg_class
Hello, this proposal of mine is very simple. It helps people who have to manage large or complex database systems. The important data are the date of creation and the date of last alteration of tables and stored procedures. These data cannot be modified by the user, so the implementation doesn't need any new statements. Notes, objections? Regards Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Unicode support
Hi. While PostgreSQL is a great database, it lacks some fundamental Unicode support. I want to present some points that have--to my knowledge--not been addressed so far. In the following text, it is assumed that the database and client encoding is UTF-8. 1) Functions like char_length() or length() do NOT return the number of characters (the manual says they do); instead they return the number of code points. To illustrate the difference, I want to quote a Wikipedia article (http://en.wikipedia.org/wiki/Unicode#Ready-made_versus_composite_characters): "For example é can be represented in Unicode as U+0065 (Latin small letter e) followed by U+0301 (combining acute) but it can also be represented as the precomposed character U+00E9 (Latin small letter e with acute). So in many cases, users have many ways of encoding the same character. To deal with this, Unicode provides the mechanism of canonical equivalence." Also, the character U+1EBF (ế), used in Vietnamese, has both an acute and a circumflex accent. An equivalent code point sequence is U+0065 (e) U+0302 (circumflex accent) U+0301 (acute accent). So, in the first example, the Unicode codepoints U+0065 U+0301 are in fact _one_ character (and are valid Unicode). These two codepoints should be rendered as one character and be indistinguishable from U+00E9. char_length() and length() are agnostic of this. Also, there are quite a few zero-width code points (like code points reserved for surrogate pairs, byte order marks, etc) which are not characters at all. As described in 3), there also exist Unicode normalization algorithms, which can decompose one character into multiple code points. In some Asian scripts, this may boost the number of codepoints to a much higher number than the number of actual characters. I guess a quick fix would be to change the definition of length() to return the number of code points in case of an Unicode encoding. 2) PG has no support for the Unicode collation algorithm. 
Collation is offloaded to the OS, which makes this quite inflexible. This point is further described here in case of the website last.fm: http://russ.garrett.co.uk/tag/postgresql/ . This article also contains a link to a quick fix in form of a 3rd party module. However, in my humble opinion, this functionality should be part of PG itself: the collation algorithm provided by Unicode allows many different locales to be specified. For example, you may want to sort text using the 'de_DE' locale in one query, and use the 'en_GB' locale in the next query. This functionality may be needed in databases which store multi-lingual text. There are also some characters or sequences of characters which are considered equivalent in some languages. As an example, 'Umlaut a' is considered to be the same as the character sequence 'ae' in German. You might argue: okay, then just set your OS locale to German. But what happens if text from more languages than just German is stored in the database? There are multiple instances of such equivalent characters in many languages. Some languages like traditional Spanish treat 'ch' as one character which is sorted between 'c' and 'd'. German does not. Storing both German and Spanish text in a database requires flexible collation, but this cannot be done if the collation is done at the OS level. In the latter case, you have to set a locale at the beginning and stick with it till the end of time. Also, the functions upper() and lower() depend on the locale. 3) PG has no support for Unicode normalization. This relates to the problem explained in 1). Because Unicode often has multiple code point sequences to represent the same character, it is necessary to bring Unicode strings into a "normalized" form, in order to compare them (Unicode has 4 normalization forms, as explained in http://en.wikipedia.org/wiki/Unicode_normalization). 
If you just compare by code points, visually indistinguishable characters may not compare as equal if they have a different code point sequence (for example, if they come from different sources, say two users). I'd even go so far as to say that this might pose a security problem. Consider a security-relevant function which compares two strings and misses the instances where the code point sequence is not the same, but the character sequence is. The user is totally unaware of this, since the different code point sequences appear as the same character sequence to him/her, if the font and rendering is done correctly. 4) I don't know if it's easier to just implement a new type 'unicode' that deals with all the unique Unicode characteristics, or to extend the already existing functions that deal with 'text' types. But I think to just ignore all these points is quite dangerous, considering the gain in popularity Unicode has experienced so far and the necessity to support more than just the Latin alphabet. If you consider implementing the points I explained above, the ICU library might be an option. It's ope
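The matching hazard in point 3) can be sketched with Python's standard unicodedata module (a sketch of the idea only; the function and variable names here are invented): two visually identical identifiers with different code point sequences fail a naive comparison, and comparing after normalization - any one of the four forms NFC, NFD, NFKC, NFKD, applied consistently - closes the gap.

```python
import unicodedata

def same_identifier(a, b):
    # Compare after canonical composition (NFC); a naive a == b would
    # treat the two spellings below as different identifiers.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

acl_entry = "jos\u00e9"    # "josé" with precomposed é
credential = "jose\u0301"  # "josé" with e + combining acute; renders identically

assert acl_entry != credential          # naive comparison silently mismatches
assert same_identifier(acl_entry, credential)
```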
Re: [HACKERS] Affected rows count by rule as condtition
On Mon, Apr 13, 2009 at 12:59 PM, mito wrote: > I am using rules as layer to save every version of row in shadow table, so i > cant use unique constraint on column, because of many versions may have same > value. Use a partial index. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Affected rows count by rule as condtition
I am using rules as a layer to save every version of a row in a shadow table, so I can't use a unique constraint on the column, because many versions may have the same value. mito Jaime Casanova wrote: On Mon, Apr 13, 2009 at 11:12 AM, mito wrote: It needs to simulate a unique constraint on the field s_users.new_id, so it should deny updating multiple rows to the same value. Any suggestions are welcome. why not simply create a UNIQUE constraint? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New trigger option of pg_standby
On Mon, 2009-04-13 at 14:52 +0900, Fujii Masao wrote: > if (triggered) > { > if (smartMode && nextWALfile exists) > exit(0) > else > { > delete trigger file > exit(1) > } > } This looks to be the correct one. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
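The quoted pseudocode can be restated as a small executable sketch (Python purely for illustration; pg_standby itself is C and the names here are invented). Exit status 0 lets the restore command keep serving the remaining WAL; exit status 1 ends recovery, and the trigger file is only consumed on that path:

```python
def check_trigger(triggered, smart_mode, next_wal_exists, remove_trigger=lambda: None):
    """Sketch of the smart-mode trigger handling in the quoted pseudocode."""
    if not triggered:
        return None          # not triggered: keep waiting for WAL as usual
    if smart_mode and next_wal_exists:
        return 0             # serve the remaining WAL files before failing over
    remove_trigger()         # consume the trigger file so recovery ends cleanly
    return 1

# Smart mode keeps restoring while WAL remains, then fails over.
assert check_trigger(True, smart_mode=True, next_wal_exists=True) == 0
assert check_trigger(True, smart_mode=True, next_wal_exists=False) == 1
assert check_trigger(False, smart_mode=True, next_wal_exists=True) is None
```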
Re: [HACKERS] Affected rows count by rule as condtition
On Mon, Apr 13, 2009 at 12:12 PM, mito wrote: > Hi, > is there any way how to count affected rows by on update rule and use it as > part of conditions. > > Example: > > > CREATE OR REPLACE RULE "_UPDATE" AS ON UPDATE TO "users" DO INSTEAD ( > UPDATE "s_users" SET > id = new.id, > login = new.login, > WHERE id IN (SELECT id FROM "s_users" ) AND 2 > (SELECT count(new.id)) ; > > Error: aggregate functions not allowed in WHERE statement > > It needs to simulate a unique constraint on field s_users.new_id, so it should > deny updating multiple rows to the same value. > > Any suggestions are welcome. Well, you could probably make this compile by rewriting the broken part as "SELECT SUM(1) FROM s_users WHERE id = NEW.id", but it won't guarantee uniqueness in the face of concurrent transactions, even if you use SERIALIZABLE mode. There's a reason that unique constraints are built into the database; you should use them. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Affected rows count by rule as condtition
On Mon, Apr 13, 2009 at 11:12 AM, mito wrote: > > It need to simulate unique constraint on field s_users.new_id, so it should > deny to update multiple rows with same value. > > Any suggestions are welcome. > why not simply create a UNIQUE constraint? -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Affected rows count by rule as condtition
Hi, is there any way to count the rows affected by an ON UPDATE rule and use that count as part of the conditions? Example: CREATE OR REPLACE RULE "_UPDATE" AS ON UPDATE TO "users" DO INSTEAD ( UPDATE "s_users" SET id = new.id, login = new.login WHERE id IN (SELECT id FROM "s_users" ) AND 2 > (SELECT count(new.id)) ; Error: aggregate functions not allowed in WHERE statement It needs to simulate a unique constraint on the field s_users.new_id, so it should deny updating multiple rows to the same value. Any suggestions are welcome. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solution of the file name problem of copy on windows.
Hi. Anyhow, I appreciate the discussion. ----- Original Message ----- From: "Tom Lane" > Itagaki Takahiro writes: >> Here is a patch to implement GetPlatformEncoding() and convert absolute >> file paths from database encoding to platform encoding. > This seems like a fairly significant overhead added to solve a really > minor problem (if it's not minor why has it never come up before?). I'm > also not convinced by any of the details --- why are GetACP and > pg_get_encoding_from_locale the things to look at, and why is fd.c an > appropriate place to hook in? Surely if we need it here, we need it in > places like initdb as well. But really this is much too low a level to be > solving the problem at. If we have to convert path encodings in the > backend, we should be doing it once somewhere around the place where we > identify the value of PGDATA. It should not be necessary to repeat all > this for every file access within the database directory. Ahh, I think this is a sensitive problem and requires careful handling too. However, the following test is shown in order to help your understanding. This is a case which does not work unless Itagaki-san's patch is applied.

C:\work>set PGDATA=C:\tmp\日本語 data
C:\work>set PGPORT=5444
C:\work>set PGHOME=C:\MinGW\local\pgsql
C:\work>cmd.exe
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\work>initdb -E UTF-8 --no-locale
The files belonging to this database system will be owned by user "HIROSHI".
This user must also own the server process.
The database cluster will be initialized with locale C.
The default text search configuration will be set to "english".
fixing permissions on existing directory C:/tmp/日本語 data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in C:/tmp/日本語 data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

WARNING: enabling "trust" authentication for local connections.
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

Success. You can now start the database server using:
"postmaster" -D "C:/tmp/日本語 data"
or
"pg_ctl" -D "C:/tmp/日本語 data" -l logfile start

C:\work>set PGCLIENTENCODING=SJIS
C:\work>psql postgres
psql (8.4beta1)
Type "help" for help.

postgres=# create table 日本語(きー text);
CREATE TABLE
postgres=# insert into 日本語 values('いれた');
INSERT 0 1
postgres=# copy 日本語 to 'C:/tmp/日本語 data/日本語utf8.txt';
COPY 1
postgres=# delete from 日本語;
DELETE 1
postgres=# copy 日本語 from 'C:/tmp/日本語 data/日本語utf8.txt';
COPY 1
postgres=# select * from 日本語;
 きー
--------
 いれた
(1 row)

C:\work>dir "C:\tmp\日本語 data"
 Volume in drive C is SYS
 Volume Serial Number is 1433-2C7C

 Directory of C:\tmp\日本語 data

2009/04/13 23:22 <DIR>        .
2009/04/13 23:22 <DIR>        ..
2009/04/13 23:18 <DIR>        base
2009/04/13 23:19 <DIR>        global
2009/04/13 23:17 <DIR>        pg_clog
2009/04/13 23:17        3,616 pg_hba.conf
2009/04/13 23:17        1,611 pg_ident.conf
2009/04/13 23:17 <DIR>        pg_multixact
2009/04/13 23:23 <DIR>        pg_stat_tmp
2009/04/13 23:17 <DIR>        pg_subtrans
2009/04/13 23:17 <DIR>        pg_tblspc
2009/04/13 23:17 <DIR>        pg_twophase
2009/04/13 23:17            4 PG_VERSION
2009/04/13 23:17 <DIR>        pg_xlog
2009/04/13 23:17       17,112 postgresql.conf
2009/04/13 23:19           38 postmaster.opts
2009/04/13 23:19           24 postmaster.pid
2009/04/13 23:22            8 日本語utf8.txt
         7 File(s)         22,413 bytes
        11 Dir(s)  42,780,246,016 bytes free

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add a test for pg_get_functiondef()
Abhijit Menon-Sen writes:
> [ a test whose purpose he didn't bother to describe ]

What is the value of this? It seems far more likely to cause
maintenance pain than to catch anything interesting.

			regards, tom lane
Re: [HACKERS] Solution of the file name problem of copy on windows.
Itagaki Takahiro writes:
> Here is a patch to implement GetPlatformEncoding() and convert absolute
> file paths from database encoding to platform encoding.

This seems like a fairly significant overhead added to solve a really
minor problem (if it's not minor why has it never come up before?). I'm
also not convinced by any of the details --- why are GetACP and
pg_get_encoding_from_locale the things to look at, and why is fd.c an
appropriate place to hook in? Surely if we need it here, we need it in
places like initdb as well.

But really this is much too low a level to be solving the problem at.
If we have to convert path encodings in the backend, we should be doing
it once somewhere around the place where we identify the value of
PGDATA. It should not be necessary to repeat all this for every file
access within the database directory.

			regards, tom lane
Re: [HACKERS] New trigger option of pg_standby
Hi,

On Mon, Apr 13, 2009 at 7:21 PM, Guillaume Smet wrote:
> On Mon, Apr 13, 2009 at 7:52 AM, Fujii Masao wrote:
>> 1. the trigger file containing "smart" is created.
>> 2. pg_standby is executed.
>>   2-1. nextWALfile is restored.
>>   2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
>> 3. the restored nextWALfile is applied.
>> 4. pg_standby is executed again to restore nextWALfile+1.
>
> I don't think it should happen. IMHO, it's an acceptable compromise to
> replay all the WAL files present when I created the trigger file. So
> if I have the smart shutdown trigger file and I don't have any
> nextWALfile+1, I can remove the trigger file and stop the recovery:
> pg_standby won't be executed again after that, even if a nextWALfile+1
> appeared while replaying the previous WAL file.

The scenario I described is not related to whether nextWALfile+1 exists or not. To clarify:

If pg_standby restores nextWALfile, deletes the trigger file and exits with 1 (i.e., tells the startup process that recovery has ended), the startup process considers that pg_standby failed, and tries to read nextWALfile in pg_xlog instead of the restored file named "RECOVERYXLOG". This is undesirable behavior, because some transactions would be lost if nextWALfile doesn't exist in pg_xlog. So, exit(0) should be called when nextWALfile exists.

On the other hand, if pg_standby restores nextWALfile, deletes the trigger file and calls exit(0), the startup process replays the restored file and tries to read nextWALfile+1, because it doesn't know whether nextWALfile is the last valid WAL file. So, pg_standby may be executed again even after the trigger file is deleted.

Am I missing something?

> That said, stupid question: do we have a way to know the nextWALfile+1
> name to test if it exists? nextWALfile is transmitted through the
> restore_command API and I'm wondering if we can have the nextWALfile+1
> name without changing the restore_command API.

Probably yes; the following three steps are required, I think:

- Get the timeline, logid and segid from the name of nextWALfile.
- Increment the logid and segid pair using the NextLogSeg macro.
- Calculate the name of nextWALfile+1 using the XLogFileName macro.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] New trigger option of pg_standby
On Mon, Apr 13, 2009 at 7:52 AM, Fujii Masao wrote:
> 1. the trigger file containing "smart" is created.
> 2. pg_standby is executed.
>   2-1. nextWALfile is restored.
>   2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
> 3. the restored nextWALfile is applied.
> 4. pg_standby is executed again to restore nextWALfile+1.

I don't think it should happen. IMHO, it's an acceptable compromise to replay all the WAL files present when I created the trigger file. So if I have the smart shutdown trigger file and I don't have any nextWALfile+1, I can remove the trigger file and stop the recovery: pg_standby won't be executed again after that, even if a nextWALfile+1 appeared while replaying the previous WAL file.

That said, stupid question: do we have a way to know the nextWALfile+1 name to test if it exists? nextWALfile is transmitted through the restore_command API and I'm wondering if we can have the nextWALfile+1 name without changing the restore_command API.

--
Guillaume
Re: [HACKERS] Solution of the file name problem of copy on windows.
Itagaki Takahiro wrote:
> "Hiroshi Saito" wrote:
>
> > Um, I had a focus in help the problem which is not avoided.
> > I am not sensitive to a problem being avoided depending on usage.
> > However, I will wish to work spontaneously, when it is help much.
>
> I'll research whether encoding of filesystem path is affected by
> locale settings or not in some platforms. Also, we need to research
> where we should get the system encoding when the locale is set to "C",
> which is popular in Japanese users.

Here is a patch to implement GetPlatformEncoding() and convert absolute file paths from the database encoding to the platform encoding. Since path encodings are converted in AllocateFile() and BasicOpenFile(), not only COPY TO/FROM but almost all file operations are covered by the patch. Callers of the file access methods don't have to modify their code.

Please test the patch on a variety of platforms. I tested it on Windows and Linux, and found that {PG_UTF8, "ANSI_X3.4-1968"} is required in encoding_match_list in src/port/chklocale.c on Linux (FC6).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

GetPlatformEncoding.patch
Description: Binary data
Re: [HACKERS] [PATCH] Add a test for pg_get_functiondef()
Sorry, I screwed up a little in sending that patch. Here it is again as an attachment.

-- ams

diff --git a/src/test/regress/sql/defs.sql b/src/test/regress/sql/defs.sql
new file mode 100644
index 000..cf8fff3
--- /dev/null
+++ b/src/test/regress/sql/defs.sql
@@ -0,0 +1,24 @@
+-- Test pg_get_functiondef()
+
+CREATE SCHEMA foo;
+SET search_path = public,foo,pg_catalog;
+CREATE DOMAIN foo."evil domain" as text;
+CREATE DOMAIN foo."date" as text;
+
+CREATE FUNCTION "$$evil"(out foo integer, inout bar date, in "evil domain", in anyelement)
+  returns setof record stable strict security definer cost 123 rows 2
+  language plpgsql as
+$f$
+  declare r record;
+  begin
+    for r in select * from "$$evil"(null,null,null) loop
+      foo := r.foo;
+      bar := r.bar;
+      return next;
+    end loop;
+  end;
+$f$;
+
+SELECT pg_get_functiondef('$$evil'::regproc::oid);
+
+DROP SCHEMA foo CASCADE;
diff --git a/src/test/regress/expected/defs.out b/src/test/regress/expected/defs.out
new file mode 100644
index 000..5d0670a
--- /dev/null
+++ b/src/test/regress/expected/defs.out
@@ -0,0 +1,43 @@
+-- Test pg_get_functiondef()
+CREATE SCHEMA foo;
+SET search_path = public,foo,pg_catalog;
+CREATE DOMAIN foo."evil domain" as text;
+CREATE DOMAIN foo."date" as text;
+CREATE FUNCTION "$$evil"(out foo integer, inout bar date, in "evil domain", in anyelement)
+  returns setof record stable strict security definer cost 123 rows 2
+  language plpgsql as
+$f$
+  declare r record;
+  begin
+    for r in select * from "$$evil"(null,null,null) loop
+      foo := r.foo;
+      bar := r.bar;
+      return next;
+    end loop;
+  end;
+$f$;
+SELECT pg_get_functiondef('$$evil'::regproc::oid);
+                                          pg_get_functiondef
+-------------------------------------------------------------------------------------------------------
+ CREATE OR REPLACE FUNCTION public."$$evil"(OUT foo integer, INOUT bar date, "evil domain", anyelement)
+  RETURNS SETOF record
+  LANGUAGE plpgsql
+  STABLE STRICT SECURITY DEFINER COST 123 ROWS 2
+ AS $function$
+   declare r record;
+   begin
+     for r in select * from "$$evil"(null,null,null) loop
+       foo := r.foo;
+       bar := r.bar;
+       return next;
+     end loop;
+   end;
+ $function$
+ 
+(1 row)
+
+DROP SCHEMA foo CASCADE;
+NOTICE:  drop cascades to 3 other objects
+DETAIL:  drop cascades to type "evil domain"
+drop cascades to type date
+drop cascades to function "$$evil"(date,"evil domain",anyelement)
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 3b1d843..f1e3dbc 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -77,7 +77,7 @@ test: misc
 # ----------
 # Another group of parallel tests
 # ----------
-test: select_views portals_p2 rules foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window
+test: select_views portals_p2 rules foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window defs
 
 # ----------
 # Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index b7984ed..5dca5d3 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -121,3 +121,4 @@ test: with
 test: xml
 test: stats
 test: tablespace
+test: defs