Re: [HACKERS] ltree PostgreSQL Module
On Fri, 12 Nov 2004, John Hansen wrote: I miss about UTF-8 :) ltree doesn't supports UTF-8 yet. ok,. how about all the 'other' characters from us-ascii : ,[EMAIL PROTECTED]&*()_+-=[]{}\|'"?><`~ these 'should' all be valid for the ltxtquery, ltree, and ltree[] types, except maybe for . which is used as seperator (and maybe . should be valid too, if prepended with a '\', just as you would with a regex to make the next character a literal). I agree with you, but Unfortunately, it's not easy and we have no time to work on ltree now. ... John Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] multiline CSV fields
Patrick B Kelly <[EMAIL PROTECTED]> writes: > I may not be explaining myself well or I may fundamentally > misunderstand how copy works. Well, you're definitely ignoring the character-set-conversion issue. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] multiline CSV fields
Can I see an example of such a failure line? --- Andrew Dunstan wrote: > > Darcy Buskermolen has drawn my attention to unfortunate behaviour of > COPY CSV with fields containing embedded line end chars if the embedded > sequence isn't the same as those of the file containing the CSV data. In > that case we error out when reading the data in. This means there are > cases where we can produce a CSV data file which we can't read in, which > is not at all pleasant. > > Possible approaches to the problem: > . make it a documented limitation > . have a "csv read" mode for backend/commands/copy.c:CopyReadLine() that > relaxes some of the restrictions on inconsistent line endings > . escape embedded line end chars > > The last really isn't an option, because the whole point of CSVs is to > play with other programs, and my understanding is that those that > understand multiline fields (e.g. Excel) expect them not to be escaped, > and do not produce them escaped. > > So right now I'm tossing up in my head between the first two options. Or > maybe there's another solution I haven't thought of. > > Thoughts? > > cheers > > andrew > > ---(end of broadcast)--- > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] newbie compile question. please help
Problem solved. I added some printf statements in order do print some debug information. The tools like pg_ctl always run postmaster -V in order to get the version number. The printf statements that I added, printed extra info which pg_ctl could not "understand". Therefore I got the error. Regards, Gevik -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dave Cramer Sent: Friday, November 12, 2004 1:01 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] newbie compile question. please help Check the path of pg_ctl and postmaster, you must have two versions on the machine somewhere. Dave [EMAIL PROTECTED] wrote: >Dear Folks, > >Could someone please tell me what I am doing wrong here? > >I made some minor changes to >main.c then >make then >make install > >when I want to start the postmaster I get > >** >The program "postmaster" was found by >"/home/gevik/postgres/build2/bin/pg_ctl" >but was not the same version as pg_ctl. >Check your installation. >* > >What am I doing wrong here? > >Regards, > >Gevik > > >---(end of broadcast)--- >TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > > > > -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT
Andrew McMillan wrote: -- Start of PGP signed section. > On Wed, 2004-11-10 at 11:45 -0500, Tom Lane wrote: > > Andrew McMillan <[EMAIL PROTECTED]> writes: > > > When tracking down gnarly problems in heavily multi-user applications > > > enabling higher log levels at selective points has the potential to help > > > _a lot_ with diagnostic detail, without smothering you in _every_ > > > detail. > > > > Sure. As I pointed out in the other thread, if you want to allow an app > > to do this, you can make available a SECURITY DEFINER function that > > performs the desired SET on its behalf. By setting execute permissions > > on the function and/or including restrictions in the function's code, > > you can make this as tight or as loose a loophole as you like. So it's > > certainly possible to do what you want in any case. I think the issue > > at hand is what's appropriate to provide as hard-wired functionality. > > That sounds excellent - I hadn't realised that this workaround would be > possible, and indeed with this in place that will provide even better > control over the facility. OK, here is one vote for the ALTER USER/remove USERLIMIT croud, and you were the person who originally mentioned the problem. You don't think the function creation is hard. Perhaps that's the way to go then. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] GUC custom variables broken
Thomas Hallgren wrote: > Tom Lane wrote: > > I broke it :-(. Fix committed. > > > Thanks, that was quick. You fixed it before I managed to find it. > > Not that I'm in an extreme hurry, just curious. Are we using different > CVS repositories with some latency in replication or something? I don't > seem to get your fix yet. Anonymous CVS has a lag. I think it is 1/2 hour or an hour. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] multiline CSV fields
On Nov 11, 2004, at 10:07 PM, Andrew Dunstan wrote: Patrick B Kelly wrote: My suggestion is to simply have CopyReadLine recognize these two states (in-field and out-of-field) and execute the current logic only while in the second state. It would not be too hard but as you mentioned it is non-trivial. We don't know what state we expect the end of line to be in until after we have actually read the line. To know how to treat the end of line on your scheme we would have to parse as we go rather than after reading the line as now. Changing this would be not only be non-trivial but significantly invasive to the code. Perhaps I am misunderstanding the code. As I read it the code currently goes through the input character by character looking for NL and EOF characters. It appears to be very well structured for what I am proposing. The section in question is a small and clearly defined loop which reads the input one character at a time and decides when it has reached the end of the line or file. Each call of CopyReadLine attempts to get one more line. I would propose that each time it starts out in the out-of-field state and the state is toggled by each un-escaped quote that it encounters in the stream. When in the in-field state, it would only look for the next un-escaped quote and while in the out-of-field state, it would execute the existing logic as well as looking for the next un-escaped quote. I may not be explaining myself well or I may fundamentally misunderstand how copy works. I would be happy to code the change and send it to you for review, if you would be interested in looking it over and it is felt to be a worthwhile capability. Patrick B. Kelly -- http://patrickbkelly.org ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] multiline CSV fields
Patrick B Kelly wrote: My suggestion is to simply have CopyReadLine recognize these two states (in-field and out-of-field) and execute the current logic only while in the second state. It would not be too hard but as you mentioned it is non-trivial. We don't know what state we expect the end of line to be in until after we have actually read the line. To know how to treat the end of line on your scheme we would have to parse as we go rather than after reading the line as now. Changing this would be not only be non-trivial but significantly invasive to the code. cheers andrew ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] multiline CSV fields
On Nov 11, 2004, at 6:16 PM, Tom Lane wrote: Patrick B Kelly <[EMAIL PROTECTED]> writes: What about just coding a FSM into backend/commands/copy.c:CopyReadLine() that does not process any flavor of NL characters when it is inside of a data field? CopyReadLine has no business tracking that. One reason why not is that it is dealing with data not yet converted out of the client's encoding, which makes matching to user-specified quote/escape characters difficult. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings I appreciate what you are saying about the encoding and you are, of course, right but CopyReadLine is already processing the NL characters and it is doing it without considering the context in which they appear. Unfortunately, the same character(s) are used for two different purposes in the files in question. Without considering whether they appear inside or outside of data fields, CopyReadline will mistake one for the other and cannot correctly do what it is already trying to do which is break the input file into lines. My suggestion is to simply have CopyReadLine recognize these two states (in-field and out-of-field) and execute the current logic only while in the second state. It would not be too hard but as you mentioned it is non-trivial. Patrick B. Kelly -- http://patrickbkelly.org ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > Tom Lane <[EMAIL PROTECTED]> writes: > >> Oh? How is a first() aggregate going to know what sort order you want > >> within the group? > > > It would look something like > > > select x,first(a),first(b) from (select x,a,b from table order by x,y) > > group by x > > > which is equivalent to > > > select DISTINCT ON (x) x,a,b from table ORDER BY x,y > > No, it is not. The GROUP BY has no commitment to preserve order --- > consider for example the possibility that we implement the GROUP BY by > hashing. It doesn't matter how the group by is implemented. The only thing that matters is what order the tuples are presented to the aggregate function. Even hashing presents them to the aggregate function as they're read in from the subquery. In fact iirc hashing was why the original user that asked for this behaviour found that it sometimes did work and wondered why it didn't work when the planner *didn't* use hashing. You added code to allow it to function properly for that case as well. > The fact that an optimization is present does not make it part of the > guaranteed semantics of the language. > > Basically, first() is a broken concept in SQL. Of course DISTINCT ON > is broken too for the same reasons, but I do not see that first() is > one whit less of a kluge than DISTINCT ON. first() is only a whit less of a kludge than DISTINCT ON in that it covers the more general case of wanting both first value for each group as well as other aggregates. Otherwise it's exactly equivalent. Depending on ordering was a broken concept in the abstract theoretical world of SQL as originally envisioned. But (unfortunately) we don't really live in that world any more. There are tons of extensions that Postgres and other databases provide that do depend on the ordering of record sets. And there are tons of examples of real business problems that require those extensions. It may be a broken concept but it's one that's extremely popular. Postgres implemented DISTINCT ON independently, but mysql also has an equivalent feature: you can include any column not included in the GROUP BY clause and mysql will implicitly do the equivalent of first(). Oracle on the other hand has a complete suite of functions for processing order-dependent data. They even implement lead()/lag() for allowing you to access the data from subsequent or previous records in the data set. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] GUC custom variables broken
Tom Lane wrote: I broke it :-(. Fix committed. Thanks, that was quick. You fixed it before I managed to find it. Not that I'm in an extreme hurry, just curious. Are we using different CVS repositories with some latency in replication or something? I don't seem to get your fix yet. Regards, Thomas Hallgren ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] newbie compile question. please help
Check the path of pg_ctl and postmaster, you must have two versions on the machine somewhere. Dave [EMAIL PROTECTED] wrote: Dear Folks, Could someone please tell me what I am doing wrong here? I made some minor changes to main.c then make then make install when I want to start the postmaster I get ** The program "postmaster" was found by "/home/gevik/postgres/build2/bin/pg_ctl" but was not the same version as pg_ctl. Check your installation. * What am I doing wrong here? Regards, Gevik ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] GUC custom variables broken
Thomas Hallgren wrote: I have the following in my postgresql.conf custom_variable_classes = 'pljava' pljava.classpath = 'C:\\Tada\\Workspace\\org.postgresql.pljava\\build\\pljava.jar' It worked fine with 8.0.0beta2. The beta4 however, gives me the following message: FATAL: unrecognized configuration parameter "pljava.classpath" What happened? > Ah, I see what's wrong. It's in the guc_file.l Previously, the "custom_variable_classes" was *set* first of all variables. Now, it's moved first in the linked list and then *tested* first of all. Subsequent tests of custom variables will of course then fail since the custom_variable_classes has not been set. Perhaps this change could be reverted? Or perhaps someone has other preferences on implementation? Regards, Thomas Hallgren ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] GUC custom variables broken
Thomas Hallgren <[EMAIL PROTECTED]> writes: > What happened? I broke it :-(. Fix committed. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] A modest proposal: get rid of GUC's USERLIMIT
On Wed, 2004-11-10 at 11:45 -0500, Tom Lane wrote: > Andrew McMillan <[EMAIL PROTECTED]> writes: > > When tracking down gnarly problems in heavily multi-user applications > > enabling higher log levels at selective points has the potential to help > > _a lot_ with diagnostic detail, without smothering you in _every_ > > detail. > > Sure. As I pointed out in the other thread, if you want to allow an app > to do this, you can make available a SECURITY DEFINER function that > performs the desired SET on its behalf. By setting execute permissions > on the function and/or including restrictions in the function's code, > you can make this as tight or as loose a loophole as you like. So it's > certainly possible to do what you want in any case. I think the issue > at hand is what's appropriate to provide as hard-wired functionality. That sounds excellent - I hadn't realised that this workaround would be possible, and indeed with this in place that will provide even better control over the facility. Regards, Andrew. - Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/PHYS: Level 2, 150-154 Willis St DDI: +64(4)803-2201 MOB: +64(272)DEBIAN OFFICE: +64(4)499-2267 The secret of being a bore is to say everything -- Voltaire - signature.asc Description: This is a digitally signed message part
Re: [HACKERS] multiline CSV fields
Patrick B Kelly <[EMAIL PROTECTED]> writes: > What about just coding a FSM into > backend/commands/copy.c:CopyReadLine() that does not process any flavor > of NL characters when it is inside of a data field? CopyReadLine has no business tracking that. One reason why not is that it is dealing with data not yet converted out of the client's encoding, which makes matching to user-specified quote/escape characters difficult. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] multiline CSV fields
Patrick B Kelly wrote: What about just coding a FSM into backend/commands/copy.c:CopyReadLine() that does not process any flavor of NL characters when it is inside of a data field? It would be a major change - the routine doesn't read data a field at a time, and has no idea if we are even in CSV mode at all. It would be rather late in the dev cycle to be making such changes, I suspect. cheers andrew ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] GUC custom variables broken
I have the following in my postgresql.conf custom_variable_classes = 'pljava' pljava.classpath = 'C:\\Tada\\Workspace\\org.postgresql.pljava\\build\\pljava.jar' It worked fine with 8.0.0beta2. The beta4 however, gives me the following message: FATAL: unrecognized configuration parameter "pljava.classpath" What happened? Regards, Thomas Hallgren ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] multiline CSV fields
On Nov 11, 2004, at 2:56 PM, Andrew Dunstan wrote: Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Patrick B Kelly wrote: Actually, when I try to export a sheet with multi-line cells from excel, it tells me that this feature is incompatible with the CSV format and will not include them in the CSV file. It probably depends on the version. I have just tested with Excel 2000 on a WinXP machine and it both read and wrote these files. I'd be inclined to define Excel 2000 as broken, honestly, if it's writing unescaped newlines as data. To support this would mean throwing away most of our ability to detect incorrectly formatted CSV files. A simple error like a missing close quote would look to the machine like the rest of the file is a single long data line where all the newlines are embedded in data fields. How likely is it that you'll get a useful error message out of that? Most likely the error message would point to the end of the file, or at least someplace well removed from the actual mistake. I would vote in favor of removing the current code that attempts to support unquoted newlines, and waiting to see if there are complaints. This feature was specifically requested when we discussed what sort of CSVs we would handle. And it does in fact work as long as the newline style is the same. I just had an idea. How about if we add a new CSV option MULTILINE. If absent, then on output we would not output unescaped LF/CR characters and on input we would not allow fields with embedded unescaped LF/CR characters. In both cases we could error out for now, with perhaps an 8.1 TODO to provide some other behaviour. Or we could drop the whole multiline "feature" for now and make the whole thing an 8.1 item, although it would be a bit of a pity when it does work in what will surely be the most common case. What about just coding a FSM into backend/commands/copy.c:CopyReadLine() that does not process any flavor of NL characters when it is inside of a data field? Patrick B. Kelly -- http://patrickbkelly.org ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> Oh? How is a first() aggregate going to know what sort order you want >> within the group? > It would look something like > select x,first(a),first(b) from (select x,a,b from table order by x,y) group > by x > which is equivalent to > select DISTINCT ON (x) x,a,b from table ORDER BY x,y No, it is not. The GROUP BY has no commitment to preserve order --- consider for example the possibility that we implement the GROUP BY by hashing. > The group by can see that the subquery is already sorted by x and > doesn't need to be resorted. In fact I believe you added the smarts to > detect that condition in response to a user asking about precisely > this type of scenario. The fact that an optimization is present does not make it part of the guaranteed semantics of the language. Basically, first() is a broken concept in SQL. Of course DISTINCT ON is broken too for the same reasons, but I do not see that first() is one whit less of a kluge than DISTINCT ON. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > It would also make it possible to deprecate DISTINCT ON in favour of GROUP > > BY > > with first() calls. > > Oh? How is a first() aggregate going to know what sort order you want > within the group? AFAICS first() is only useful when you honestly do > not care which group member you get ... which is certainly not the case > for applications of DISTINCT ON. It would look something like select x,first(a),first(b) from (select x,a,b from table order by x,y) group by x which is equivalent to select DISTINCT ON (x) x,a,b from table ORDER BY x,y The group by can see that the subquery is already sorted by x and doesn't need to be resorted. In fact I believe you added the smarts to detect that condition in response to a user asking about precisely this type of scenario. This is actually more general than DISTINCT ON since DISTINCT ON is basically a degenerate case of the above where the _only_ aggregate allowed is first(). The more general case could have first() as well as other aggregates, though obviously they would make it unlikely that any optimizations would be applicable. I do kind of like the DISTINCT ON syntax, but the inability to use any other aggregate functions makes me often have to convert queries I originally wrote to use it to use the more general GROUP BY and first() instead. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] multiline CSV fields
On Thu, Nov 11, 2004 at 03:38:16PM -0500, Greg Stark wrote: > > Tom Lane <[EMAIL PROTECTED]> writes: > > > I would vote in favor of removing the current code that attempts > > to support unquoted newlines, and waiting to see if there are > > complaints. > > Uhm. *raises hand* > > I agree with your argument but one way or another I have to load > these CSVs I'm given. And like it or not virtually all the CSVs > people get are going to be coming from Excel. So far with 7.4 I've > just opened them up in Emacs and removed the newlines, but it's a > royal pain in the arse. Meanwhile, check out dbi-link. It lets you query against DBI data sources including DBD::Excel :) http://pgfoundry.org/projects/dbi-link/ Bug reports welcome. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] ltree PostgreSQL Module
> I miss about UTF-8 :) ltree doesn't supports UTF-8 yet. ok,. how about all the 'other' characters from us-ascii : ,[EMAIL PROTECTED]&*()_+-=[]{}\|'"?><`~ these 'should' all be valid for the ltxtquery, ltree, and ltree[] types, except maybe for . which is used as seperator (and maybe . should be valid too, if prepended with a '\', just as you would with a regex to make the next character a literal). ... John ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] multiline CSV fields
Tom Lane <[EMAIL PROTECTED]> writes: > I would vote in favor of removing the current code that attempts to > support unquoted newlines, and waiting to see if there are complaints. Uhm. *raises hand* I agree with your argument but one way or another I have to load these CSVs I'm given. And like it or not virtually all the CSVs people get are going to be coming from Excel. So far with 7.4 I've just opened them up in Emacs and removed the newlines, but it's a royal pain in the arse. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] multiline CSV fields
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Patrick B Kelly wrote: Actually, when I try to export a sheet with multi-line cells from excel, it tells me that this feature is incompatible with the CSV format and will not include them in the CSV file. It probably depends on the version. I have just tested with Excel 2000 on a WinXP machine and it both read and wrote these files. I'd be inclined to define Excel 2000 as broken, honestly, if it's writing unescaped newlines as data. To support this would mean throwing away most of our ability to detect incorrectly formatted CSV files. A simple error like a missing close quote would look to the machine like the rest of the file is a single long data line where all the newlines are embedded in data fields. How likely is it that you'll get a useful error message out of that? Most likely the error message would point to the end of the file, or at least someplace well removed from the actual mistake. I would vote in favor of removing the current code that attempts to support unquoted newlines, and waiting to see if there are complaints. This feature was specifically requested when we discussed what sort of CSVs we would handle. And it does in fact work as long as the newline style is the same. I just had an idea. How about if we add a new CSV option MULTILINE. If absent, then on output we would not output unescaped LF/CR characters and on input we would not allow fields with embedded unescaped LF/CR characters. In both cases we could error out for now, with perhaps an 8.1 TODO to provide some other behaviour. Or we could drop the whole multiline "feature" for now and make the whole thing an 8.1 item, although it would be a bit of a pity when it does work in what will surely be the most common case. cheers andrew ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Re: knowing internal representation, I think this is required anyway; > else the optimization would only work on a very limited numer of > situations. The point of my remark is that pushing this knowledge out to a function is helpful only if you can put that function somehow at arm's length from the main body of the optimizer. Otherwise structuring it like that is useless obscurantism. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Greg Stark <[EMAIL PROTECTED]> writes: > It would also make it possible to deprecate DISTINCT ON in favour of GROUP BY > with first() calls. Oh? How is a first() aggregate going to know what sort order you want within the group? AFAICS first() is only useful when you honestly do not care which group member you get ... which is certainly not the case for applications of DISTINCT ON. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] multiline CSV fields
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Patrick B Kelly wrote: >> Actually, when I try to export a sheet with multi-line cells from >> excel, it tells me that this feature is incompatible with the CSV >> format and will not include them in the CSV file. > It probably depends on the version. I have just tested with Excel 2000 > on a WinXP machine and it both read and wrote these files. I'd be inclined to define Excel 2000 as broken, honestly, if it's writing unescaped newlines as data. To support this would mean throwing away most of our ability to detect incorrectly formatted CSV files. A simple error like a missing close quote would look to the machine like the rest of the file is a single long data line where all the newlines are embedded in data fields. How likely is it that you'll get a useful error message out of that? Most likely the error message would point to the end of the file, or at least someplace well removed from the actual mistake. I would vote in favor of removing the current code that attempts to support unquoted newlines, and waiting to see if there are complaints. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] newbie compile question. please help
[EMAIL PROTECTED] writes: > I made some minor changes to > main.c then > make then > make install > when I want to start the postmaster I get > The program "postmaster" was found by > "/home/gevik/postgres/build2/bin/pg_ctl" > but was not the same version as pg_ctl. Maybe you did something that broke recognition of the "-V" command-line option? What happens when you do "postmaster -V"? regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query
On Thu, Nov 11, 2004 at 09:29:14 +, Simon Riggs <[EMAIL PROTECTED]> wrote: > On Wed, 2004-11-10 at 22:48, Mark Kirkwood wrote: > > Planning for future note: I would like whatever mechanism that is added > > for this MAX/MIN stuff to be amenable to more subtle things like > > aggregate navigation (see R.Kimball's article > > http://www.dbmsmag.com/9608d54.html). > > > > With you on that one... I disaggree. What that author is suggesting is orthogonal to what is being proposed by Mark. That article is really suggesting a varient of materialized views where you can use the normal aggregate notation instead of having to use a special column name to get access to an aggregate in the materialized view. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
On Thu, Nov 11, 2004 at 10:24:34 -0500, Tom Lane <[EMAIL PROTECTED]> wrote: > > We know how to determine that an index matches an ORDER BY clause. > But what has an aggregate called MAX() got to do with ORDER BY? Magic > assumptions about operators named "<" are not acceptable answers; there > has to be a traceable connection in the catalogs. > > As a real-world example of why I won't hold still for hard-wiring this: > a complex-number data type might have btree opclasses allowing it to be > sorted either by real part or by absolute value. One might then define > max_real() and max_abs() aggregates on the type. It should be possible > to optimize such aggregates the same way as any other max() aggregate. Wouldn't knowing an opclass and direction associated with an aggregrate function allow you to do this? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] newbie compile question. please help
Dear Folks, Could someone please tell me what I am doing wrong here? I made some minor changes to main.c then make then make install when I want to start the postmaster I get ** The program "postmaster" was found by "/home/gevik/postgres/build2/bin/pg_ctl" but was not the same version as pg_ctl. Check your installation. * What am I doing wrong here? Regards, Gevik ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Tom Lane <[EMAIL PROTECTED]> writes: > As a real-world example of why I won't hold still for hard-wiring this: > a complex-number data type might have btree opclasses allowing it to be > sorted either by real part or by absolute value. One might then define > max_real() and max_abs() aggregates on the type. It should be possible > to optimize such aggregates the same way as any other max() aggregate. So if the max_real() aggregate had a field that indicated that max_real(x) could be satisfied with only the first record from the dataset as long as it's sorted by "real(x)" that would be enough information. The optimizer would still have a lot of work to combine this information for all aggregates used and check the costs for providing sorted result sets to the scan. There would also need new scans that could handle reading just one record and then skipping to the next group. It's a lot of work but it would make a lot of aggregates a lot more useful. It would also make it possible to deprecate DISTINCT ON in favour of GROUP BY with first() calls. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
Bruno Wolff III <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> wrote: >> We know how to determine that an index matches an ORDER BY clause. >> But what has an aggregate called MAX() got to do with ORDER BY? > Wouldn't knowing an opclass and direction associated with an aggregrate > function allow you to do this? That's one way you could do it. Another possibly cleaner way is to just supply a sort operator (with the implication "the desired value is the one ordered first by this operator") and then let the optimizer work out which opclass(es) are relevant. This would win if the same operator appears in different opclasses, which is uncommon at the moment but would not be so if we start offering "reverse sort" opclasses. I think we covered all this ground before, though --- have you checked the archives from the last time this was discussed? regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query
Simon Riggs <[EMAIL PROTECTED]> writes: > On Wed, 2004-11-10 at 22:48, Mark Kirkwood wrote: > > Planning for future note: I would like whatever mechanism that is added > > for this MAX/MIN stuff to be amenable to more subtle things like > > aggregate navigation (see R.Kimball's article > > http://www.dbmsmag.com/9608d54.html). > > > > With you on that one... This looks like more of a materialized views application than anything to do with planning aggregate functions. Materialized views are most useful when the materialized view is smaller than the original data and therefore usually involve aggregates, but it's not necessary. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
On Thu, Nov 11, 2004 at 17:52:19 +1100, John Hansen <[EMAIL PROTECTED]> wrote: > Why not just change the function all together to 'select $1 from $2 > order by $1 desc limit 1;' > > Is there ANY situation where max(col) as it is, would be faster? Yes. A couple I can think of are: When count(col) is also being used. When a GROUP BY is being used and there isn't an index that can both be used to do the grouping and col order within each group. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
> How are you planning to represent the association between MIN/MAX and > particular index orderings in the system catalogs? Don't we already have that info to decide whether an index handles an "ORDER BY" without a sort node ? Andreas ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] [HACKERS] plperl Safe restrictions
Bruce Momjian wrote: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Andrew Dunstan wrote: ... The patch also does some other inconsequential tidying of overlong lines, and removes some unnecessary ops in the unsafe case. These are basically cosmetic - the only significant part is replacing this: $PLContainer->permit(':base_math'); with this: $PLContainer->permit(qw[:base_math !:base_io !srand sort sprintf time]); As per previous discussions, please remove "!srand" and "sprintf" if/when applying. cheers andrew ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Increasing the length of
On Thu, Nov 11, 2004 at 10:20:43AM -0500, Robert Treat wrote: > > I believe the geeky non-helpful answer is to attach to the process with > gdb and do p debug_query_string which I believe will show you said long > running query. Yes, this will work, & I've used it. But of course, you don't actually _know_ it's a long-running query. All you know is that the back end which has that pid has a query now. So you have to connect several times and look a thte query to see this. This strategy is also what pgmonitor and friends do. > Of course that idea lead me to wondering why we couldn't have a function > that could look at a connection (well, either by way of pid or possibly > transaction id) and show the current query being executed. I think that's what's being requested. I've no clue how to do it, though. A -- Andrew Sullivan | [EMAIL PROTECTED] The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Increasing the length of
On Wed, 2004-11-10 at 17:57, Andrew Sullivan wrote: > On Wed, Nov 10, 2004 at 05:51:01PM -0500, Andrew Sullivan wrote: > > log_statement_after_min_duration (integer) -- which did what Simon > > wants. > > Uh, well, not what Simon wants, of course, but which gave us a useful > capability anyway. I agree that the full-bore profiling for the DBA > would be awful nice. But in its absence, if you could see your > long-running query in the log after a minute, and then go do an > EXPLAIN and realise "uh-oh, that's gonna take 3 days to complete" and > kill it, it would be a big help. > I believe the geeky non-helpful answer is to attach to the process with gdb and do p debug_query_string which I believe will show you said long running query. (Someone who actually hacks C can correct me on that, but I believe I've done it that way before). Of course that idea lead me to wondering why we couldn't have a function that could look at a connection (well, either by way of pid or possibly transaction id) and show the current query being executed. Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] plperl Safe restrictions
> >Andrew Dunstan wrote: > > > > > >>... > >> > >>The patch also does some other inconsequential tidying of overlong > >>lines, and removes some unnecessary ops in the unsafe case. These are > >>basically cosmetic - the only significant part is replacing this: > >> > >>$PLContainer->permit(':base_math'); > >> > >>with this: > >> > >>$PLContainer->permit(qw[:base_math !:base_io !srand sort sprintf time]); > >> > >> > >> > > > > As per previous discussions, please remove "!srand" and "sprintf" > if/when applying. OK. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vacuum info clarification
While we are at it An explanation of what DETAIL: 0 dead row versions cannot be removed yet There were 1 unused item pointers. mean would be useful. Dave Bruce Momjian wrote: The last two lines of VACUUM VERBOSE are: INFO: free space map: 49 relations, 32 pages stored; 784 total pages needed DETAIL: Allocated FSM size: 1000 relations + 2 pages = 182 kB shared memory. I think "total pages needed" should be "total pages used" or "total pages requested". I am confused about the difference between "stored" and "requested". I read the comments in freespace.c but am still confused. -- Dave Cramer http://www.postgresintl.com 519 939 0336 ICQ#14675561 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] ltree PostgreSQL Module
John, I miss about UTF-8 :) ltree doesn't supports UTF-8 yet. Oleg On Thu, 11 Nov 2004, John Hansen wrote: Oleg, For example: The DMOZ topic: Top/World/Espa??ol/Pa??ses/M?©xico/Referencia/Bibliotecas/Nacionales select text2ltree(replace('Top/World/Espa??ol/Pa??ses/M?©xico/Referencia/Bibliotecas/Nacionales','/','.')); ERROR: syntax error at position 14 near "?" I've also found that topics contain , as in the DMOZ Topic: Top/Arts/Music/Composition/Composers/F/Fasch,_Johann_Friedrich Kind regards John -Original Message- From: Oleg Bartunov [mailto:[EMAIL PROTECTED] Sent: Thursday, November 11, 2004 8:48 PM To: John Hansen Cc: Teodor Sigaev; Pgsql Hackers Subject: Re: ltree PostgreSQL Module John, On Thu, 11 Nov 2004, John Hansen wrote: Hello folks, It seems that the ltree module does not support the data for which is was (aparantly) created. The DMOZ data is encoded in UTF8, but ISALNUM() does not support unicode characters... In fact, it does not support any database encoding. Is there any plans to change this? it's very difficult to see your problem without any examples ! Many people are lucky users of ltree module, so you probably have some other problem. Any test demonstrating your probelm would be very useful. Also, posting to mailing lists would help other people help you. Kind Regards, John Hansen Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] multiline CSV fields
Patrick B Kelly wrote: On Nov 10, 2004, at 6:10 PM, Andrew Dunstan wrote: The last really isn't an option, because the whole point of CSVs is to play with other programs, and my understanding is that those that understand multiline fields (e.g. Excel) expect them not to be escaped, and do not produce them escaped. Actually, when I try to export a sheet with multi-line cells from excel, it tells me that this feature is incompatible with the CSV format and will not include them in the CSV file. It probably depends on the version. I have just tested with Excel 2000 on a WinXP machine and it both read and wrote these files. cheers andrew ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites
On Thu, Nov 11, 2004 at 01:18:05 -0600, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: > Certainly handling only one case is better than none. I just wanted to > bring up the multiple aggregate scenario. Also, consider that > > SELECT min(a), max(a), min(b), max(c) FROM table > > could be optimized as well (into 4 index scans, assuming a, b, and c all > had indexes). > > I don't think any other aggregates are candidates for optimization right > now, though I guess I could be wrong. Remember that max and min are a number of aggregates, as each datatype which have max and min functions have different ones from those used by other datatypes. I think someone added boolean aggregates for and and or in version 8. If so, those can also use indexes in the same way. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
On Thu, Nov 11, 2004 at 01:08:39AM -0500, Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > What about having a new column in pg_aggregate which would point to a > > function that would try to optimize the aggregate's handling? > > I can't get very excited about this, because how would you make a > reasonably stable/narrow API for such a thing? The function as you > propose it would have to know everything about not only the planner's > data representations but the N specific places it would be called from. No, the function would discard all calls except the one it knows how to optimize. The point in having multiple call places is that some aggregates will likely be optimized in some place, and others somewhere else. Most likely, a first patch would include only the call site that would help in optimizing min() and max(). Of course, an aggregate could specify no optimizing function, which would be the current situation, where no aggregate knows how to optimize itself. Re: knowing internal representation, I think this is required anyway; else the optimization would only work on a very limited numer of situations. -- Alvaro Herrera () "A wizard is never late, Frodo Baggins, nor is he early. He arrives precisely when he means to." (Gandalf, en LoTR FoTR) ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] newbie compile question. please help
Dear Folks, Could someone please tell what I am doing wrong? I made some minor changes to main.c then make then make install when I want to start the postmaster I get The program "postmaster" was found by "/home/gevik/postgres/build2/bin/pg_ctl" but was not the same version as pg_ctl. Check your installation. What am I doing wrong here? Regards, Gevik
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query
On Wed, 2004-11-10 at 22:48, Mark Kirkwood wrote: > Planning for future note: I would like whatever mechanism that is added > for this MAX/MIN stuff to be amenable to more subtle things like > aggregate navigation (see R.Kimball's article > http://www.dbmsmag.com/9608d54.html). > With you on that one... -- Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ltree PostgreSQL Module
John, On Thu, 11 Nov 2004, John Hansen wrote: Hello folks, It seems that the ltree module does not support the data for which is was (aparantly) created. The DMOZ data is encoded in UTF8, but ISALNUM() does not support unicode characters... In fact, it does not support any database encoding. Is there any plans to change this? it's very difficult to see your problem without any examples ! Many people are lucky users of ltree module, so you probably have some other problem. Any test demonstrating your probelm would be very useful. Also, posting to mailing lists would help other people help you. Kind Regards, John Hansen Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites generally)
"Zeugswetter Andreas DAZ SD" <[EMAIL PROTECTED]> writes: >> How are you planning to represent the association between MIN/MAX and >> particular index orderings in the system catalogs? > Don't we already have that info to decide whether an index handles > an "ORDER BY" without a sort node ? We know how to determine that an index matches an ORDER BY clause. But what has an aggregate called MAX() got to do with ORDER BY? Magic assumptions about operators named "<" are not acceptable answers; there has to be a traceable connection in the catalogs. As a real-world example of why I won't hold still for hard-wiring this: a complex-number data type might have btree opclasses allowing it to be sorted either by real part or by absolute value. One might then define max_real() and max_abs() aggregates on the type. It should be possible to optimize such aggregates the same way as any other max() aggregate. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] plperl Safe restrictions
Andrew, can you or someone summarize were we left this issue and your patch? --- Andrew Dunstan wrote: > > > Tom Lane wrote: > > >Andrew Dunstan <[EMAIL PROTECTED]> writes: > > > > > >>The question in my mind is "What are we protecting against?" ISTM it is > >>the use of the pl as a vector to attack the machine and postgres. Does a > >>segfault come into that category? After all, isn't it one of postgres's > >>strengths that we can survive individual backends crashing? > >> > >> > > > >Yeah, but a repeatable segfault certainly is an adequate tool for a > >denial-of-service attack, since it takes out everyone else's sessions > >along with your own. A possibly larger objection is how sure can you be > >that the effects will *only* be a segfault, and not say the ability to > >execute some user-injected machine code. > > > > > > Ok, the release notes for perl 5.005 (which is now pretty ancient) say this: > > "Perl now contains its own highly optimized qsort() routine. The new > qsort() is resistant to inconsistent comparison functions, so Perl's > |sort()| will not provoke coredumps any more when given poorly written > sort subroutines." > > Also, there were some apparent problems with sort routine reentrancy in > perl < 5.6.1 - see > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=60534. > > I have not found any more recent refs on Google to "sort" causing problems. > > Certainly in my testing it proved totally trivial to crash the backend > with sprintf. > > So I suggest a reasonable position w.r.t. the danger of SEGVs would be > to allow "sort" but disallow sprintf. > > > cheers > > andrew > > > > ---(end of broadcast)--- > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] > -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] ltree PostgreSQL Module
Hmm, I don't see the error comes from ltree module, not from 'replace' function. Also, are you sure your postgresql setup is ok (locale issue). On Thu, 11 Nov 2004, John Hansen wrote: Oleg, For example: The DMOZ topic: Top/World/Espa??ol/Pa??ses/M?©xico/Referencia/Bibliotecas/Nacionales select text2ltree(replace('Top/World/Espa??ol/Pa??ses/M?©xico/Referencia/Bibliotecas/Nacionales','/','.')); ERROR: syntax error at position 14 near "?" I've also found that topics contain , as in the DMOZ Topic: Top/Arts/Music/Composition/Composers/F/Fasch,_Johann_Friedrich Kind regards John -Original Message- From: Oleg Bartunov [mailto:[EMAIL PROTECTED] Sent: Thursday, November 11, 2004 8:48 PM To: John Hansen Cc: Teodor Sigaev; Pgsql Hackers Subject: Re: ltree PostgreSQL Module John, On Thu, 11 Nov 2004, John Hansen wrote: Hello folks, It seems that the ltree module does not support the data for which is was (aparantly) created. The DMOZ data is encoded in UTF8, but ISALNUM() does not support unicode characters... In fact, it does not support any database encoding. Is there any plans to change this? it's very difficult to see your problem without any examples ! Many people are lucky users of ltree module, so you probably have some other problem. Any test demonstrating your probelm would be very useful. Also, posting to mailing lists would help other people help you. Kind Regards, John Hansen Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] plperl Safe restrictions
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Andrew Dunstan wrote: > > > David Helgason wrote: > > > On 14. okt 2004, at 21:09, Andrew Dunstan wrote: > > > >> It has just been brought to my attention that we are being very > >> restrictive about what we allow to be done in trusted plperl. > >> Basically we allow the :default and :base_math set of operations (run > >> perldoc Opcode or see > >> http://www.perldoc.com/perl5.8.0/lib/Opcode.html for details of what > >> these mean). In particular, we do not allow calls to perl's builtin > >> sort, which is unpleasant, and on reviewing the list it seems to me > >> we could quite reasonably allow access to pack and unpack also. bless > >> and sprintf are also likely candidates for inclusion - I have not > >> finished reviewing the list, and would welcome advice from perl gurus > >> on this. > > > > > > pack and unpack are unfortunately not safe. Very useful, but they > > allow write/read access to random memory. It's really a shame perl > > doesn't have a pragma to make them safe or have safe versions of them. > > > > Bless should be OK, unless sensitive objects are provided to the > > procedures. > > > > A postgres question I don't know the answer to is whether allowing the > > user to trigger a segfault is a security problem. If it isn't, several > > opcodes may probably be allowed, including sort and sprintf. If it is, > > well, you need only follow the perl5-porters list to know that there's > > banal perl structures are continuously being found that will segfault > > perl, some at compile time, other at runtime. > > > OK, based on this and some further thought, I have prepared the attached > patch which does the right thing, I think, both in terms of what we > allow and what we don't. > > First, we tighten security by disallowing access to srand and IO > functions on existing filehandles (other IO ops are already disallowed). > > The we relax the restrictions by allowing access to perl's sort, sprintf > and time routines. I decided against pack/unpack based on the above, and > also decided that I couldn't think of any case where bless would have > any practical use - although that might change later. I'm trying to keep > changes minimal here. I don't believe that "time" carries any > significant security implications, and I think the dangers from "sort" > and "sprintf" are not so great as to disallow them. They might cause a > SEGV in a pathological case, but that doesn't give the user access to > the machine - if they can login to postgres they can probably mount any > number of DOS attacks anyway. > > To answer David's question, the man says this about trusted functions: > "the TRUSTED flag should only be given for languages that do not allow > access to database server internals or the file system". I think the > changes I propose fit in with that statement. > > The patch also does some other inconsequential tidying of overlong > lines, and removes some unnecessary ops in the unsafe case. These are > basically cosmetic - the only significant part is replacing this: > > $PLContainer->permit(':base_math'); > > with this: > > $PLContainer->permit(qw[:base_math !:base_io !srand sort sprintf time]); > > I have tested and it appears to do the right thing, both for the things > excluded and those included. > > cheers > > andrew > > > > > ---(end of broadcast)--- > TIP 5: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faqs/FAQ.html -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] ltree PostgreSQL Module
Oleg, For example: The DMOZ topic: Top/World/Español/PaÃses/México/Referencia/Bibliotecas/Nacionales select text2ltree(replace('Top/World/Español/PaÃses/México/Referencia/Bibliotecas/Nacionales','/','.')); ERROR: syntax error at position 14 near "Ã" I've also found that topics contain , as in the DMOZ Topic: Top/Arts/Music/Composition/Composers/F/Fasch,_Johann_Friedrich Kind regards John > -Original Message- > From: Oleg Bartunov [mailto:[EMAIL PROTECTED] > Sent: Thursday, November 11, 2004 8:48 PM > To: John Hansen > Cc: Teodor Sigaev; Pgsql Hackers > Subject: Re: ltree PostgreSQL Module > > John, > > On Thu, 11 Nov 2004, John Hansen wrote: > > > Hello folks, > > > > It seems that the ltree module does not support the data > for which is > > was (aparantly) created. > > > > The DMOZ data is encoded in UTF8, but ISALNUM() does not support > > unicode characters... > > In fact, it does not support any database encoding. > > > > Is there any plans to change this? > > it's very difficult to see your problem without any examples ! > Many people are lucky users of ltree module, so you probably > have some other problem. Any test demonstrating your probelm > would be very useful. > Also, posting to mailing lists would help other people help you. > > > > > Kind Regards, > > > > John Hansen > > > > Regards, > Oleg > _ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > > ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] multiline CSV fields
On Nov 10, 2004, at 6:10 PM, Andrew Dunstan wrote: The last really isn't an option, because the whole point of CSVs is to play with other programs, and my understanding is that those that understand multiline fields (e.g. Excel) expect them not to be escaped, and do not produce them escaped. Actually, when I try to export a sheet with multi-line cells from excel, it tells me that this feature is incompatible with the CSV format and will not include them in the CSV file. Patrick B. Kelly -- http://patrickbkelly.org ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] MAX/MIN optimization via rewrite (plus query rewrites
Bruno Wolff III <[EMAIL PROTECTED]> writes: > I don't think you should be rewriting queries as much as providing > alternate plans and letting the rest of the optimizer decided which > plan to use. If you just rewrite a query you might lock yourself into > using a poor plan. Moreover, none of these rewritten queries would work properly for select min(foo) from tab group by bar This should still be aware it can use an index on and get much better performance. Well it can't know, but it should be able to. All it should really need to know is that min() only needs a particular subset of the dataset -- namely the first record, as long as the records are provided in a particular order. Also, the same code ought to be able to handle select first(foo) from tab group by bar Which is exactly equivalent to the min() case except that no particular ordering is required. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings