Re: ICU for global collation

2022-01-11 Thread Daniel Verite
Julien Rouhaud wrote: > > I guess there's still the possibility of requiring that the ICU db-wide > > collation of the new database does exist in the template database, > > and then the CREATE DATABASE would refer to that collation instead of > > an independent locale string. > > That

Re: ICU for global collation

2022-01-10 Thread Daniel Verite
Julien Rouhaud wrote: > > The lack of these fields overall suggests the idea that when CREATE > > DATABASE is called with a global ICU collation, what if it somehow > > inserted the collation into pg_collation in the new database? > > Then pg_database would just store the collation oid and

Re: ICU for global collation

2022-01-10 Thread Daniel Verite
Peter Eisentraut wrote: > Unlike in the previous patch, where the ICU > collation name was written in datcollate, there is now a third column > (daticucoll), so we can store all three values. I think some users would want their db-wide ICU collation to be case/accent-insensitive.

Re: ICU for global collation

2022-01-07 Thread Daniel Verite
Julien Rouhaud wrote: > If you want a database with an ICU default collation the lc_collate > and lc_ctype should inherit what's in the template database and not > what was provided in the LOCALE I think. You could still probably > overload them in some scenario, but without a list of

Re: Are datcollate/datctype always libc even under --with-icu ?

2021-12-23 Thread Daniel Verite
Chapman Flack wrote: > Next question: the "currently" in that comment suggests that could change, > but is there any present intention to change it, or is this likely to just > be the way it is for the foreseeable future? Some related patches and discussions: * ICU as default collation

Re: very long record lines in expanded psql output

2021-08-05 Thread Daniel Verite
Platon Pronko wrote: > Maybe we can avoid making the header line longer than terminal width > for \pset border 0 and \pset border 1? We already have terminal > width calculated. Please see attached a patch with the proposed > implementation. +1 for doing something against these long

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2021-07-29 Thread Daniel Verite
Hi, Trying the v7a patch, here are a few comments: * SIGSEGV with ON HOLD cursors. Reproducer: declare c cursor with hold for select oid,relname from pg_class order by 1 limit 10; select * from rows_in('c') as x(f1 oid,f2 name); consumes a bit of time, then crashes and generates a 13 GB

Re: COPY table_name (single_column) FROM 'unknown.txt' DELIMITER E'\n'

2021-05-09 Thread Daniel Verite
Darafei "Komяpa" Praliaskouski wrote: > What I would prefer is some new COPY mode like RAW that will just push > whatever it gets on the stdin/input into the cell on the server side. This > way it can be proxied by psql, utilize existing infra for passing streams > and be used in shell

Re: insensitive collations

2021-04-03 Thread Daniel Verite
Jim Finnerty wrote: > SET client_encoding = WIN1252; > [...] > postgres=# SELECT * FROM locations WHERE location LIKE 'Franche-Comt__'; -- > the wildcard is applied byte by byte instead of character by character, so > the 2-byte accented character is matched only by 2 '_'s >location

Re: Calendar support in localization

2021-03-30 Thread Daniel Verite
Surafel Temesgen wrote: > > About intervals, if there were locale-aware functions like > > add_interval(timestamptz, interval [, locale]) returns timestamptz > > or > > sub_timestamp(timestamptz, timestamptz [,locale]) returns interval > > that would use ICU to compute the results

Re: Calendar support in localization

2021-03-29 Thread Daniel Verite
Matthias van de Meent wrote: > The results for the Japanese locale should be "0003 Reiwa" instead of > "0033 Heisei", as the era changed in 2019. ICU releases have since > implemented this and other corrections; this specific change was > implemented in the batched release of ICU versions

Re: Calendar support in localization

2021-03-26 Thread Daniel Verite
Thomas Munro wrote: > Right, so if this is done by trying to extend Daniel Verite's icu_ext > extension (link given earlier) and Robert's idea of a fast-castable > type, I suppose you might want now()::icu_date + '1 month'::internal > to advance you by one Ethiopic month if you have done

Re: insensitive collations

2021-03-25 Thread Daniel Verite
Jim Finnerty wrote: > Currently nondeterministic collations are disabled at the database level. Deterministic ICU collations are also disabled. > The cited reason was because of the lack of LIKE support and because certain > catalog views use LIKE. But the catalogs shouldn't use the

Re: insensitive collations

2021-03-25 Thread Daniel Verite
Jim Finnerty wrote: > For a UTF8 encoded, case-insensitive (nondeterministic), accent-sensitive > ICU > collation, a LIKE predicate can be used with a small transformation of the > predicate, and the pattern can contain multi-byte characters: > > from: > > SELECT * FROM locations WHERE

Re: pgsql: Add libpq pipeline mode support to pgbench

2021-03-17 Thread Daniel Verite
Fabien COELHO wrote: > For consistency with the existing \if … \endif, ISTM that it could have > been named \batch … \endbatch or \pipeline … \endpipeline? "start" mirrors "end". To me, the analogy with \if-\endif is not obvious. Grammatically \if is meant to introduce the expression

Re: popcount

2021-01-18 Thread Daniel Verite
Peter Eisentraut wrote: > + /* > +* ceil(VARBITLEN(ARG1)/BITS_PER_BYTE) > +* done to minimize branches and instructions. > +*/ > > I don't know what that is supposed to mean or why that kind of tuning > would be necessary for a user-callable function. Also, the formula

Re: popcount

2020-12-30 Thread Daniel Verite
David Fetter wrote: +Datum +byteapopcount(PG_FUNCTION_ARGS) +{ + bytea *t1 = PG_GETARG_BYTEA_PP(0); + int len, result; + + len = VARSIZE_ANY_EXHDR(t1); + result = pg_popcount(VARDATA_ANY(t1), len); + + PG_RETURN_INT32(result); +} The input may
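
To illustrate what the quoted byteapopcount() computes, here is a minimal Python sketch, assuming only that the result is the number of 1 bits over all bytes of the input. The helper name popcount_bytes is hypothetical, and this is not the patch's C implementation.

```python
def popcount_bytes(data: bytes) -> int:
    """Count the bits set to 1 in a byte string (hypothetical helper)."""
    # Iterating over bytes yields integers in 0..255; bin() renders each
    # one in binary, so counting "1" characters gives its set bits.
    return sum(bin(byte).count("1") for byte in data)


if __name__ == "__main__":
    print(popcount_bytes(b"\x01\x03"))  # one bit in 0x01 plus two bits in 0x03
```

Python integers do not overflow, so the sketch says nothing about which SQL return type fits the result.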

Re: PATCH: Batch/pipelining support for libpq

2020-11-23 Thread Daniel Verite
Hi, Here's a new version with the pgbench support included. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 9ce32fb39b..2a94f8f6b9 100644 ---

Re: PATCH: Batch/pipelining support for libpq

2020-11-14 Thread Daniel Verite
Alvaro Herrera wrote: > I adapted the test code to our code style. I also removed the "timings" > stuff; I think that's something better left to pgbench. > > (I haven't looked at Daniel's pgbench stuff yet, but I will do that > next.) The patch I posted in [1] was pretty simple, but at

Re: Add header support to text format and matching feature

2020-10-21 Thread Daniel Verite
Rémi Lapeyre wrote: > It looks like this is not in the current commitfest and that cfbot does not > find it. I’m not yet accustomed to the PostgreSQL workflow, should I just > create a new entry in the current commitfest? Yes. Because in the last CommitFest it was marked as "Returned

Re: speed up unicode decomposition and recomposition

2020-10-16 Thread Daniel Verite
John Naylor wrote: > I'd be curious how it compares to ICU now I've made another run of the test in [1] with your v2 patches from this thread against icu_ext built with ICU-67.1. The results show the times in milliseconds to process about 10 million short strings: operation |

Re: Rejecting redundant options in Create Collation

2020-10-02 Thread Daniel Verite
Michael Paquier wrote: > > Hmm ... I think that that is pretty standard behavior for a lot of > > our utility commands. Trying something at random, > > The behavior handling is a bit inconsistent. For example EXPLAIN and > VACUUM don't do that, because their parenthesized grammar got >

Rejecting redundant options in Create Collation

2020-10-01 Thread Daniel Verite
Hi, Currently, it's not an error for CREATE COLLATION to be invoked with options repeated several times. The last (rightmost) value is kept and the others are lost. For instance CREATE COLLATION x (lc_ctype='en_US.UTF8', lc_collate='en_US.UTF8', lc_ctype='C') silently ignores
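
A minimal Python sketch of the two behaviors, assuming the options arrive as (name, value) pairs. The function names and the error wording are hypothetical and are not taken from the PostgreSQL source.

```python
def last_value_wins(pairs):
    """Mimic the behavior described above: a repeated option silently
    overrides the earlier values (hypothetical helper)."""
    options = {}
    for name, value in pairs:
        options[name.lower()] = value
    return options


def reject_redundant(pairs):
    """Alternative behavior: raise an error as soon as an option is
    repeated (hypothetical helper; the message text is illustrative)."""
    options = {}
    for name, value in pairs:
        key = name.lower()
        if key in options:
            raise ValueError(f'option "{key}" specified more than once')
        options[key] = value
    return options


if __name__ == "__main__":
    pairs = [("lc_ctype", "en_US.UTF8"),
             ("lc_collate", "en_US.UTF8"),
             ("lc_ctype", "C")]
    print(last_value_wins(pairs))   # lc_ctype ends up as "C"
    try:
        reject_redundant(pairs)
    except ValueError as exc:
        print("rejected:", exc)
```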

EDB builds Postgres 13 with an obsolete ICU version

2020-08-03 Thread Daniel Verite
Hi, As a follow-up to bug #16570 [1] and other previous discussions on the mailing-lists, I'm checking out PG13 beta for Windows from: https://www.enterprisedb.com/postgresql-early-experience and it ships with the same obsolete ICU 53 that was used for PG 10,11,12. Besides not having the latest

Re: Add header support to text format and matching feature

2020-07-25 Thread Daniel Verite
Rémi Lapeyre wrote: > Here’s a new version that fixes all the issues. Here's a review of v4. The patch improves COPY in two ways: - COPY TO and COPY FROM now accept "HEADER ON" for the TEXT format (previously it was only for CSV) - COPY FROM also accepts "HEADER match" to tell that

Re: Multi-byte character case-folding

2020-07-07 Thread Daniel Verite
Tom Lane wrote: > CREATE TABLE public."myÉclass" ( >f1 text > ); > > If we start to case-fold É, then the only way to access this table will > be by double-quoting its name, which the application probably is not > expecting (else it would have double-quoted in the original CREATE

Re: A bug when use get_bit() function for a long bytea string

2020-04-03 Thread Daniel Verite
movead...@highgo.ca wrote: > >I believe the formula for the upper limit in the 32-bit case is: > > (len <= PG_INT32_MAX / 8) ? (len*8 - 1) : PG_INT32_MAX; > > >These functions could benefit from a comment mentioning that > >they cannot reach the full extent of a bytea, because of the
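
The quoted formula for the 32-bit case can be checked with a short Python sketch. It assumes the bit number is a signed 32-bit integer and that len is the length in bytes. The function name max_bit_index is hypothetical.

```python
PG_INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def max_bit_index(length_in_bytes: int) -> int:
    """Highest bit number reachable in a byte string of the given length
    when the bit number must fit in a signed 32-bit integer.

    Direct transcription of:
        (len <= PG_INT32_MAX / 8) ? (len*8 - 1) : PG_INT32_MAX
    """
    if length_in_bytes <= PG_INT32_MAX // 8:
        return length_in_bytes * 8 - 1
    return PG_INT32_MAX


if __name__ == "__main__":
    for n in (1, PG_INT32_MAX // 8, PG_INT32_MAX // 8 + 1):
        print(n, max_bit_index(n))
```

For an empty string the expression yields -1, meaning that no bit number is valid.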

Re: A bug when use get_bit() function for a long bytea string

2020-04-02 Thread Daniel Verite
movead...@highgo.ca wrote: > A little update for the patch, and patches for all stable available. Some comments about the set_bit/get_bit parts. I'm reading long_bytea_string_bug_fix_ver6_pg10.patch, but that applies probably to the other files meant for the existing releases (I think

Re: proposal \gcsv

2020-04-02 Thread Daniel Verite
Alvaro Herrera wrote: > Can we fix that by adding some syntax to allow command aliases? > So you could add to your .psqlrc something like > > \alias \gcsv \pset push all \; \cbuf; \; \pset pop > > where the \cbuf is a hypothetical "function" that expands to the current > query buffer.

Re: proposal \gcsv

2020-04-01 Thread Daniel Verite
Tom Lane wrote: > I could see having a command to copy the current primary formatting > parameters to the alternate area, too. We could have a stack to store parameters before temporary changes, for instance if you want to do one csv export and come back to normal without assuming what

Re: proposal \gcsv

2020-03-28 Thread Daniel Verite
Erik Rijkers wrote: > 2. let the psql command-line option '--csv' honour the value given by > psql -F/--field-separator (it does not do so now) > > or > > 3. add a psql command-line option: > --csv-field-separator Setting the field separator on the command line is already

Re: A bug when use get_bit() function for a long bytea string

2020-03-27 Thread Daniel Verite
Ashutosh Bapat wrote: > I think we need a similar change in byteaGetBit() and byteaSetBit() > as well. get_bit() and set_bit() as SQL functions take an int4 as the "offset" argument representing the bit number, meaning that the maximum value that can be passed is 2^31-1. But the maximum

Re: Making psql error out on output failures

2020-03-23 Thread Daniel Verite
Peter Eisentraut wrote: > > If there's no more review to do, would you consider moving it to > > Ready for Committer? > > committed Thanks!

Re: Unicode normalization SQL functions

2020-03-23 Thread Daniel Verite
Peter Eisentraut wrote: > What is that status of this patch set? I think we have nailed down the > behavior, but there were some concerns about certain performance > characteristics. Do people feel that those are required to be addressed > in this cycle? Not finding any other issue

Re: Making psql error out on output failures

2020-03-06 Thread Daniel Verite
David Zhang wrote: > Thanks for your review, now the new patch with the error message in PG > style is attached. The current status of the patch is "Needs review" at https://commitfest.postgresql.org/27/2400/ If there's no more review to do, would you consider moving it to Ready for

Re: Unicode normalization SQL functions

2020-02-17 Thread Daniel Verite
One nitpick: Around this hunk: - * unicode_normalize_kc - Normalize a Unicode string to NFKC form. + * unicode_normalize - Normalize a Unicode string to the specified form. * * The input is a 0-terminated array of codepoints. * @@ -304,8 +306,10 @@ decompose_code(pg_wchar code, pg_wchar

Re: Unicode normalization SQL functions

2020-02-17 Thread Daniel Verite
Hi, I've checked the v3 patch against the results of the normalization done by ICU [1] on my test data again, and they're identical (as they were with v1; v2 had the bug discussed upthread, now fixed). Concerning execution speed, there's an excessive CPU usage when normalizing into NFC or

Re: Making psql error out on output failures

2020-01-29 Thread Daniel Verite
David Zhang wrote: > > Are you sure? I don't find that redefinition. Besides > > print_aligned_text() also calls putc and puts. > Yes, below is the gdb debug message when psql first time detects the > error "No space left on device". Test case, "postgres=# select > repeat('111',

Re: Making psql error out on output failures

2020-01-28 Thread Daniel Verite
David Zhang wrote: > The error "No space left on device" can be logged by fprintf() which is > redefined as pg_fprintf() when print_aligned_text() is called Are you sure? I don't find that redefinition. Besides print_aligned_text() also calls putc and puts. > Will that be a better

Re: Unicode normalization SQL functions

2020-01-28 Thread Daniel Verite
Peter Eisentraut wrote: > Here is an updated patch set that now also implements the "quick check" > algorithm from UTR #15 for making IS NORMALIZED very fast in many cases, > which I had mentioned earlier in the thread. I found a bug in unicode_is_normalized_quickcheck() which is
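
Python's standard library exposes a comparable test, which can serve to illustrate what an "is normalized" check answers. This is only an analogy: it says nothing about how unicode_is_normalized_quickcheck() is implemented, and unicodedata.is_normalized() needs Python 3.8 or later. The helper name normalization_status is hypothetical.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one codepoint
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def normalization_status(s: str) -> dict:
    """Report, for each normalization form, whether s is already in it."""
    return {form: unicodedata.is_normalized(form, s)
            for form in ("NFC", "NFD", "NFKC", "NFKD")}


if __name__ == "__main__":
    print(normalization_status(composed))
    print(normalization_status(decomposed))
    # Normalizing the decomposed string to NFC gives the precomposed one.
    print(unicodedata.normalize("NFC", decomposed) == composed)
```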

Re: making the backend's json parser work in frontend code

2020-01-23 Thread Daniel Verite
Robert Haas wrote: > With the format I proposed, you only have to worry that the > file name might contain a tab character, because in that format, tab > is the delimiter It could be CSV, which has this problem already solved, is easier to parse than JSON, certainly no less popular, and
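
A small Python sketch of the point about CSV, assuming a manifest made of (file name, size) rows. The row layout and the function names are hypothetical. With a comma delimiter, a tab in a file name is ordinary data, and fields containing commas, double quotes, or newlines are protected by quoting.

```python
import csv
import io


def write_manifest(rows):
    """Serialize rows to CSV text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def read_manifest(text):
    """Parse CSV text back into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    rows = [
        ["dir/name\twith tab", "8192"],    # tab is plain data in CSV
        ['odd,"quoted" name.dat', "0"],    # comma and quotes force quoting
    ]
    text = write_manifest(rows)
    print(text, end="")
    print(read_manifest(text) == rows)
```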

Re: Making psql error out on output failures

2020-01-20 Thread Daniel Verite
David Zhang wrote: > Yes, I agree with you. For case 2 "select repeat('111', 100) \g > /mnt/ramdisk/file", it can be easily fixed with more accurate error > message similar to pg_dump, one example could be something like below. > But for case 1 "psql -d postgres -At -c "select

Re: psql - add SHOW_ALL_RESULTS option

2020-01-17 Thread Daniel Verite
Tom Lane wrote: > I'm not really holding my breath for that to happen, considering > it would involve fundamental breakage of the wire protocol. > (For example, extended query protocol assumes that Describe > Portal only needs to describe one result set. There might be > more issues, but

Re: psql - add SHOW_ALL_RESULTS option

2020-01-17 Thread Daniel Verite
Alvaro Herrera wrote: > if this patch enables other psql features, it might be a good step > forward. Yes. For instance if the stored procedures support gets improved to produce several result sets, how is psql going to benefit from it while sticking to the old way (PGresult *r =

Re: Making psql error out on output failures

2020-01-16 Thread Daniel Verite
David Zhang wrote: > If I change the error log message like below, where "%m" is used to pass the > value of strerror(errno), "could not write to output file:" is copied from > function "WRITE_ERROR_EXIT". > - pg_log_error("Error printing tuples"); > +

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2020-01-14 Thread Daniel Verite
Dent John wrote: > It’s crashing when it’s checking that the returned tuple matches the > declared return type in rsinfo->setDesc. Seems rsinfo->setDesc gets > overwritten. So I think I have a memory management problem. What is the expected result anyway? A single column with a "record"

Re: Making psql error out on output failures

2020-01-14 Thread Daniel Verite
David Z wrote: > $ psql -d postgres -At -c "select repeat('111', 100)" > > /mnt/ramdisk/file The -A option selects the unaligned output format and -t switches to the "tuples only" mode (no header, no footer). > Test-2: delete the "file", run the command within psql console, > $ rm

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2020-01-10 Thread Daniel Verite
Dent John wrote: > Yes. That’s at least true if unnest(x) is used in the FROM. If it’s used in > the SELECT, actually it can get the performance benefit right away At a quick glance, I don't see it called in the select-list in any of the regression tests. When trying it, it appears to

Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2020-01-09 Thread Daniel Verite
Dent John wrote: > I’ve made a revision of this patch. Some comments: * the commitfest app did not extract the patch from the mail, possibly because it's buried in the MIME structure of the mail (using plain text instead of HTML messages might help with that). The patch has no

Re: Unicode normalization SQL functions

2020-01-06 Thread Daniel Verite
Peter Eisentraut wrote: > Also, there is a way to optimize the "is normalized" test for common > cases, described in UTR #15. For that we'll need an additional data > file from Unicode. In order to simplify that, I would like my patch > "Add support for automatically updating Unicode

Making psql error out on output failures

2019-12-16 Thread Daniel Verite
Hi, When exporting data with psql -c "..." >file or select ... \g file inside psql, post-creation output errors are silently ignored. The problem can be seen easily by creating a small ramdisk and filling it over capacity: $ sudo mount -t tmpfs -o rw,size=1M tmpfs /mnt/ramdisk $ psql -d

Re: updating unaccent.rules for Arabic letters

2019-11-04 Thread Daniel Verite
kerbrose khaled wrote: > I would like to update unaccent.rules file to support Arabic letters. so > could someone help me or tell me how could I add such contribution. I > attached the file including the modifications, only the last 4 lines. The Arabic letters are found in the Unicode

Re: ICU for global collation

2019-11-01 Thread Daniel Verite
Peter Eisentraut wrote: > I looked into this problem. The way to address this would be adding > proper collation support to the text search subsystem. See the TODO > markers in src/backend/tsearch/ts_locale.c for starting points. These > APIs spread out to a lot of places, so it

Re: ICU for global collation

2019-09-17 Thread Daniel Verite
Hi, When trying databases defined with ICU locales, I see that backends that serve such databases seem to have their LC_CTYPE inherited from the environment (as opposed to a per-database fixed value). That's a problem for the backend code that depends on libc functions that themselves depend on

Re: Create collation reporting the ICU locale display name

2019-09-14 Thread Daniel Verite
Tom Lane wrote: > > This output tends to reveal mistakes with tags, which is why I thought > > to expose it as a NOTICE. It addresses the case of a user > > who wouldn't suspect an error, so the "in-your-face" effect is > > intentional. With the function approach, the user must be > >

Re: Create collation reporting the ICU locale display name

2019-09-14 Thread Daniel Verite
Tom Lane wrote: > I think that's a useful function, but it's a different function from > the one first proposed, which was to tell you the properties of a > collation you already installed (which might not be ICU, even). > Perhaps we should have both. The pre-create use case would look

Re: Create collation reporting the ICU locale display name

2019-09-13 Thread Daniel Verite
Michael Paquier wrote: > Or could it make sense to provide a system function which returns a > collation description for at least an ICU-provided one? We could make > use of that in psql for example. If we prefer having a function over the instant feedback effect of the NOTICE, the

Re: Create collation reporting the ICU locale display name

2019-09-12 Thread Daniel Verite
Michael Paquier wrote: > On Wed, Sep 11, 2019 at 04:53:16PM +0200, Daniel Verite wrote: > > I think it would be nice to have CREATE COLLATION report this > > information as feedback in the form of a NOTICE message. > > PFA a simple patch implementing that. >

Create collation reporting the ICU locale display name

2019-09-11 Thread Daniel Verite
Hi, The 'locale' or 'lc_collate/lc_ctype' argument of an ICU collation may have a complicated syntax, especially with non-deterministic collations, and input mistakes in these names will not necessarily be detected as such by ICU. The "display name" of a locale is a simple way to get

Re: [PATCH] vacuumlo: print the number of large objects going to be removed

2019-09-06 Thread Daniel Verite
Michael Paquier wrote: > Sure. However do we need to introduce this much complication as a > goal for this patch though whose goal is just to provide hints about > the progress of the work done by vacuumlo? Yeah, I went off on a tangent when realizing that ~500 lines of C client-side

Re: psql - add SHOW_ALL_RESULTS option

2019-07-26 Thread Daniel Verite
Fabien COELHO wrote: > sh> /usr/bin/psql > psql (12beta2 ...) > fabien=# \set FETCH_COUNT 2 > fabien=# SELECT 1234 \; SELECT 5432 ; > fabien=# > > same thing with pg 11.4, and probably down to every version of postgres > since the feature was implemented... > > I think that

Re: psql - add SHOW_ALL_RESULTS option

2019-07-25 Thread Daniel Verite
Fabien COELHO wrote: > Attached a "do it always version", which does the necessary refactoring. > There is seldom new code, it is rather moved around, some functions are > created for clarity. Thanks for the update! FYI you forgot to remove that bit: --- a/src/bin/psql/tab-complete.c

Re: psql - add SHOW_ALL_RESULTS option

2019-07-24 Thread Daniel Verite
Fabien COELHO wrote: > >> I'd go further and suggest that there shouldn't be a variable > >> controlling this. All results that come in should be processed, period. > > > > I agree with that. > > I kind of agree as well, but I was pretty sure that someone would complain > if the current

Re: [PATCH] vacuumlo: print the number of large objects going to be removed

2019-07-17 Thread Daniel Verite
Timur Birsh wrote: > Please find attached patch v2. > I fixed some indentation in the variable declaration blocks. The tab width should be 4. Please have a look at https://www.postgresql.org/docs/current/source-format.html It also explains why opportunistic reformatting is futile,

Re: pg_stat_database update stats_reset only by pg_stat_reset

2019-07-11 Thread Daniel Verite
张连壮 wrote: > it reset statistics for a single table and update the column stats_reset of > pg_stat_database. > but i think that stats_reset should be database-level statistics, a single > table should not update the column stats_reset. This patch is a current CF entry at

Re: proposal: pg_restore --convert-to-text

2019-07-03 Thread Daniel Verite
Alvaro Herrera wrote: > So you suggest that it should be > > pg_restore: error: one of -d/--dbname, -f/--file and -l/--list must be > specified > ? I'd suggest this minimal fix : (int argc, char **argv) /* Complain if neither -f nor -d was specified (except if dumping TOC) */

Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-01 Thread Daniel Verite
Julien Rouhaud wrote: > > I think you'd be better off to define and document this as "reindex > > only collation-sensitive indexes", without any particular reference > > to a reason why somebody might want to do that. > > We should still document that indexes based on ICU would be

RE: proposal: pg_restore --convert-to-text

2019-06-12 Thread Daniel Verite
José Arthur Benetasso Villanova wrote: > On Thu, 28 Feb 2019, Imai, Yoshikazu wrote: > > > Is there no need to rewrite the Description in the Doc to state we should > > specify either -d or -f option? > > (and also it might be better to write if -l option is given, neither -d nor > >

RE: psql - add SHOW_ALL_RESULTS option

2019-05-15 Thread Daniel Verite
Fabien COELHO wrote: > >> IMHO this new setting should be on by default: few people know about \; so > >> it would not change anything for most, and I do not see why those who use > >> it would not be interested by the results of all the queries they asked > >> for. > > I agree with your

Re: Trouble with FETCH_COUNT and combined queries in psql

2019-04-23 Thread Daniel Verite
Tom Lane wrote: > Keep in mind that a large part of the reason why the \cset patch got > bounced was exactly that its detection of \; was impossibly ugly > and broken. Don't expect another patch using the same logic to > get looked on more favorably. Looking at the end of the discussion

Re: Trouble with FETCH_COUNT and combined queries in psql

2019-04-23 Thread Daniel Verite
Fabien COELHO wrote: > I added some stuff to extract embedded "\;" for pgbench "\cset", which has > been removed though, but it is easy to add back a detection of "\;", and > also to detect select. If the position of the last select is known, the > cursor can be declared in the right

Trouble with FETCH_COUNT and combined queries in psql

2019-04-22 Thread Daniel Verite
Hi, When FETCH_COUNT is set, queries combined in a single request don't work as expected: \set FETCH_COUNT 10 select pg_sleep(2) \; select 1; No result is displayed, the pg_sleep(2) is not run, and no error is shown. That's disconcerting. The sequence that is sent under the hood is: #1

Re: Cleanup/remove/update references to OID column

2019-04-15 Thread Daniel Verite
Andres Freund wrote: > Yes, I was planning to commit that soon-ish. There still seemed > review / newer versions happening, though, so I was thinking of waiting > for a bit longer. You might want to apply this trivial one in the same batch: index 452f307..7cfb67f 100644 ---

Re: PostgreSQL pollutes the file system

2019-04-13 Thread Daniel Verite
Andreas Karlsson wrote: > The Debian packagers already use pg_createcluster for their script which > wraps initdb, and while pg_initdb is a bit misleading (it creates a > cluster rather than a database) it is not that bad. But that renaming wouldn't achieve anything in relation to the

Re: Cleanup/remove/update references to OID column

2019-04-10 Thread Daniel Verite
Justin Pryzby wrote: > Cleanup/remove/update references to OID column... > > ..in wake of 578b229718e8f. Just spotted a couple of other references that need updates: #1. In catalogs.sgml: attnum int2 The number of the column. Ordinary columns

Re: Changes to pg_dump/psql following collation "C" in the catalog

2019-04-06 Thread Daniel Verite
Tom Lane wrote: > > PFA a new version adding the clause for only 12 and up, since the > > previous versions are not concerned, and as you mention, really old > > versions would fail otherwise. > > Pushed with some fiddling with the comments, and changing the collation > names to be

Re: Changes to pg_dump/psql following collation "C" in the catalog

2019-04-05 Thread Daniel Verite
Tom Lane wrote: > Hm, if that's as much as we have to touch, I think there's a good > argument for squeezing it into v12 rather than waiting. The point > here is mostly to avoid a behavior change from pre-v12 Yes. I was mentioning the next CF because ISTM that nowadays non-committers

Re: Changes to pg_dump/psql following collation "C" in the catalog

2019-04-05 Thread Daniel Verite
Chapman Flack wrote: > >> "Daniel Verite" writes: > >>> One consequence of using the "C" collation in the catalog versus > >>> the db collation > > As an intrigued Person Following At Home, I was happy when I found > t

Re: Changes to pg_dump/psql following collation "C" in the catalog

2019-04-04 Thread Daniel Verite
Tom Lane wrote: > "Daniel Verite" writes: > > One consequence of using the "C" collation in the catalog versus > > the db collation is that pg_dump -t with a regexp may not find > > the same tables as before. It happens when these condit

Re: Willing to fix a PQexec() in libpq module

2019-03-19 Thread Daniel Verite
Tom Lane wrote: > Unfortunately, if the default behavior doesn't change, then there's little > argument for doing this at all. The security reasoning behind doing > anything in this area would be to provide an extra measure of protection > against SQL-injection attacks on

Re: insensitive collations

2019-03-07 Thread Daniel Verite
Peter Eisentraut wrote: > The problem is not the syntax but that the older ICU versions don't > support the *functionality* of ks-level2 or colStrength=secondary. If > you try it, you will simply get a normal case-sensitive behavior. My bad, I see now that the "old locale extension

Re: insensitive collations

2019-03-05 Thread Daniel Verite
Peter Eisentraut wrote: > Older ICU versions (<54) don't support all the locale customization > options, so many of my new tests in collate.icu.utf8.sql will fail on > older systems. What should we do about that? Have another extra test file? Maybe stick to the old-style syntax for the

Re: insensitive collations

2019-03-04 Thread Daniel Verite
Peter Eisentraut wrote: [v7-0001-Collations-with-nondeterministic-comparison.patch] +GenericMatchText(const char *s, int slen, const char *p, int plen, Oid collation) { + if (collation && !lc_ctype_is_c(collation) && collation != DEFAULT_COLLATION_OID) + { +pg_locale_t locale =

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > > text_starts_with(arg1,arg2) in varlena.c does a full decompression > > of arg1 when it could limit itself to the length of the smaller arg2: > > Nice catch, I didn't find that one as it's not user visible, seems to > be only called in spgist (!!) It's also

Re: Compressed TOAST Slicing

2019-02-20 Thread Daniel Verite
Paul Ramsey wrote: > Oddly enough, I couldn't find many/any things that were sensitive to > left-end decompression. The only exception is "LIKE this%" which > clearly would be helped, but unfortunately wouldn't be a quick > drop-in, but a rather major reorganization of the regex handling.

replace_text optimization (StringInfo to varlena)

2019-02-13 Thread Daniel Verite
Hi, replace_text() in varlena.c builds the result in a StringInfo buffer, and finishes by copying it into a freshly allocated varlena structure with cstring_to_text_with_len(), in the same memory context. It looks like that copy step could be avoided by prepending the varlena header to the

Re: backslash-dot quoting in COPY CSV

2019-01-30 Thread Daniel Verite
Bruce Momjian wrote: > > - the end of data could be expressed as a length (in number of lines > > for instance) instead of an in-data marker. > > > > - the end of data could be configurable, as in the MIME structure of > > multipart mail messages, where a part is ended by a "boundary", >

Re: insensitive collations

2019-01-30 Thread Daniel Verite
Peter Eisentraut wrote: > Another patch. +ks key), in order for such such collations to act in a s/such such/such/ + +The pattern matching operators of all three kinds do not support +nondeterministic collations. If required, apply a different collation to +the

Re: Alternative to \copy in psql modelled after \g

2019-01-28 Thread Daniel Verite
Tom Lane wrote: > > Now as far as I can see, there is nothing that \copy to file or program > > can do that COPY TO STDOUT cannot do. > > I don't think there's a way to get the effect of "\copy to pstdout" > (which, IIRC without any caffeine, means write to psql's stdout regardless > of

Re: Alternative to \copy in psql modelled after \g

2019-01-28 Thread Daniel Verite
Tom Lane wrote: > A variant that might or might not be safer is "\g insist on you putting a mark there that shows you intended to read. > > Also, not quite clear what we'd do about copy-from-program. > I think "\g |foo" is definitely confusing for that. "\g foo|" > would be better if it

Re: backslash-dot quoting in COPY CSV

2019-01-28 Thread Daniel Verite
Michael Paquier wrote: > In src/bin/psql/copy.c, handleCopyIn(): > > /* > * This code erroneously assumes '\.' on a line alone > * inside a quoted CSV string terminates the \copy. > * > http://www.postgresql.org/message-id/e1tdnvq-0001ju...@wrigleys.postgresql.org > */ > if (strcmp(buf,
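
To make the problem in that comment concrete: a scanner that treats any line consisting of `\.` as end-of-data will stop early when that line sits inside a quoted CSV field. The sketch below contrasts a naive scan with a quote-aware one. It is an illustration only, not psql's code; both function names are hypothetical, and it assumes the default CSV convention where a literal quote inside a field is written as a doubled quote.

```python
def find_end_marker_naive(lines):
    """Index of the first line that is exactly backslash-dot, or None."""
    for i, line in enumerate(lines):
        if line == "\\.":
            return i
    return None


def find_end_marker_quote_aware(lines, quote='"'):
    """Like the naive scan, but ignores backslash-dot inside a quoted field.

    Quote state is tracked by parity: an odd number of quote characters on
    a line flips the state, and doubled quotes cancel out.
    """
    in_quotes = False
    for i, line in enumerate(lines):
        if not in_quotes and line == "\\.":
            return i
        if line.count(quote) % 2 == 1:
            in_quotes = not in_quotes
    return None
```

For the input lines `"foo`, `\.`, `bar"`, `\.`, the naive scan stops at the second line, while the quote-aware scan skips it and stops at the fourth.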

Re: Alternative to \copy in psql modelled after \g

2019-01-28 Thread Daniel Verite
Tom Lane wrote: > OK. I fixed the error-cleanup issue and pushed it. > > The patch applied cleanly back to 9.5, but the code for \g is a good > bit different in 9.4. I didn't have the interest to try to make the > patch work with that, so I just left 9.4 alone. Thanks! Now as far as

Re: Alternative to \copy in psql modelled after \g

2019-01-25 Thread Daniel Verite
Fabien COELHO wrote: > Sure. As there are several bugs (doubtful features) uncovered, a first > patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR > current behavior however debatable it is (i.e. your patch without the > ERROR change, which is unrelated to the

Re: backslash-dot quoting in COPY CSV

2019-01-25 Thread Daniel Verite
Bruce Momjian wrote: > but I am able to see the failure using STDIN: > > COPY test FROM STDIN CSV; > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF > signal. > "foo > \. >

Re: Alternative to \copy in psql modelled after \g

2019-01-22 Thread Daniel Verite
Fabien COELHO wrote: > > Now if you download data with SELECT or COPY and we can't even > > create the file, how is that a good idea to intentionally have the > > script fail to detect it? What purpose does it satisfy? > > It means that the client knows that the SQL command, and possible

Re: Alternative to \copy in psql modelled after \g

2019-01-21 Thread Daniel Verite
Fabien COELHO wrote: > (1) document ERROR as being muddy, i.e. there has been some error which > may be SQL or possibly client side. Although SQLSTATE would still allow to > know whether an SQL error occurred, there is still no client side > expressions, and even if I had moved pgbench

Re: Alternative to \copy in psql modelled after \g

2019-01-21 Thread Daniel Verite
Fabien COELHO wrote: > sql> SELECT 1 \g /BAD > /BAD: Permission denied > > sql> \echo :ERROR > false That seems broken, because it's pointless to leave out a class of errors from ERROR. Presumably the purpose of ERROR is to enable error checking like: \if :ERROR ... error

Re: Alternative to \copy in psql modelled after \g

2019-01-19 Thread Daniel Verite
Fabien COELHO wrote: > > I've also changed handleCopyOut() to return success if it > > could pump the data without writing it out locally for lack of > > an output stream. It seems to make more sense like that. > > I'm hesitating on this one. > > On the one hand the SQL query is

Re: Alternative to \copy in psql modelled after \g

2019-01-18 Thread Daniel Verite
Tom Lane wrote: > I took a quick look at this patch. PFA an updated patch addressing your comments and Fabien's. I've also changed handleCopyOut() to return success if it could pump the data without writing it out locally for lack of an output stream. It seems to make more sense like

Re: insensitive collations

2019-01-16 Thread Daniel Verite
Peter Eisentraut wrote: > > On a table with pre-existing contents, the creation of a unique index > > does not seem to detect the duplicates that are equal per the > > collation and different binary-wise. > > Fixed in the attached updated patch. Check. I've found another issue with
