Re: [GENERAL] PG-10 + ICU and abbreviated keys

2017-11-13 Thread Peter Geoghegan
On Mon, Nov 13, 2017 at 2:05 PM, Andreas Joseph Krogh
<andr...@visena.com> wrote:
> When sorting on text, we're usually doing so using an multi-column index, 
> like for instance "CREATE INDEX xxx ON my_table (lower(name) ASC, created 
> ASC)". Will abbreviated keys help here?

Yes, they'll help with that, even though the leading column might be
low cardinality.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG-10 + ICU and abbreviated keys

2017-11-13 Thread Peter Geoghegan
On Mon, Nov 13, 2017 at 12:48 PM, Andreas Joseph Krogh
<andr...@visena.com> wrote:
> Thanks.

As the person that worked on abbreviated keys, I'd like to hear about
how you get with this. How much faster is it for you?

I don't usually get to hear about this, because most users don't
notice that anything in particular gets faster, because there are many
performance enhancements added to a release.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG-10 + ICU and abbreviated keys

2017-11-13 Thread Peter Geoghegan
On Mon, Nov 13, 2017 at 12:40 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> Do I have to explicitly specify collation when using ORDER by on that column 
>> for index and abbreviated keys to be used?
>
> Only if you didn't define the column with a per-column collation initially.

BTW, if you specifically want to quickly verify whether or not
abbreviated keys were used, you can do that by setting "trace_sort =
on", and possibly setting "client_min_messages = LOG", too.

There should be quite a bit of debug output from that that
specifically mentions abbreviated keys.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG-10 + ICU and abbreviated keys

2017-11-13 Thread Peter Geoghegan
On Mon, Nov 13, 2017 at 11:48 AM, Andreas Joseph Krogh
<andr...@visena.com> wrote:
> Ok, so I have to explicitly specify like this:
>
> create table test(id serial primary key, name varchar collate "nb_NO" not 
> null);

That doesn't look like an ICU locale. You may mean "nb-NO-x-icu". But
otherwise, yes.

> Will ICU be used here as long as PG is compiled with ICU-suppoert, as the 
> debian-packages are, or do I have to specify collation-provider?

If you did initdb with a version with ICU support, the ICU collations
should be there.

> Do I have to explicitly specify collation when using ORDER by on that column 
> for index and abbreviated keys to be used?

Only if you didn't define the column with a per-column collation initially.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PG-10 + ICU and abbreviated keys

2017-11-13 Thread Peter Geoghegan
On Mon, Nov 13, 2017 at 12:27 AM, Andreas Joseph Krogh
<andr...@visena.com> wrote:
> In PG-10, with ICU enabled, is abbreviated keys now enabled?

Yes. ICU will use abbreviated keys on every platform, including Windows.

> If so, using locale=nb_NO.UTF-8, do I have to use a ICU-specific locale to 
> take advantage of abbreviated keys?

You need to use an ICU collation. It must be a per-column collation,
as you cannot currently use ICU for an entire database. (This
limitation should be removed in the next release or two.)

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan
On Mon, Oct 30, 2017 at 9:45 AM, Rob Sargent <robjsarg...@gmail.com> wrote:
> Peter, you beat me to the punch.  I was just about to say "Having read the
> referenced message I thought I would add that we never delete from this
> table."  In this particular case it was written to record by record, in a
> previous execution and at the time of the error it was only being read.  (In
> case you've been following, the failed execution would have added ~1M
> "segments", each which references an entry in the gin'd table "probandsets"
> - but like a rookie I'm looking up each probandset(2^16) individually.
> Re-working that NOW.)

It's not surprising that only a SELECT statement could see this
problem. I guess that it's possible that only page deletions used for
the pending list are involved here.

I'm not sure how reliably you can recreate the problem, but if it
doesn't take too long then it would be worth seeing what effect
turning off the FASTUPDATE storage parameter for the GIN index has.
That could prevent the problem from recurring, and would support my
theory about what's up here. (It wouldn't fix the corruption, though.)

Of course, what I'd much prefer is a self-contained test case. But if
you can't manage that, or if reproducing the issue takes hours, then
this simpler experiment might be worthwhile.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan
On Fri, Oct 27, 2017 at 3:15 PM, Rob Sargent <rsarg...@xmission.com> wrote:
> I’ve hit this same message
>
> Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN
> page is of different type
>
> in a couple of contexts and I’m starting to get worried.
>
> I’ve rebuilt the index, but will that help?
> Is there a way to see what the ‘different type’ is?
> Is it caught/clean-up by vacuum analyse or some such?

Is there a lot of churn on this table? Do you either heavily update or
heavily delete rows in the table? Does vacuum tend to run on the table
rather frequently?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] gin index trouble

2017-10-30 Thread Peter Geoghegan
On Mon, Oct 30, 2017 at 7:35 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Rob Sargent <rsarg...@xmission.com> writes:
>> I’ve hit this same message
>> Caused by: org.postgresql.util.PSQLException: ERROR: right sibling of GIN 
>> page is of different type
>> in a couple of contexts and I’m starting to get worried.
>
> If you can make a test case that (eventually) hits that, we'd be
> interested to see it ...

I suspect that this is the 9.6 bug that I described on that recent
-bugs thread [1]. It's just another symptom of the same problem.

It's certainly true that we saw a mix of undetectable
deadlocks/lock-ups (as seen on that -bugs thread) and corruption (as
seen on this thread) before commit e2c79e14 tried to address those
problems. Jeff Janes reported both symptoms in the thread leading up
to that commit during the beta period for 9.6. My guess is that that
commit was insufficient, and that we now continue to see the same mix
of symptoms for what is essentially the same bug.

[1] 
https://postgr.es/m/CAH2-WzmtLXbs8+c19t1T=rj0kyp7vk9q8hqjulgdldvmuee...@mail.gmail.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table corruption

2017-10-23 Thread Peter Geoghegan
> Nice to see it included in 10!
> https://www.postgresql.org/docs/10/static/amcheck.html

The reason that I pointed to the Github version rather than the
contrib version is that only the Github version currently has the
"heapallindexed" check. That seems likely to be by far the most
important check here.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table corruption

2017-10-23 Thread Peter Geoghegan
On Mon, Oct 23, 2017 at 7:44 AM, Peter Hunčár <hu...@hunci.sk> wrote:
> I know that zero_damaged_pages and vacuum (or restore the table from backup)
> will help, but I want to ask if there is a way to identify affected
> rows/datafiles, so we can 'fix' only the affected data using the
> backup/source data, instead of restoring the whole table?

You might find the latest version of amcheck helpful here:
https://github.com/petergeoghegan/amcheck

It's not really written with repair in mind, since that's such a can
of worms, but it might still help you.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] could not fdatasync log file: Input/output error

2017-10-17 Thread Peter Geoghegan
On Tue, Oct 17, 2017 at 7:13 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> Note that Peter has also worked on provising Debian packages for the
> utility down to 9.4 if I recall correctly, which is nice, but if you
> want the heap checks you will need to compile things by youself. We
> are currently under way to get something improved in Postgres 11. I
> should actually spare some time to look more at the patch concepts..

I'm probably going to commit the new version in the next couple of
days, and package the heap check into a new release. It might take a
few more days for the community apt and yum repos to get packages for
the new version (both have 1.0 versions as of right now, though).

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Merge - Insert Select

2017-10-13 Thread Peter Geoghegan
On Fri, Oct 13, 2017 at 9:39 AM, Susan Hurst
<susan.hu...@brookhurstdata.com> wrote:
> Does postgres have a MERGE statement ala Oracle?

No.

> I have found examples online for INSERT...ON CONFLICT DO NOTHING, but all of
> those examples use VALUES statements, which is not the scenario that I have.

You can use SELECT instead of VALUES with ON CONFLICT DO NOTHING/ON
CONFLICT DO UPDATE. They don't impose any restriction on the INSERT
statement at all, unlike MERGE, which is fussy about the use of
subqueries.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] startup process stuck in recovery

2017-10-09 Thread Peter Geoghegan
On Mon, Oct 9, 2017 at 12:08 PM, Christophe Pettus <x...@thebuild.com> wrote:
> Suggestions on further diagnosis?

What's the hot_standy_feedback setting? How about
max_standby_archive_delay/max_standby_streaming_delay?


-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] startup process stuck in recovery

2017-10-09 Thread Peter Geoghegan
On Mon, Oct 9, 2017 at 5:30 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> and did not see any untoward behavior, at least not till I got to enough
> temp tables to overrun the master's shared lock table, and even then it
> cleaned up fine.  At no point was the standby process consuming anywhere
> near as much CPU as the master's backend.
>
> What am I missing to reproduce the problem?

Just a guess, but do you disable autovacuum on your dev machine? (I know I do.)

It's possible that this is relevant:
https://postgr.es/m/CAB-EU3RawZx8-OzMfvswFf6z+Y7GOZf03TZ=bez+pbqx+a4...@mail.gmail.com

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Corrupt index

2017-08-15 Thread Peter Geoghegan
On Tue, Aug 15, 2017 at 12:24 PM, Daryl Stultz
<daryl.stu...@opentempo.com> wrote:
> We have a recurring problem that we can get out of with "reindex table
> mytable". Basically we can't find a particular record or two using the
> primary key and after reindexing we can. There does seem to be a pattern
> related to application business processes but we are at a loss as to how
> this could happen.

You've given no details at all. What business pattern? What does the
index and table look like?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] upsert and update filtering

2017-07-31 Thread Peter Geoghegan

armand pirvu <armand.pi...@gmail.com> wrote:

But what if in the conflict situation I want to performa the update ONLY if the 
record is different. The update seems to happen no matter what
In other words is there anyway I can filter the update to happen (based on the 
sample date) only for 112 since col2 is different ?


That's quite possible. An ON CONFLICT's UPDATE accepts a WHERE clause,
which can reference both existing and excluded tuples. That WHERE clause
can back out of the UPDATE based on whatever criteria you like.

--
Peter Geoghegan


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] upsert: is there a shortcut?

2017-07-28 Thread Peter Geoghegan

Daniele Varrazzo <daniele.varra...@gmail.com> wrote:

Is there a way to avoid replicating the list of fields and use instead
something like (new.*) = (excluded.*) as one could do in a trigger?
(that would also imply an (id = excluded.id but it seems harmless).


This is certainly something that I've seen requests for before. I tend
to think that using such a feature would be a bit like using "SELECT *"
in production: something that provides an immediate convenience, but
creates unforeseen problems.

As an example, imagine if someone adds an "inserted_at" column, which
has "now()" as its default value. Today, a user can be pretty confident
that no existing or future query is going to change that itself, because
in order for that to happen the query would have to be written with the
explicit intention of updating "inserted_at". That property would go
away with the feature you describe. Subtleties like this could easily be
missed.

--
Peter Geoghegan


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indexes being ignored after upgrade to 9.5

2017-07-26 Thread Peter Geoghegan
On Wed, Jul 26, 2017 at 2:05 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Tue, Jul 25, 2017 at 10:34 PM, Nick Brennan <nbrenna...@gmail.com> wrote:
>> We've added duplicate indexes and analyzing, however the new indexes are
>> still ignored unless we force using enable_seqscan=no or reduce
>> random_page_cost to 2. The query response times using the new indexes are
>> still as slow when we do this. Checking pg_stat_user_indexes the number of
>> tuples returned per idx_scan is far greater after the upgrade than before.
>> All indexes show valid in pg_indexes.

I assume that you mean that pg_stat_user_indexes.idx_tup_read is a lot
higher than before, in proportion to pg_stat_user_indexes.idx_scan.
What about the ratio between pg_stat_user_indexes.idx_tup_read and
pg_stat_user_indexes.idx_tup_fetch? How much has that changed by?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Strange case of database bloat

2017-07-26 Thread Peter Geoghegan
On Tue, Jul 4, 2017 at 10:18 PM, Chris Travers <chris.trav...@gmail.com> wrote:
> First, I haven't seen major problems of database bloat in a long time which
> is why I find this case strange.  I wanted to ask here what may be causing
> it.
>
> Problem:
> ==
> Database is in the 100GB to 200GB size range, running on btrfs (not my
> choice) with nodatacow enabled (which I set up to fix a performance issue).
> The workload is a very heavy batch-update workload.
>
> The database bloats linearly.  I have measured this on one  table (of 149M
> rows).
>
> After vacuum full this table is (including indexes): 17GB
> Every 24 hrs, seems to add its original space in size to the file system
> +/-.
>
> Bloat seems to be affecting both indexes and underlying tables.
>
> Vacuum verbose does not indicate a disproportionate number of rows being
> unremovable.  So autovacuum is keeping up without too much difficulty.

That is odd. I find the linear increase in bloat alarming.

I found this thread in the context of investigating a possible
performance regression in 9.5 [1]. It's far from clear that that's
actually what's up here, but it's worth considering as a possibility.
Some questions about that:

Can you show the schema of at least one of the tables that are most
affected, its indexes, etc?

Are all indexes bloated equally?

Do you notice that unique indexes are not as badly affected as other
indexes? How does it break down, in terms of how much each individual
index grows in size?

You say that the problem is with both indexes and tables. How much of
this is table bloat, and how much is index bloat?

Thanks

[1] 
https://postgr.es/m/CAH2-Wz=sfakvmv1x9jh19ej8am8tzn9f-yecips9hrrrqss...@mail.gmail.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indexes being ignored after upgrade to 9.5

2017-07-26 Thread Peter Geoghegan
On Tue, Jul 25, 2017 at 10:34 PM, Nick Brennan <nbrenna...@gmail.com> wrote:
> We've added duplicate indexes and analyzing, however the new indexes are
> still ignored unless we force using enable_seqscan=no or reduce
> random_page_cost to 2. The query response times using the new indexes are
> still as slow when we do this. Checking pg_stat_user_indexes the number of
> tuples returned per idx_scan is far greater after the upgrade than before.
> All indexes show valid in pg_indexes.
>
>
> We have tried increasing effective_cache_size but no effect (the queries
> appear to go slower). The DB is 24x7 so we cannot reindex the tables/
> partitions.
>
>
> Can anyone suggest why this would be happening?

Are the indexes bloated? Are they larger than before, as indicated by
psql's \di+ or similar? Did you notice that this happened immediately,
or did it take a while? Are these unique indexes or not? Do you have a
workload with many UPDATEs?

I ask all these questions because I think it's possible that this is
explained by a regression in 9.5's handling of index bloat, described
here:

http://postgr.es/m/CAH2-Wz=sfakvmv1x9jh19ej8am8tzn9f-yecips9hrrrqss...@mail.gmail.com

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Imperative Query Languages

2017-07-05 Thread Peter Geoghegan
On Wed, Jul 5, 2017 at 7:02 AM, Merlin Moncure <mmonc...@gmail.com> wrote:
> Downthread, Tom mentions CODASYL, etc, but could have mentioned the
> big daddy, ISAM, and all it's various flavors.  Virtually all business
> code was written using it (and much of it still is) leading into the
> SQL era.   Code generally looked exactly like your example, except it
> was much more stupid looking being written in (generally) COBOL, and
> added in error handling, which is where the technique tends to break
> down.  SQL came about because some particularly smart people realized
> that programmers were writing the same boiler plate code over and over
> again and that perhaps access to data could be generalized and focused
> down to the real problem being solved.   This fortunately came about
> before "enterprise developers" and "enterprise tool vendors" were as
> organized as they are today and so was able to germinate into
> something incredibly useful...

To state the very obvious: If you assume for the sake of discussion
that the programmer of a hypothetical imperative query language is
infinitely capable and dedicated, and so is at least as capable as any
possible query optimizer, the optimizer still comes out ahead, because
it is capable of producing a different, better query plan as the
underlying data changes. Of course, it's also true that it's very hard
to beat the query optimizer under ideal conditions.

However, the OP seemed to be describing something that maps imperative
code to a declarative SQL query or something equivalent, which isn't
quite the same thing. The declarative nature of SQL feels restrictive
or at least unfamiliar to many programmers. What is often somehow
missed is that it's restrictive in a way that's consistent with how
the relational model is supposed to work. It seems hard to some
programmers because you have to formulate your query in terms of an
outcome, not in terms of a series of steps that can be added to
iteratively, as a snippet of code is written. It's very different to
something like bash, because it requires a little bit of up-front,
deliberate mental effort. And, because performance with many different
possible outputs matters rather a lot.

IMV, what the OP describes wouldn't work well because it would
superficially *appear* to not be restrictive in the way that some
people dislike, but actually would be just as restrictive. The only
way you could write it is by imagining what the SQL it produced looks
like. Or, if I've misunderstood his point, then it wouldn't work
because of the same reason that things like CODASYL are obsolete.

Some developers don't like SQL because they don't have a good
intuition for how the relational model works. While SQL does have some
cruft -- incidental complexity that's a legacy of the past -- any
language that corrected SQL's shortcomings wouldn't be all that
different to SQL, and so wouldn't help with this general problem. Quel
wasn't successful because it was only somewhat better than SQL was at
the time.

This is a conversation that I had a few times when I worked for
Heroku, with coworkers that weren't on the database team. They asked
similar questions. It took me a while to piece this together.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] insert on conflict

2017-06-27 Thread Peter Geoghegan
On Tue, Jun 27, 2017 at 12:40 PM, armand pirvu <armand.pi...@gmail.com> wrote:
> so how is it working in fact ? Isn't it working like looping in the
> IVEE.dim_company  and for each company_id if the record does have a
> correspondent in csischema.dim_company then update csischema.dim_company set
> company_name = EXCLUDED.company_name  where company_id=...  ? If so isn't it
> supposed to use the PK for each company_id ? Or is it more like building a
> whole list from IVEE.dim_company and treat like a join ? Just trying to
> understand

The processing here has to happen a tuple at a time. That's because
the index structure itself is the only authoritative source of truth
about whether or not there is a possible conflict. An MVCC snapshot
isn't good enough, because it sees a consistent view of things, not
the true physical reality of what exists or does not exist in the
index.

What you end up with here is a weird nested loop join, you might say.
The implementation couldn't do it any other way (this could never
behave more like a merge join), because we must eagerly check for
conflicts right as we insert (our insert would be registered by
*other* inserters/upserters as a conflict). If the implementation did
ever do it that way, it would break the important UPSERT guarantees
around concurrency.

MERGE does this in other systems, which is okay for those other
systems because MERGE makes no special promises about concurrency
(e.g., you can get a unique violation in the joined-on column with
MERGE). But, MERGE would be faster for bulk loading, which is what
MERGE is good for.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] effective_io_concurrency increasing

2017-06-19 Thread Peter Geoghegan
On Mon, Jun 19, 2017 at 4:51 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> This would make only the first lookup for each distinct value on the
> outer side actually do an index scan on the inner side. I can imagine
> the optimization saving certain queries from consuming a lot of memory
> bandwidth, as well as saving them from pinning and locking the same
> buffers repeatedly.

Apparently this is sometimes called block nested loop join, and MySQL
has had it for a while now:

https://en.wikipedia.org/wiki/Block_nested_loop

It doesn't necessarily require that the outer side input be sorted,
because you might end up using a hash table, etc.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] effective_io_concurrency increasing

2017-06-19 Thread Peter Geoghegan
On Mon, Jun 19, 2017 at 4:35 PM, Andres Freund <and...@anarazel.de> wrote:
>> I think that this is the way index scan prefetch is normally
>> implemented. Index scans will on average have a much more random
>> access pattern than what is typical for bitmap heap scans, making this
>> optimization more compelling, so hopefully someone will get around to
>> this.
>
> I think for index based merge and nestloop joins, it'd be hugely
> beneficial to do prefetching on the index, but more importantly on the
> heap level.  Not entirely trivial to do however.

Speaking of nestloop join, and on a similar note, we could do some
caching on the inner side of a nestloop join.

We already track if the outer side access path of a nestloop join
preserves sort order within the optimizer. It might not be that hard
to teach the optimizer to generate a plan where, when we know that
this has happened, and we know that the outer side is not unique, the
final plan hints to the executor to opportunistically cache every
lookup on the inner side.

This would make only the first lookup for each distinct value on the
outer side actually do an index scan on the inner side. I can imagine
the optimization saving certain queries from consuming a lot of memory
bandwidth, as well as saving them from pinning and locking the same
buffers repeatedly.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] effective_io_concurrency increasing

2017-06-19 Thread Peter Geoghegan
On Mon, Jun 19, 2017 at 3:25 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Surely you could prefetch all the heap pages pointed to by index items
> in the current leaf index page ...

I'm sure that you could do that too. I'm not sure how valuable each
prefetching optimization is.

I can imagine prefetching heap pages mattering a lot less for a
primary key index, where there is a strong preexisting correlation
between physical and logical order, while also mattering a lot more
than what I describe in other cases. I suppose that you need both.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] effective_io_concurrency increasing

2017-06-19 Thread Peter Geoghegan
On Mon, Jun 19, 2017 at 8:36 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> Unfortunately, it is only implemented in very narrow circumstances.  You
> have to be doing bitmap index scans of many widely scattered rows to make it
> useful.  I don't think that this is all that common of a situation.  The
> problem is that at every point in the scan, it has to be possible to know
> what data block it is going to want N iterations in the future, so you can
> inform the kernel to pre-fetch it.  That is only easy to know for bitmap
> scans.

I think that you could prefetch in index scans by using the
pointers/downlinks in the immediate parent page of the leaf page that
the index scan currently pins. The sibling pointer in the leaf itself
is no good for this, because there is only one block to prefetch
available at a time.

I think that this is the way index scan prefetch is normally
implemented. Index scans will on average have a much more random
access pattern than what is typical for bitmap heap scans, making this
optimization more compelling, so hopefully someone will get around to
this.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] all serial type was changed to 1

2017-05-01 Thread Peter Geoghegan
On Sun, Apr 30, 2017 at 10:51 PM, Max Wang <mw...@1080agile.com> wrote:
> We have a PostgreSQL database. There are 26 tables and we use serial type as
> primary key.  We had a insert error as “duplicate key value violates unique
> constraint, DETAIL:  Key (id)=(1) already exists.” one weeks ago. I checked
> and found all tables’ id were reset to 1.

I've heard of this happening before. I never determined what the cause was.


-- 
Peter Geoghegan

VMware vCenter Server
https://www.vmware.com/


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] UPDATE ... ON CONFLICT DO NOTHING

2017-03-14 Thread Peter Geoghegan
On Tue, Mar 14, 2017 at 4:55 AM, Alexander Farber
<alexander.far...@gmail.com> wrote:
> I wish there would be possibility to add ON CONFLICT DO NOTHING /* and drop
> the review */ to the both UPDATE's above, but there is no such thing
> described at https://www.postgresql.org/docs/9.5/static/sql-update.html

Heikki wanted to keep the door open to a feature like this around the
time that speculative insertion (the underlying mechanism used by
UPSERT) went in. I eventually convinced him that it is drastically
more complicated than it appears. This is essentially because it's
much harder to back out of an UPDATE than it is to back out of an
INSERT. If you're really interested, search through the -hackers
archives from around April of 2015.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Avoid sorting when doing an array_agg

2016-12-04 Thread Peter Geoghegan
On Sun, Dec 4, 2016 at 4:09 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Of course, we would also have to teach cost_sort or someplace near there
> that non-C sorting is much more expensive than C-collation sorting.  Not
> sure about exactly how to set that up without it being a kluge.

We've talked about that before, in the context of parallel query. At
the 2014 developer meeting, IIRC.

> A related problem is that if you have "GROUP BY x,y" and no particular
> ORDER BY requirement, you could sort by either x,y or y,x before the
> GroupAgg.  This would matter if, say, there was an index matching one
> but not the other.  Right now we're very stupid and only consider x,y,
> but if there were room to consider more than one set of target pathkeys
> it would be fairly simple to make that better.

That sounds valuable, especially because it seems natural to make the
leading group-on var the least selective within a GROUP BY; having a
matching index that you can thereby use might be less common than that
in practice, unless and until the partial sort patch is committed.

I will tend to write "GROUP BY country, province, city" -- never
"GROUP BY city, province, country". I speak a language that is written
left-to-right, but it would be the opposite way around in both
directions if I spoke a language written right-to-left, I bet. Same
difference. This might be a very prevalent habit. In general, a
tuplesort will be faster with a high cardinality leading attribute, so
this habit works against tuplesort. (Assuming a leading attribute of
pass-by-value type, or with abbreviated key support.)

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Avoid sorting when doing an array_agg

2016-12-04 Thread Peter Geoghegan
On Sat, Dec 3, 2016 at 5:20 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
> So the sort is probably slow because of CPU, as it compares strings. In
> some locales that may be very expensive - not sure which locale is used
> in this case, as it was not mentioned.

I wonder what it would take to teach the optimizer to consider the
possibility of a "collation strength reduction". In other words, for
aggregates that perform a sort (or for aggregates that rely on the
presence of a sort node without there being some other dependency on
the sort node), it should be possible for the optimizer to determine
that it would be just fine to use the C locale, since the user isn't
entitled to assume anything about the exact sort order. There are of
course cases where this can make a huge difference.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-26 Thread Peter Geoghegan
On Mon, Oct 24, 2016 at 8:07 AM, Kevin Grittner <kgri...@gmail.com> wrote:
> My initial thought is that since reducing the false positive rate
> would only help when there was a high rate of conflicts under the
> existing patch, and it would add code complexity and cost for the
> case where conflict rate is low, that we might want to just leave
> the current fix and see whether there are complaints from the field
> about the false positive rate.
>
> Reducing the rate of false positive serialization failures is a
> worthy goal, but it's gotta make sense from a cost/benefit
> perspective.

What are your thoughts on the back-and-forth between myself and Tom
concerning predicate locks within heap_fetch_tuple() path last
weekend? I now think that there might be an outstanding concern about
ON CONFLICT DO NOTHING + SSI here.


-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-13 Thread Peter Geoghegan
On Thu, Oct 13, 2016 at 6:19 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>> I was under the impression that false positives of this kind are
>> allowed by SSI. Why focus on this false positive scenario in
>> particular?
>
> Every situation that generates a false positive hurts performance;
> we went to great lengths to minimize those cases.  In addition, we
> made sure that at the point that a serialization failure is
> returned, that retrying the transaction from the start could not
> fail on the same combination of transactions, by ensuring that at
> least one transaction in the set had successfully committed, and
> that it was a transaction which had done writes.  To generate a
> serialization failure on a single transaction has to be considered
> a bug, because a retry *CAN NOT SUCCEED*!  This is likely to break
> many frameworks designed to work with serializable transactions.

It sounds like you're talking about the original complaint about a
multi-value INSERT. It took me a minute to decide that that's probably
what you meant, because everyone already agrees that that isn't okay
-- you don't need to convince me.

We must still determine if a fix along the lines of the one proposed
by Thomas is basically acceptable (that is, that it does not clearly
break any documented guarantees, even if it is overly strict).
Separately, I'd be interested in seeing how specifically we could do
better with the patch that you have in the works for this.

In general, I see value in reducing false positives, but I don't
understand why your concern here isn't just about preferring to keep
them to a minimum (doing our best). In other words, I don't understand
why these false positives are special, and I'm still not even clear on
whether you are actually arguing that they are special. (Except, of
course, the multi-value case -- that's clearly not okay.)

So, with the fix proposed by Thomas applied, will there be any
remaining false positives that are qualitatively different to existing
false positive cases? And, if so, how?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-12 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 6:06 PM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> But yeah, the existing code raises false positive serialization
> failures under SERIALIZABLE, and that's visible in the isolation test
> I posted: there is actually a serial order of those transactions with
> the same result.

I was under the impression that false positives of this kind are
allowed by SSI. Why focus on this false positive scenario in
particular?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-12 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 2:06 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> If the "proper" fix is impossible (or just too freaking ugly) we
> might fall back on the fix Thomas suggested, but I would like to
> take advantage of the "special properties" of the INSERT/ON
> CONFLICT DO NOTHING code to avoid false positives where we can.

Do you intend to propose a patch to do that?


-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-12 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 1:41 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> Aren't these two completely separate and independent bugs?

Technically they are, but they are both isolated to the same small
function. Surely it's better to fix them both at once?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-12 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 1:05 PM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> Here's a patch that shows one way to fix it.  I think it does make
> sense to change this, because otherwise automatic
> retry-on-serialization-failure strategies will be befuddle by this
> doomed transaction.  And as you and Vitaly have said, there is
> literally no concurrent update.

I think that you have the right idea, but we still need to fix that
buffer lock bug I mentioned...


-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SERIALIZABLE and INSERTs with multiple VALUES

2016-10-12 Thread Peter Geoghegan
On Wed, Oct 12, 2016 at 12:45 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>> The test in ExecCheckHeapTupleVisible() seems wrong to me.  It's
>> not immediately obvious what the proper fix is.

(prune old e-mail address from cc list)

I agree that the multi-value case is a bug.

> To identify what cases ExecCheckHeapTupleVisible() was meant to
> cover I commented out the body of the function to see which
> regression tests failed.  None did.  The failures shown on this
> thread are fixed by doing so.  If there is really a need for this
> function, speak up now and provide a test case showing what is
> broken without it; otherwise if I can't find some justification for
> this function I will rip it (and the calls to it) out of the code.
> If you do have some test case showing what breaks without the
> function, let's get it added to the regression tests!

I am independently working on a bug to fix to
ExecCheckHeapTupleVisible(), concerning the fact that
HeapTupleSatisfiesMVCC() can be called without even a shared buffer
lock held (only a buffer pin) [1]. So, anything that we do here should
probably take care of that, while we're at it.

I think that it should be pretty obvious to you why the check exists
at all, Kevin. It exists because it would be improper to decide to
take the DO NOTHING path on the basis of some other committed tuple
existing that is not visible to the original MVCC snapshot, at higher
isolation levels. There is a similar consideration for DO UPDATE. I'm
slightly surprised that you're contemplating just ripping the check
out. Did I miss something?

[1] https://www.postgresql.org/message-id/57ee93c8.8080...@postgrespro.ru
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?

2016-09-02 Thread Peter Geoghegan
On Fri, Sep 2, 2016 at 8:36 PM, dandl <da...@andl.org> wrote:
> The paper is substantially in agreement with the presentation I quoted. If 
> there are differences in detail, they certainly don't dominate his argument.

My point is that the paper is rather light on details of the kind that
are really important. And, that it's noteworthy that Stonebraker has
in the past, during presentations, emphasized the buffer lock
crabbing/latch coupling thing *at length*, even though it's a totally
solved problem.

It's also true that Postgres has become vastly more scalable in the
past few years due to optimization that doesn't change the fundamental
nature of the system at all, so it's very easy to imagine individual
differences being more important than differences between major
classes of system.

Those are facts. You may take from them what you will.

> IMO your claim is far weaker. What specifically do you say is wrong about his 
> current claims, and on what facts to you base it?

I'm not the one making overarching conclusions. I'm not trying to
convince you of anything.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What limits Postgres performance when the whole database lives in cache?

2016-09-02 Thread Peter Geoghegan
On Fri, Sep 2, 2016 at 10:32 AM, Andres Freund <and...@anarazel.de> wrote:
>
>> > I wondered if there are any figures or measurements on Postgres performance
>> > in this ‘enough memory’ environment to support or contest this point of
>> > view?
>
> I don't think that's really answerable without individual use-cases in
> mind.  Answering that question for analytics, operational, ... workloads
> is going to look different, and the overheads are elsewhere.
>
> I personally think that each implementations restrictions are more
> likely to be an issue than anything "fundamental".

+1

At one point, Stonebraker was regularly claiming that "crabbing" of
buffer locks in B-Trees was a fundamental overhead paid in systems
more or less based on System R. He did eventually start to acknowledge
that Lehman and Yao figured out a technique that made that untrue in
1981, if only barely [1], but the lesson for me was to take his claims
in this area with a generous pinch of salt.

[1] https://www.cs.cmu.edu/~pavlo/static/papers/stonebraker-ic2e2014.pdf
(See his citation 11)
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ON CONFLICT does not support deferrable unique constraints

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 12:29 AM, Francisco Olarte
<fola...@peoplecall.com> wrote:
> That been said, I'm not sure making it ( deferred constraint act like
> immediate ones during upserts ) work is even a good idea. If it can be
> conditionally enabled with a simple set and implemented in very few (
> < 20 ) lines of code, ok for me , otherwise I would prefer the reduced
> bug surface.

But UPSERT doesn't support a DEFERRABLE constraint, even when the
system is set to IMMEDIATE enforcement of constraints. I think that
it's not hard to see that UPSERT on a deferred (not just DEFERRABLE)
constraint doesn't make much sense, but that's not how I understood
the OP's complaint.

IMMEDIATE enforcement just makes the second pass for verification
occur immediately, rather than at the end of the transaction. That's a
very small difference, at least from an implementation perspective; in
particular, the constraint will not actually behave like (say) a
uniqueness constraint that was not declared DEFERRABLE in the first
place. Enforcement will not occur in the path of insertion, as it does
for B-Tree.

The reason for the broad restriction on DEFERRABLE constraints is that
it's not clear how the implementation of UPSERT should handle
*somebody else's* unconfirmed DEFERRABLE-mode insertion (this may
happen even if the upserter is forbidden from treating the constraint
as DEFERRABLE). I'm not saying that it's impossible, but it's far more
complicated than it first appears, if you expect UPSERT to worry about
lock starvation, "unprincipled deadlocks" [1], and other problems like
that.

[1] 
https://wiki.postgresql.org/wiki/Value_locking#.22Unprincipled_Deadlocking.22_and_value_locking
-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Database Architect - Voleon Capital Management LP

2016-07-14 Thread Peter Geoghegan
On Wed, Jul 13, 2016 at 1:39 PM, weitzer <char...@voleon.com> wrote:
> Database Architect
> Berkeley
> IT
> Full-time
>
> The Voleon Group is a technology-driven investment firm employing
> cutting-edge machine learning techniques seeks an exceptionally capable
> Database Architect. Your responsibilities will include making technical
> decisions with respect to relational database technologies; deploying,
> configuring and maintaining database systems; working with software
> engineers to work through performance problems and optimize database
> performance; and support database replication and backup processes.

This belongs on the pgsql-jobs mailing list, not pgsql-general.

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Upsert with a partial unique index constraint violation

2016-07-11 Thread Peter Geoghegan
On Mon, Jul 11, 2016 at 12:06 AM, Tim Dawborn <tim.dawb...@gmail.com> wrote:
> tmp=# INSERT INTO foo (a, b, c, d) VALUES (1, 2, 'four', true)
> tmp-# ON CONFLICT (a, b) WHERE d = true
> tmp-# DO UPDATE SET c = 'four' WHERE foo.a = 1 AND foo.b = 2 AND foo.d =
> true;
> ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
> specification
>
> If anyone knows what I'm doing wrong and how to get this to work, or knows
> that this is not possible to achieve, I'm all ears.

That should work. Are you sure you haven't spelled it "... WHERE d IS TRUE"?

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Proper relational database?

2016-04-21 Thread Peter Geoghegan
On Thu, Apr 21, 2016 at 1:36 PM, Guyren Howe <guy...@gmail.com> wrote:
> If I had a few $million to spend in a philanthropical manner, I would hire 
> some of the best PG devs to develop a proper relational database server. 
> Probably a query language that expressed the relational algebra in a 
> scheme-like syntax, and the storage model would be properly relational (eg no 
> duplicate rows).

Have you heard of QUEL?

See https://en.wikipedia.org/wiki/QUEL_query_languages

-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ON CONFLICT DO NOTHING RETURNING

2016-03-18 Thread Peter Geoghegan
On Fri, Mar 18, 2016 at 9:14 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> He wants to retrieve a value from the conflicting row.  Now getting
> the value that caused the conflict should be easy, because you
> provided it in the first place.   But he wants a value from a
> different column of the conflicting row than the column(s) on which
> there is conflict.  DO NOTHING RETURNING returns no rows.  Which is
> reasonable, because nothing was inserted.  But it isn't what he wants.

I see.

> I think the dummy update is his best bet, but it does seem like there
> should be a better way.  Maybe ON CONFLICT DO SELECT where the select
> operates over the target row.

Seems reasonable.


-- 
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with german locale on CentOS 5,6,7

2015-10-07 Thread Peter Geoghegan
On Wed, Oct 7, 2015 at 8:06 PM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
>> I think we should bite the bullet and adopt ICU, without abandoning
>> support for OS locales for users that really need it (certainly, many
>> will need it initially when using pg_upgrade to get on to the first
>> version that happens to have ICU support). I don't like suggesting a
>> solution that I myself am unlikely to find the time to work on, but in
>> the long run that's the only sensible approach IMV.
>
> How would you handle changes in ICU's collation definitions?

ICU provides an API for collation versioning because of these kinds of
issues with indexes:
http://userguide.icu-project.org/collation/architecture#TOC-Versioning

There are specifications of collations used by ICU that originate from
the Unicode CLDR Project: http://cldr.unicode.org/

Basically, you prevent this kind of thing from ever happening in the
first place by making versioning explicit, and putting it under the
direct control of Postgres. I think a bunch of well regarded database
systems have used ICU for many years, including DB2, for example. I
think it's possible to arrange it so that the collations simply never
go away, but if that does happen (or if you decide that the changes to
a collation matter for cultural or correctness reasons) then you can
at least detect the change and recover from it reliably.

ICU has some other really nice features, too, but that's another discussion.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with german locale on CentOS 5,6,7

2015-10-07 Thread Peter Geoghegan
On Wed, Oct 7, 2015 at 8:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 1. Being compatible with the operating system's collation behavior is a
> feature, not a bug.  If nothing else, it allows us to tell people that
> if we sort data the same way that sort(1) does, then it's not a bug that
> we're not sorting the way they think we should.  But quite aside from
> that, there are practical uses to being compatible with other tools.

I am not proposing to make that impossible.

> 2. Last I checked, ICU *only* supports Unicode, and not only that, but
> only UTF16.  This is a non-starter; not only for our Far Eastern users,
> but also those who find various LatinX encodings sufficient.  ICU would be
> a functional fail for the former and a performance fail for the latter.

UTF-16 is more efficient for representing East Asian languages, so not
sure what you mean about that. I realize that using UTF-16 is a
non-starter, though.

I guess you were talking about people who don't use Unicode due to the
Han Unification controversy. Again, I'm not proposing to only support
Unicode, but realistically the vast majority of users want Unicode,
even in East Asia.

Yes, ICU only supports Unicode, but it has supported UTF-8 for years
now, and not as a second class citizen. See
http://userguide.icu-project.org/strings/utf-8 . As it says there:

"""
If it is known that the default charset is always UTF-8 on the target
platform, then you should #define U_CHARSET_IS_UTF8 1 in or before
unicode/utypes.h. (For example, modify the default value there or pass
-DU_CHARSET_IS_UTF8=1 as a compiler flag.) This will change most of
the implementation code to use dedicated (simpler, faster) UTF-8 code
paths and avoid dependencies on the conversion framework. (Avoiding
such dependencies helps with statically linked libraries and may allow
the use of UCONFIG_NO_LEGACY_CONVERSION or even UCONFIG_NO_CONVERSION
[see unicode/uconfig.h].)

"""

> 3. As Thomas Munro already noted, whatcha gonna do when ICU changes their
> collations?  Or are their collations graven on stone tablets, unlike
> anyone else's?

See my response to Thomas.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with german locale on CentOS 5,6,7

2015-10-07 Thread Peter Geoghegan
On Wed, Oct 7, 2015 at 4:49 PM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
>> I agree that that would be almost as bad as carrying on, because there
>> is no reason to think that the locale thing can easily be rolled back.
>> That was my point, in fact.
>
> I have contemplated a maintenance script that would track either the
> md5 checksums of the /usr/lib/locale/*/LC_COLLATE files or the version
> of installed locale packages and automatically reindex things when
> they change (I guess after restarting the cluster to clear any glibc
> caches that might be lurking in long running backends).  Or at least
> tell me that's needed.  Obviously completely OS-specific...

I think we should bite the bullet and adopt ICU, without abandoning
support for OS locales for users that really need it (certainly, many
will need it initially when using pg_upgrade to get on to the first
version that happens to have ICU support). I don't like suggesting a
solution that I myself am unlikely to find the time to work on, but in
the long run that's the only sensible approach IMV.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with german locale on CentOS 5,6,7

2015-10-07 Thread Peter Geoghegan
On Wed, Oct 7, 2015 at 2:20 AM, Bernd Helmle <maili...@oopsware.de> wrote:
> The last day we've encountered an issue what i think is somewhat severe if
> you want to do either OS upgrades with CentOS or even binary upgrades with
> an existing PostgreSQL instance to a new machine with locale de_DE.UTF-8
> and thus i'd like to share here.

At the risk of getting flamed: I think that this is a bug in
PostgreSQL, not CentOS. I've said why I think that is at least once
already [1]. Simply put, there is no justification for the belief that
some people have that collations should be immutable, and there is
some justification for OS vendors updating them in a way that
disregards our use of the collations. Glibc actually versions
collations, although that isn't in any standard, and it isn't clear
that we can benefit from that beyond refusing to start the server when
a collation was superseded by a new version.

[1] 
http://www.postgresql.org/message-id/CAEYLb_UTMgM2V_pP7qnuKZYmTYXoym-zNYVbwoU79=tup8h...@mail.gmail.com
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Issues with german locale on CentOS 5,6,7

2015-10-07 Thread Peter Geoghegan
On Wed, Oct 7, 2015 at 6:25 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The only real way out of such a situation is to REINDEX affected indexes.
> Refusing to start the server not only doesn't contribute to a solution,
> but makes it impossible to fix manually.

I agree that that would be almost as bad as carrying on, because there
is no reason to think that the locale thing can easily be rolled back.
That was my point, in fact.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Sensitivity to drive failure?

2015-10-02 Thread Peter Geoghegan
I think it would be really handy if temp_tablespaces were made resilient
against everything going away in the event of a hard crash.

--
Regards,
Peter Geoghegan


Re: [GENERAL] Very puzzling sort behavior

2015-09-10 Thread Peter Geoghegan
On Thu, Sep 10, 2015 at 12:51 PM, Ken Tanzer <ken.tan...@gmail.com> wrote:
> OK, can one of you help me out in understanding this?  I would have thought 
> that given "CLARK," and "CLARKE" that the comma would get compared against 
> the E and come first.  End of story, before we even get to anything farther 
> in the string.  What am I missing?

That's only how it works with the C locale. Otherwise, there are
complicated rules to weigh things like space and punctuation (and
accents/diacritics) less prominently than primary alphabetical
ordering. This is often useful. Anyway, based on what you say here, I
think you should actually "ORDER BY name_last, name_first".

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Very puzzling sort behavior

2015-09-10 Thread Peter Geoghegan
On Thu, Sep 10, 2015 at 12:35 PM, Ken Tanzer <ken.tan...@gmail.com> wrote:
> Any thoughts about what's going on, what to do about it, or what obvious 
> point I missing?  Thanks in advance!

This is the expected behavior. Locale rules will weigh the punctuation
character and space you added after primary alphabetical ordering.
Therefore, the 'E' in 'CLARKE' is more or less compared with the first
character of the first name in the case that you show, because those
code points are both what Unicode calls "primary weights".

You're not displaying the string you're ordering by anyway, so you can
just not concatenate the ', ' string (so name_last || name_first), and
it will work as you expect, I believe.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird insert issue

2015-06-27 Thread Peter Geoghegan
On Sat, Jun 27, 2015 at 9:47 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 you can protect it against this issue with locking - in this case you can
 try for update clause

 http://www.postgresql.org/docs/9.4/static/explicit-locking.html

 insert into Favorite (patronId, titleId)
 select 123, 234
 where not exists (
   select 1 from Favorite where patronId = 123 and titleId = 234 for update
 )

That won't work reliably either -- a SELECT ... FOR UPDATE will still
use an MVCC snapshot. The looping + subxact pattern must be used [1]
if a duplicate violation isn't acceptable. ON CONFLICT DO UPDATE
should be preferred once 9.5 is released.

[1] 
http://www.postgresql.org/docs/9.4/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unicode license compatibility with PostgreSQL license

2015-03-19 Thread Peter Geoghegan
On Wed, Mar 18, 2015 at 11:03 PM, Haribabu Kommi
kommi.harib...@gmail.com wrote:
 For our next set of development activities in PostgreSQL, we want to
 use the Unicode organization code with PostgreSQL to open source that
 feature. Is the Unicode license is compatible with PostgreSQL.

Do you mean that you'd like to add ICU support? I think that would be
extremely interesting, FWIW. The stability of ICU collations would be
quite helpful from a number of different perspective. One of which is
that having a contract about the stability of strxfrm()-style binary
keys would allow me to make text abbreviated keys exploited in the
internal pages of B-Tree indexes, to greatly reduce cache misses with
index scans on text attributes. This general technique already been
very effective with sorting [1], but it feels likely that we'll need
ICU to make the abbreviation technique useful for indexes.

[1] 
http://pgeoghegan.blogspot.com/2015/01/abbreviated-keys-exploiting-locality-to.html
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Getting truncated queries from pg_stat_statements

2015-01-19 Thread Peter Geoghegan
On Mon, Jan 19, 2015 at 2:37 PM, Tim Uckun timuc...@gmail.com wrote:
 I am trying to get some slow query information and the results from my query
 are truncated at 2047 characters.   Some of my queries are very long so they
 get truncated in the select. Is there a way around this?


Yes. This is fixed in PostgreSQL 9.4, so that query texts can be of
virtually unlimited size and still be stored. Otherwise, it depends on
your track_activity_query_size setting.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird CPU utilization patterns with Postgres

2014-12-09 Thread Peter Geoghegan
On Fri, Dec 5, 2014 at 5:14 PM, István lecc...@gmail.com wrote:
 I am wondering why the root cause might be here.

My guess would be that an important text-based sort operation began to
go to disk. The external sort code (tapesort) is known to do far more
comparisons than quicksort. With text sorts, you tend to see tapesort
very CPU bound, where that might not be the case with integer sorts.

I'm currently trying to fix this across the board [1], but my first
suggestion is to try enabling log_temp_files to see if external sorts
can be correlated with these stalls.

[1] https://commitfest.postgresql.org/action/patch_view?id=1462
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weird CPU utilization patterns with Postgres

2014-12-09 Thread Peter Geoghegan
On Tue, Dec 9, 2014 at 5:46 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 I'm currently trying to fix this across the board [1], but my first
 suggestion is to try enabling log_temp_files to see if external sorts
 can be correlated with these stalls.

See also: 
http://www.postgresql.org/message-id/cam3swztijobppqff7mn3021vvtu+5fd1ymabq8tlov4zhfa...@mail.gmail.com

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] spgist index not getting used

2014-09-24 Thread Peter Geoghegan
On Wed, Sep 24, 2014 at 2:01 PM, Paul Ramsey pram...@cleverelephant.ca wrote:
 If I build an index on the same table using the internal quad-tree ops, and
 use their operator, I do get an index scan.


What about when enable_seqscan = off?

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unique index on embedded json object

2014-09-21 Thread Peter Geoghegan
On Sun, Sep 21, 2014 at 5:23 AM, Lee Jason j...@hotmail.com wrote:
 {id: 12345,
 bags: [{
 sku: abc123,
 price: 0,
 },
 {
 sku: abc123,
 price: 0,
 }]
 }

That's invalid JSON - there are stray commas.

 However, I want sku of bags to be unique.  It means the json can't be
 inserted into products tables because sku is not unique in this case.

Unique indexes are useful for ensuring that each row is unique based
on some criteria, possibly a quite complex expression. However, it's
not clear what behavior is expected here, since multiple SKUs may
appear per row. If you had a table with products, with a jsonb
column, and one row per product, you could then usefully extract at
most one SKU per row, and that could work fine. But the structure you
present isn't amenable to that.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 64-bit TXID?

2014-09-18 Thread Peter Geoghegan
On Thu, Sep 18, 2014 at 9:17 PM, cowwoc cow...@bbs.darktech.org wrote:
 I did a quick search and noticed that 8 years ago, a Skype employee provided
 a patch for migrating to a 64-bit TXID:
 http://postgresql.1045698.n5.nabble.com/RFC-txid-module-for-64-bit-external-transaction-IDs-tt1947503.html

Actually, this patch was supposed to:


The goal is to make PostgreSQL internal transaction ID and snapshot
data usable externally.  They cannot be used directly as the
internal 4-byte value wraps around and thus breaks indexing.


This stuff was committed to PostgreSQL in essentially the same form years ago:

http://www.postgresql.org/docs/devel/static/functions-info.html#FUNCTIONS-TXID-SNAPSHOT

It's just a way of getting a stable 64-bit xid value for external use,
by adding an epoch count to the representation. Nothing more.

The most obvious reason for not using 64-bit xid values is that they
require more storage than 32-bit values. There is a patch floating
around that makes it safe to not forcibly safety shutdown the server
where currently it is necessary, but it doesn't work by making xids
64-bit.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] jsonb and comparison operators

2014-09-02 Thread Peter Geoghegan
On Tue, Sep 2, 2014 at 9:38 PM, Joe Van Dyk j...@tanga.com wrote:
 I want to return all rows that have a value of less than 10. I have
 arbitrary keys I want to check (not just 'a').


If you created an expression B-Tree index on 'a' it would work for
'a', but you'd have to use a jsonb literal, not a json/int4 literal.
If you want to be able to query every key at the top nesting level of
an object, such that all rows are returned with jsonbs that have
object values of which in each case one of them is, say, below 10,
then that's something that no existing opclass can support. But, why
should it be supported? That's a very fuzzy criteria to search on.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how to query against nested hstore data type

2014-08-24 Thread Peter Geoghegan
On Sun, Aug 24, 2014 at 7:05 PM, Huang, Suya suya.hu...@au.experian.com wrote:
 It’s my test DB, the version is PostgreSQL 9.4beta2. I’m testing with nested
 hstore feature.

Really? Nested hstore only made it into that version as jsonb, which
you're clearly not using here.


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JsonB Gin Index is very large; is there a work around?

2014-08-20 Thread Peter Geoghegan
On Wed, Aug 20, 2014 at 1:53 PM, Larry White ljw1...@gmail.com wrote:
 Is there anyway to index a subset of the data in a JSONB column? I'm
 thinking of something like declaring certain paths to be indexed?


Yes. See the expression index example in the jsonb documentation.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The dangers of streaming across versions of glibc: A cautionary tale

2014-08-07 Thread Peter Geoghegan
On Thu, Aug 7, 2014 at 9:46 AM, Bruce Momjian br...@momjian.us wrote:
 We could walk the index looking for inconsistent btree splits, e.g. the
 split doesn't match the ordering returned by the existing collation
 functions.

I'm not sure I follow. I don't think that a tool like my btreecheck
tool will necessarily be able to catch anything like this on a
standby. Maybe it will, but that isn't guaranteed. For example, the
difference in collation rules in question might just not have cropped
up yet, but it's still a ticking time-bomb. Or, there are only
differences affecting values on internal pages. Things break down very
quickly.

In general, once there is an undetected inconsistency in collations
between replicas, that means that the varlena B-Tree support function
number 1 can violate various invariants that all operator classes must
obey. I doubt we want to get into the business of working backwards
from a broken state of affairs like that to figure out there is a
problem. Rather, I really do think we're compelled to offer better
versioning of collations using a versioning interface like Glibc's
LC_IDENTIFICATION. There is no way other way to properly fix the
problem. This is a problem that is well understood, and anticipated by
the Unicode consortium.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The dangers of streaming across versions of glibc: A cautionary tale

2014-08-06 Thread Peter Geoghegan
On Wed, Aug 6, 2014 at 5:11 PM, Bruce Momjian br...@momjian.us wrote:
 No surprise;  I have been expecting to hear about such breakage, and am
 surprised we hear about it so rarely.  We really have no way of testing
 for breakage either.  :-(

I guess that Trip Advisor were using some particular collation that
had a chance of changing. Sorting rules for English text (so, say,
en_US.UTF-8) are highly unlikely to change. That might be much less
true for other locales.

Unicode Technical Standard #10 states:


Collation order is not fixed.

Over time, collation order will vary: there may be fixes needed as
more information becomes available about languages; there may be new
government or industry standards for the language that require
changes; and finally, new characters added to the Unicode Standard
will interleave with the previously-defined ones. This means that
collations must be carefully versioned.


So, the reality is that we only have ourselves to blame.  :-(

LC_IDENTIFICATION serves this purpose on glibc. Here is what en_US
looks like on my machine:


escape_char /
comment_char %
% Locale for English locale in the USA
% Contributed by Ulrich Drepper drep...@redhat.com, 2000

LC_IDENTIFICATION
title  English locale for the USA
source Free Software Foundation, Inc.
address59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
contact
email  bug-glibc-loca...@gnu.org
tel
fax
language   English
territory  USA
revision   1.0
date   2000-06-24
%
category  en_US:2000;LC_IDENTIFICATION
category  en_US:2000;LC_CTYPE
category  en_US:2000;LC_COLLATE
category  en_US:2000;LC_TIME
category  en_US:2000;LC_NUMERIC
category  en_US:2000;LC_MONETARY
category  en_US:2000;LC_MESSAGES
category  en_US:2000;LC_PAPER
category  en_US:2000;LC_NAME
category  en_US:2000;LC_ADDRESS
category  en_US:2000;LC_TELEPHONE
*** SNIP ***


This is a GNU extension [1]. If the OS adds a new version of a
collation, that probably accidentally works a lot of the time, because
the collation rule added or removed was fairly esoteric anyway, such
is the nature of these things. If it was something that came up a lot,
it would surely have been settled by standardization years ago.

If OS vendors are not going to give us a standard API for versioning,
we're hosed. I thought about suggesting that we hash a strxfrm() blob
for about 2 minutes, before realizing that that's a stupid idea. Glibc
would be a good start.

[1] 
https://www.gnu.org/software/autoconf/manual/autoconf-2.63/html_node/Special-Shell-Variables.html
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The dangers of streaming across versions of glibc: A cautionary tale

2014-08-06 Thread Peter Geoghegan
On Wed, Aug 6, 2014 at 6:30 PM, Tatsuo Ishii is...@postgresql.org wrote:
 Another idea could be having our own collation data to isolate any
 changes from outside world. I vaguley recall this had been discussed
 before.

That's probably the best solution. It would not be the first time that
we decided to stop relying on the operating system's facilities due to
various problems (e.g. we used to use the C standard library qsort()
until about 2006). The only problem is that it's a lot of work. One
possible solution that has been proposed is to adopt ICU [1]. That
might allow us to say this is the official way that PostgreSQL 9.6
sorts Japanese; you may use the old way if you want, but it's
incompatible with the new way. ICU would give us a standard
versioning interface [2]. They seem to take this seriously, and are
aware of our considerations around B-Tree indexes on text.

[1] https://wiki.postgresql.org/wiki/Todo:ICU
[2] http://userguide.icu-project.org/collation/architecture#TOC-Versioning
-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What query currently running within function

2014-07-23 Thread Peter Geoghegan
On Tue, Jul 22, 2014 at 2:45 AM, Guillaume Lelarge
guilla...@lelarge.info wrote:
 Unfortunately, no. Even with the latest release. pg_stat_activity shows you
 what the client fires, not what the server does.


pg_stat_statements has a track GUC which controls whether or not
nested statements, such as statements executed within functions are
tracked.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Xlogdump compiling error : undefined reference to `ber_sockbuf_io_udp'

2014-02-09 Thread Peter Geoghegan
On Sun, Feb 9, 2014 at 10:02 PM, Adarsh Sharma eddy.ada...@gmail.com wrote:
 I need to use xlogdump to analyse my xlog files in postgresql to understand
 why there are so much xlog generation on Master.


Only PostgreSQL is supported. EDB's product is not.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is wal_writer_delay limited to 10s?

2014-02-01 Thread Peter Geoghegan
On Sat, Feb 1, 2014 at 3:40 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
 Something like wal_writer_delay=600s would be ideal, I can afford to
 loose a 10min of data, but can't afford to get a corrupted database
 after power loss.

What Postgres version? The WAL Writer will hibernate on Postgres 9.2+.
walwriter.c says:

/*
 * Number of do-nothing loops before lengthening the delay time, and the
 * multiplier to apply to WalWriterDelay when we do decide to hibernate.
 * (Perhaps these need to be configurable?)
 */
#define LOOPS_UNTIL_HIBERNATE   50
#define HIBERNATE_FACTOR25


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is wal_writer_delay limited to 10s?

2014-02-01 Thread Peter Geoghegan
On Sat, Feb 1, 2014 at 3:48 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 What Postgres version? The WAL Writer will hibernate on Postgres 9.2+.

Of course, it will still write things out at a steady rate if they're
there to write. However, there is no reason to think you couldn't
change things in guc.c to have the upper bound increased. You might
even propose a patch to do so.


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is wal_writer_delay limited to 10s?

2014-02-01 Thread Peter Geoghegan
On Sat, Feb 1, 2014 at 10:49 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
 My question on the list was merely to make sure there are no
 side-effects when increasing this delay above what seems to be
 considered safe limits. However, I still wonder why this parameter is
 capped to 10s and whether this restriction could be lifted in future
 postgresql versions?

I don't think there's any practical reason, other than that it was
assumed that increasing it further was not useful. There is perhaps a
tendency to set GUC limits as high as seems reasonable without
consider niche use-cases such as yours. If you want to hack it to go
higher it should be fine, provided that WalWriterDelay *
HIBERNATE_FACTOR cannot ever overflow a 32-bit signed integer. But
since those are milliseconds and not microseconds, it seems pretty
safe. This applies to 9.2+ only. I didn't check what things look like
back when the delay was passed to pg_usleep(), which was the case in
9.1.


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] what checksum algo?

2013-11-13 Thread Peter Geoghegan
On Wed, Nov 13, 2013 at 4:39 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 CRC16 is used.

Actually, subsequently another algorithm was introduced - see commit
43e7a668499b8a69a62cc539a0fbe6983384339c .


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] what checksum algo?

2013-11-13 Thread Peter Geoghegan
On Wed, Nov 13, 2013 at 5:53 PM, Tatsuo Ishii is...@postgresql.org wrote:
 It was good to see you in Japan.

Likewise.

 PostgreSQL Enterprise Consortium (non profit PostgreSQL related
 organization in Japan. http://www.pgecons.org) is about to inspect the
 performance impact of the checksum using High-end PC server (real 80
 cores with 2TB memory). What in my mind is using pgbench with custom
 query (purely SELECT). Is there any recommendations/suggestions in
 doing that?

 (The result will be in public of course).

Well, off the top of my head I would of course be sure to build
Postgres to take advantage of this:

 * Vectorization of the algorithm requires 32bit x 32bit - 32bit integer
 * multiplication instruction. As of 2013 the corresponding instruction is
 * available on x86 SSE4.1 extensions (pmulld) and ARM NEON (vmul.i32).
 * Vectorization requires a compiler to do the vectorization for us. For recent
 * GCC versions the flags -msse4.1 -funroll-loops -ftree-vectorize are enough
 * to achieve vectorization.

Unfortunately I have no idea what packagers are currently doing about
this. Could you please enlighten me, Devrim?

It also occurs to me that pgbench will be pretty unsympathetic to
checksums as compared to a non-checksummed baseline here, because of
course as always it uses a uniform distribution, and that's going to
literally maximize the amount of verification that must occur. Maybe
that's something you're interested in, because you want to
characterize the worst case. If the average case is more interesting,
you could try applying this patch:

https://commitfest.postgresql.org/action/patch_view?id=1240

I don't know if the patch is any good, having not looked at the code,
but surely as the original author of pgbench you are eminently
qualified to judge this. I think that in general I prefer a uniform
distribution, because most often I look to pgbench to satisfy myself
that certain types of regressions have not occurred. That's quite a
different thing to a representative workload, obviously.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Call for design: PostgreSQL mugs

2013-09-06 Thread Peter Geoghegan
On Thu, Sep 5, 2013 at 8:22 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I'm still partial to this guy:

 http://www.commandprompt.com/images/mammoth_versus_dolphin_500.jpg


I dislike that image, and always have. Apart from risking alienating
people, I think it sends the wrong message - that we define ourselves
in terms of what others do and don't do. I hope that isn't true, but
if it is I think that MySQL is much less relevant now.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Call for design: PostgreSQL mugs

2013-09-03 Thread Peter Geoghegan
On Tue, Sep 3, 2013 at 3:08 PM, Andreas 'ads' Scherbaum
adsm...@wars-nicht.de wrote:
 We are looking for the next big thing. Actually, it's a bit smaller: a new
 design for mugs. So far we had big blue elephants, small blue elephants,
 frosty elephants, white SQL code on black mugs ... Now it's time to design
 something new.

Maybe this is a silly thing to say, but no more frosted glass cups,
please. They tended to quickly stain when I drank tea from them, in a
way that was particularly hard to clean. The black psql cups, on the
other hand, are generally really nice to drink from.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostrgreSQL Commercial restrictions?

2013-08-07 Thread Peter Geoghegan
On Wed, Aug 7, 2013 at 8:23 AM, Eliseo Viola eliseo.vi...@vodemia.com wrote:
 I have been reading the -http://opensource.org/licenses/postgresql- to know
 if i can use PostgreSQL in a Privative Closed Commercial Software  (The
 worst of the worst :P ). in the company where i work.
 Is there any restriction or limit to do it.?

I am not a lawyer and this isn't legal advice. That said, I have ever
reason to believe that the licensing is the most liberal practically
possible. Distributing Postgres with your proprietary application
should not be a problem.


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Read data from WAL

2013-07-15 Thread Peter Geoghegan
On Mon, Jul 15, 2013 at 1:45 PM, Baldur Þór Emilsson bal...@baldur.biz wrote:
 Are there any projects or standard procedures for reading the data from the
 WAL to get a change log for the database (or without the WAL, using some
 other method)? I have searched for information about this quite thoroughly
 without luck, so I thought I'd try asking here before I started to patch
 Postgres :)

I think that pg_xlogdump is what you're looking for:

http://www.postgresql.org/docs/9.3/static/pgxlogdump.html

It only became available with Postgres 9.3. On prior versions, you can use:

https://github.com/snaga/xlogdump

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can't increase shared_buffers for PostgreSQL on openSUSE 12.3

2013-06-14 Thread Peter Geoghegan
On Fri, Jun 14, 2013 at 11:55 AM, Andreas maps...@gmx.net wrote:
 How can I get more memory for PG on openSUSE 12.3 ?

http://www.postgresql.org/docs/9.2/static/kernel-resources.html


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Age of the WAL?

2013-03-26 Thread Peter Geoghegan
On 26 March 2013 22:21, Tom Lane t...@sss.pgh.pa.us wrote:
 The version recently added to contrib is only meant to work with the
 current server release, AFAIK.  However, it's derived from older
 standalone programs that are out there somewhere --- did you look around
 on pgfoundry?

Actually, I think the version on pgfoundry is unmainted.

I'd look here instead:

https://github.com/snaga/xlogdump/commits/master


-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there any plugin for Nagios, which sends Postgresql ERROR information with SQL query in mail?

2013-01-02 Thread Peter Geoghegan
On 2 January 2013 16:55, AI Rumman rumman...@gmail.com wrote:
 Is there any plugin for Nagios, which sends Postgresql ERROR information
 with SQL query in mail?

It has nothing to do with Nagious, but tail_n_mail may do what you want here.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What happens to a primary key b-tree index when a table tuple is deleted by a transaction?

2012-11-25 Thread Peter Geoghegan
On 25 November 2012 18:03, Luby Liao lubyl...@gmail.com wrote:
 If the b-tree changes for the transaction, would it not become broken for
 other transactions?
 Can anyone  tell me how Postgres handles this?  Thank you, Luby

Unfortunately, that book is a little out of date.

Even with a unique index, there can simultaneously be duplicate row
versions (I emphasize the distinction between logical rows and
physical row versions), provided that no two duplicate values are
simultaneously visible to a snapshot - they cannot exist at the same
time. MVCC doesn't modify rows in place; in creates new row
versions. So, just as with tables, btree indexes will have multiple
row versions for the same logical row.

There is one notable exception to this, though. There was an
optimisation added to PostgreSQL 8.3 called HOT, or heap-only tuples.
This optimisation allows Postgres to use clever tricks to get away
with only having a new row version in tables, and not in each index,
if and only if the UPDATE statement only affects non-indexed columns.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Experiences with pl/Java

2012-11-19 Thread Peter Geoghegan
On 19 November 2012 08:02, Thomas Hill thomas.k.h...@t-online.de wrote:
 was wondering if there is anyone wanted to share some experiences gained and
 some knowledge on pl/Java. Have looked into it for a couple of days now and
 am getting the impression it is not something ready to use in production
 environment. Also have trouble sending to the developer mailing list (the
 last email on the mail archive of that list is almost three weeks old) which
 raises questions on how active this project is.

I'd caution you against using pl/java in production. I came up against
a problem with it that I could not find a satisfactory solution for.

Essentially, each Postgres backend (connection process) must start its
own JVM, and does so using the JNI. If native memory runs out (I
realise that this may not actually be due to malloc() returning NULL,
but the effect is about the same), this failure is handled rather
poorly. It results in an OutOfMemoryError due to native memory
exhaustion. This results in a segfault of the Postgres backend,
originating from within libjvm.so. There is a workaround - which is to
set the maximum JVM heap size to a sufficiently low value - but in
general the need to do so left me with a very low opinion of pl/java
as a project.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Maintaining state across function calls

2012-11-19 Thread Peter Geoghegan
On 20 November 2012 01:30, Craig Ringer cr...@2ndquadrant.com wrote:
 Otherwise you'll have to translate error handling mechanisms at every
 boundary between C++ and Pg code, something I'm not even certain is
 possible to do reliably.

I think it's probably the case that PLV8 is the most mature example of
wrapping a C++ library that is liable to throw C++ exceptions within
Postgres backend code, in a sane way (that is, avoiding unwinding the
stack via longjmp() over a part of the stack where a destructor needs
to be called, which is undefined in C++).

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fuzzystrmatch contrib module on RHEL63

2012-11-18 Thread Peter Geoghegan
On 19 November 2012 02:13, Ike Nnabugwu nnabu...@gmail.com wrote:
 I am building an rpm package fuzzystrmatch to be installed on RHEL63 but  I
 do not know  where to get the modules for it.I will appreciate any pointers
 that will assist with this task.

Do you mean that you need to install fuzzystrmatch? I think that you
should be able to install it by installing the package
postgresql-contrib or similar.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pg isolation levels: 3 or 2?

2012-11-03 Thread Peter Geoghegan
On 3 November 2012 23:10, Thalis Kalfigkopoulos tkalf...@gmail.com wrote:
 How is that three levels and not two? Read Uncommitted and Read Commited are
 the same. And Repeatable Reads don't allow phantom reads thus making them
 effectively the same as Serializable. No?

They're only equivalent to the extent that the SQL standard describes
the isolation levels (in terms of various anomalies that can or cannot
occur, including phantom reads). However, the SQL standard has nothing
to say about write-skew anomalies, which can introduce errors that are
not possible with actually serially executing transactions. The SQL
standard and every implementation other than Postgres don't completely
plug the leaks in the illusion of serial behaviour with snapshot
isolation/Postgres repeatable reads. The Wikipedia article on Snapshot
Isolation [1] may be informative here.

[1] http://en.wikipedia.org/wiki/Snapshot_isolation

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why BgWriterDelay is fixed?

2012-10-29 Thread Peter Geoghegan
On 29 October 2012 07:17, 高健 luckyjack...@gmail.com wrote:
 But I also found  the following in postgresql.conf:
 #bgwriter_delay = 200ms # 10-1ms between rounds
 It is now comment .
 But according to the fixed code of  BgWriterDelay = 200, even when I update
 bgwriter_delay in postgresql.conf to a different value(eg 300ms),
 how can it ovewrite the fixed  200ms in bgwriter.c ?

That value is just a default, that we initialise the variable to as a
sort of a reminder. We do this all over the place. Setting
bgwriter_delay downwards will alter the frequency of iterations.

The WaitLatch() thing is purely concerned with keeping the number of
wake-ups low. As long as your database server does not idle, the logic
is effectively exactly the same as what you'll see in 9.1, where the
event loop is much simpler, and merely sleeps exactly bgwriter_delay
ms every iteration.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Access to postgresql query optimizer output

2012-10-29 Thread Peter Geoghegan
On 29 October 2012 16:00, Jeff Janes jeff.ja...@gmail.com wrote:
 The re-writing does not occur at the level of SQL, but rather at the
 level of an internal representation parse tree.

Right. The post parse-analysis tree (i.e. the Query struct) is
potentially rewritten by an explicit rewrite stage after
parse-analysis but immediately before planning. No rewriting of plans
actually occurs. This is part of the rules subsystem, which is
generally considered to be a foot-gun. It *is* mostly just as if the
query had been silently rewritten, and had its SQL magically changed.

Craig mentioned that two distinct queries can produce the same query
plan. That's true, if a little academic. I guess the best example of
that is with join syntax. Technically, these could be two distinct
queries, in that the query trees would be substantively different
(according to pg_stat_statements own definition of that, which is
essentially that they're not bitwise identical in respect of their
nodes' substantive fields):

select * from foo f join bar b on f.bid = b.bid;

select * from foo f, bar b where f.bid = b.bid;

However, after planning, they could indeed have identical plans.
Still, those plans don't relate to some alternative query to the one
provided, assuming rule expansion didn't occur, and if rule expansion
did occur, I think you could just pretend that you'd written the SQL
that way in the first place and it would work just the same.

There might be some other instances where two distinct queries have
identical plan tree structures. Consider these two:

select upper(lower('text'));
select upper(upper('text'));

So what actually occurs is that the underlying pg_proc accessible
functions (upper() and lower()) are actually executed in
preprocess_expression() during planning, not execution proper. All
that appears within the plan tree is a simple Const node in each case,
so in that sense the plans are identical. Again, this is a little bit
academic for the most part.

I highly recommend pretending that the rules system doesn't exist.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Game Server Lags

2012-10-02 Thread Peter Geoghegan
On 2 October 2012 15:02, Arvind Singh arvin...@hotmail.com wrote:
 we are all aware of the popular trend of MMO games. where players face each
 other live.
 My questions are focussed on reducing load on Game database or Sql queries

 a) How to control the surge of records into the GameProgress table. so that
 players get response quicker. The Server starts to lag at peak hours or when
 1000 players are online

It sounds like this data is not that valuable, so a small window of
data loss in the event of a crash might be acceptable for a gain in
performance. If that is the case, consider turning off
synchronous_commit. Note that this is something that you can do at the
transaction granularity, so, for example, if there are financial
transactions, they need not make this trade-off.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres error when adding new page

2012-10-01 Thread Peter Geoghegan
On 1 October 2012 14:47, Marco Craveiro marco.crave...@gmail.com wrote:
 Hello Postgres general

 We're experiencing a lot of errors when using CDash on PostgreSQL 9.1,
 hosted on Mac OSX 10.6.8. The actual error message is as follows:

 SQL error in Cannot insert test:
 utility/asserter/assert_file_returns_true_for_empty_files into the
 database():ERROR:  failed to add old item to the right sibling while
 splitting block 191 of index crc323br

A call to PageAddItem(), made within _bt_pgaddtup(), is where this
failure seems to ultimately originate from. What we're missing here is
the reason for PageAddItem() returning InvalidOffsetNumber. That is
usually, though not necessarily, separately available within a WARNING
log message, which you haven't included here. Could you please let us
know if there is a WARNING that you didn't include just prior to the
ERROR?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The semantics of (NULL,NULL) vs NULL

2012-08-02 Thread Peter Geoghegan
On 2 August 2012 17:47, Ingmar Brouns swi...@gmail.com wrote:
 As I noted that (null,null) is null, I thought why put (null,null) in an
 array when that is the same as putting null in there.
 However, when trying to unnest that array I got an error when using null
 instead of the tuple. I experimented a bit, and
 read the documentation on row and array comparison, but I could not find
 documentation explaining the results below.

 create type int_tuple as (a int, b int);
 CREATE TYPE
 =# select (null,null)::int_tuple is null;
  ?column?
 --
  t
 (1 row)

Without commenting on the specifics of your test-case, I don't think
any thorough reading of the SQL standard would leave the reader with
the impression that the behaviour of SQL NULL is consistent with some
simple axiom that can be generalised from. I found this blogpost to be
insightful:

http://thoughts.j-davis.com/2009/08/02/what-is-the-deal-with-nulls/

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] I cannot insert bengali character in UTF8

2012-07-20 Thread Peter Geoghegan
On 20 July 2012 11:30, AI Rumman rumman...@gmail.com wrote:
 I am using database with UTF8 and LC_CTYPE set as default value in
 Postgresql 9.1.
 But I cannot insert bengali character in a column.

 Query Failed:INSERT into tracker (user_id, module_name, item_id,
 item_summary) values ('1','Leads','353','বাংলা টেস্��...')::ADODB
 error::-ERROR: invalid byte sequence for encoding UTF8: 0xe0a62e

 Item_summary is a text type column and we can insert japanese character in
 this field.

 Could anybody let me know what is the problem here?

Maybe they're not valid Bengali characters? Did you do an
encoding-naive truncation at some point?

My mail client cannot display the latter few characters before the
ellipsis, but can display the first few fine.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help me follow...

2012-07-14 Thread Peter Geoghegan
On 14 July 2012 16:57, Perry Smith pedz...@gmail.com wrote:
 I hit bug 5665.  I see threads like this:

 http://archives.postgresql.org/pgsql-bugs/2011-01/msg00241.php

 Where / how can I find the diff that went in to fix this?  Mostly I'm 
 curious.  I can work around the problem but wanted to create a patch file for 
 myself to make deploying to other servers easier.

You can search through commits here:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=summary

Sometimes a reference to the bug number may appear within the commit
message, but this won't always happen. If you can't find the commit
based on that, try the original reporter's name. We ordinarily credit
bug reporters in commit messages that fix the bug.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Help me follow...

2012-07-14 Thread Peter Geoghegan
On 14 July 2012 18:56, Tom Lane t...@sss.pgh.pa.us wrote:
 I can no longer reproduce a failure when building out-of-tree from a
 tarball on my Fedora 16 box.  That's still make 3.82, but Fedora is
 carrying half a dozen patches to the upstream tarball:
 http://pkgs.fedoraproject.org/gitweb/?p=make.git;a=tree;h=refs/heads/f16;hb=f16
 It's not immediately obvious which of those might be related, but
 I think this was a bug in gmake and the fix is in there somewhere.

Agreed. Since Fedora 14 is now EOL, I'm not inclined to worry about it.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem while restoring a database from SQL_ASCII to UTF-8

2012-06-06 Thread Peter Geoghegan
On 6 June 2012 14:12, Adrian Klaver adrian.kla...@gmail.com wrote:
 SQL_ASCII is not an encoding, it is basically a way of saying encoding
 ignored. Given that, outside knowledge of the encoding used for data
 inserted into the database is needed to make the conversion to UTF8. In
 other words do you have any idea of what encoding was used by the clients
 that supplied data to the database?

You're going to have to manage this process carefully. I believe the
best tool for the job is pgloader:

http://pgfoundry.org/projects/pgloader/

It's certainly not inconceivable that this sort of tool could
automagically deduce encoding (albeit often inconclusively) in much
the same manor as modern web browsers attempt to when they have no
alternative. I don't see much demand for that though.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [pgeu-general] FOSDEM booth volunteer

2012-01-29 Thread Peter Geoghegan
Hi Ads,

On 29 January 2012 19:33, Andreas 'ads' Scherbaum adsm...@wars-nicht.de wrote:
 if you are attending FOSDEM next week in Brussels and are willing to help on
 the PostgreSQL booth, please respond to me directly.

 We are planning to print badges for everyone. In your reply, please include
 any (human) language which you are familiar with. There will be a small icon
 with the language on your badge.

I expect to be manning the booth. I speak English.

See you there
-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] any plans to support more rounding methods in sql?

2012-01-24 Thread Peter Geoghegan
On 25 January 2012 05:41, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 2012/1/25 raf r...@raf.org:
 hi,

 i just needed to round some numbers down to 4 decimal places but a quick 
 search
 indicated that postgresql doesn't support all of the rounding methods so i 
 had
 to write this dreadful function:

Are you talking about always rounding down to the lower
smallest-possible-increment, rather than following standard rules for
rounding? That isn't such an esoteric use case - I believe that
financial regulations in some jurisdictions require just that when
calculating interest, for example.

If you require exactly 4 digits of precision, it's possible to use this syntax:

NUMERIC(precision, scale)

That isn't going to affect the rounding behaviour though.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is Synchronous Postgresql Replication Slower Than Asynchronous?

2012-01-20 Thread Peter Geoghegan
On 20 January 2012 21:17, John R Pierce pie...@hogranch.com wrote:
 On 01/20/12 12:31 PM, Jerry Richards wrote:

 Is synchronous postgresql replication slower than asynchronous?  If so,
 how much?  I am looking into database replication for a phone system, so the
 response time is of concern.


 when a client issues a COMMIT on the master, synchronous by definition has
 to wait for the replication to be committed on the slave.  this will of
 course be slower, how much slower depends on far too many variables to give
 an estimate, including things like speed of both servers disk systems, speed
 of the network communications, data volume and transaction rate.

Synchronous replication is, by its very nature, slower than
asynchronous replication, and that has nothing to do with the
PostgreSQL implementation in particular. The whole point of using it
is to ensure that transactions have an additional level of durability.
The master must wait for confirmation from the standby before
indicating the transaction committed, so latency/distance becomes a
real concern.

Uniquely, the PostgreSQL implementation doesn't require that all
transactions be synchronous or asynchronous; you can dynamically
change that right down to the transaction level (or the client, or the
database...). You might like to make really important transactions
alone be replicated synchronously for super-durability. Many
applications that use this feature only use it for a small though
critical subset of transactions, like financial transactions.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is Synchronous Postgresql Replication Slower Than Asynchronous?

2012-01-20 Thread Peter Geoghegan
On 20 January 2012 21:53, Jerry Richards jerry.richa...@teotech.com wrote:
 I noticed there are several synchronous implementations (Postgre-XC, 
 PGCluster, pgpool, rubyrep, built-in streaming, etc.).  When you say,  you 
 can dynamically change that right down to the transaction level..., are you 
 referring specifically to one of these implementations?

I refer specifically to the synchronous replication feature that was
introduced into PostgreSQL itself, in the current 9.1 release:

http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ESET NOD32 Antivirus interference with PostgreSQL

2012-01-20 Thread Peter Geoghegan
On 21 January 2012 00:45, Bruce Duncan bdun...@visualmining.com wrote:
 Thought this might be of help to anyone else out there who comes across this
 AV software during deployment and starts encountering strange behavior.

Thanks for the report, but shouldn't you really be complaining to the
anti-virus vendor?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is the a magic number for WAL files

2011-12-09 Thread Peter Geoghegan
On 9 December 2011 18:46, Rob Sargent robjsarg...@gmail.com wrote:
 I would like the file command to tell me something other than data,
 yes even though I can tell by the name (and the directory of course).
 Hoping someone has something I can slip into /usr/share/misc/magic.mgc
 or that directory.

You mean something like this?:

/*
 * Each page of XLOG file has a header like this:
 */
#define XLOG_PAGE_MAGIC 0xD068  /* can be used as WAL version indicator */

Obviously that isn't stable.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


  1   2   3   >