Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-07 Thread Tom Lane
Brendan Jurd  writes:
> On 7 April 2013 01:43, Kevin Grittner  wrote:
>> Your interpretation matches mine all around.  It is unfortunate
>> that we have hijacked the standard's syntax for arrays to add a
>> matrix feature.

> It really is unfortunate.  I wonder if it was done in an attempt to
> mimic Oracle behaviour.

Hardly likely.  That code goes back to Berkeley days (PostQUEL) ---
there is clear ancestry from the array code in Postgres v4r2 released
June 1994.  It's more or less a coincidence that it matches the SQL spec
at all, and I'd be astonished if it matched Oracle particularly closely.

> On the specific issue of CARDINALITY, I guess we need to decide
> whether we are going to pretend that our array/matrix thing is
> actually nested.  I first argued that we should not.  But it occurred
> to me that if we do pretend, it would at least leave the door ajar if
> we want to do something to make our arrays more nest-like in future,
> without disrupting the behaviour of CARDINALITY.

This seems to be exactly the same uncertainty that we couldn't resolve
back in the 8.4 devel cycle, for exactly the same reasons.  I don't see
that the discussion has moved forward any :-(

regards, tom lane




Re: [HACKERS] WIP: index support for regexp search

2013-04-07 Thread Tom Lane
Alexander Korotkov  writes:
> [ trgm-regexp-0.15.patch.gz ]

I spent the weekend hacking on this, making a number of bug fixes and a
whole lot of cosmetic changes.  I think there are large parts of this
that are in committable shape now, but I still find the actual graph
transformation logic to be mostly unintelligible.  I think what's most
obscure is the distinction between the arcs list and the keys list of
each state in the expanded graph.  I get the impression that the
general idea is for the arcs to represent exactly-known transitions
while the keys represent imprecisely-known transitions ... but there
seems to be at least some leakage between those categories.  Could
you write down a specification for what's supposed to be happening
there?

regards, tom lane



Attachment: trgm-regexp-0.16.patch.gz



Re: [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-07 Thread Brendan Jurd
On 7 April 2013 01:43, Kevin Grittner  wrote:
> Brendan Jurd  wrote:
>> Indeed it does not prohibit nesting arrays inside other arrays, but
>> the multidim arrays that Postgres allows you to create are not the
>> same thing as nested arrays.
>>
> Your interpretation matches mine all around.  It is unfortunate
> that we have hijacked the standard's syntax for arrays to add a
> matrix feature.

It really is unfortunate.  I wonder if it was done in an attempt to
mimic Oracle behaviour.

> It seems to leave us without any way forward on
> these issues that I like very much.

On the specific issue of CARDINALITY, I guess we need to decide
whether we are going to pretend that our array/matrix thing is
actually nested.  I first argued that we should not.  But it occurred
to me that if we do pretend, it would at least leave the door ajar if
we want to do something to make our arrays more nest-like in future,
without disrupting the behaviour of CARDINALITY.

It is unlikely that we ever would make such a change, but given the
intensity of the opposition to any kind of SQL-level behavioural
changes we've had on this thread, I don't want to create any more
barriers to future efforts to comport with the spec.

So how about:

* we add CARDINALITY, make it work like array_length(a, 1) except that
it returns zero for empty arrays, and
* we add array_num_items, which exposes the internal ArrayGetNItems,
and returns zero for empty arrays.

As in:

CARDINALITY(ARRAY[[1,2], [3,4], [5,6]]) => 3
array_num_items(ARRAY[[1,2], [3,4], [5,6]]) => 6
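
And for the empty-array case that started this thread, both would return
zero rather than NULL (proposed semantics only; neither function exists
yet):

CARDINALITY('{}'::int[]) => 0  -- array_length(a, 1) gives NULL here
array_num_items('{}'::int[]) => 0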

Cheers,
BJ




Re: [HACKERS] Fwd: Range types (DATERANGE, TSTZRANGE) in a foreign key with "inclusion" logic

2013-04-07 Thread Misa Simic
Hi,

As far as I am aware, the FK feature you describe does not exist yet...

But it should be possible to enforce that rule via triggers: one AFTER
UPDATE trigger on container, and one AFTER INSERT OR UPDATE trigger on
item, along the lines of the sketch below.
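
A minimal sketch of what those triggers might look like, assuming the
container/item schema from the quoted message below (untested, and the
function/trigger names are just placeholders):

  -- Reject an item whose lifetime is not covered by its container's.
  CREATE FUNCTION check_item_in_container() RETURNS trigger AS $$
  BEGIN
    IF NOT EXISTS (SELECT 1 FROM container c
                   WHERE c.id = NEW.container_id
                     AND NEW.lifetime <@ c.lifetime) THEN
      RAISE EXCEPTION 'item lifetime % not covered by container %',
        NEW.lifetime, NEW.container_id;
    END IF;
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  -- Reject a container update that stops covering an existing item.
  CREATE FUNCTION check_container_covers_items() RETURNS trigger AS $$
  BEGIN
    IF EXISTS (SELECT 1 FROM item i
               WHERE i.container_id = NEW.id
                 AND NOT (i.lifetime <@ NEW.lifetime)) THEN
      RAISE EXCEPTION 'container % no longer covers all of its items',
        NEW.id;
    END IF;
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER item_lifetime_check
    AFTER INSERT OR UPDATE ON item
    FOR EACH ROW EXECUTE PROCEDURE check_item_in_container();

  CREATE TRIGGER container_lifetime_check
    AFTER UPDATE ON container
    FOR EACH ROW EXECUTE PROCEDURE check_container_covers_items();

Unlike a real foreign key this is not bulletproof against concurrent
updates without additional locking, but it enforces the rule for the
common cases.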


Kind Regards,

Misa

On Saturday, April 6, 2013, Matthias Nagel wrote:

> Hello,
> this is a re-post from the SQL users list two months ago, because I assume
> only a developer can answer the questions below.
> Thanks, Matthias Nagel
>
>
> --  Forwarded message  --
>
> Subject: Range types (DATERANGE, TSTZRANGE) in a foreign key with
> "inclusion" logic
> Date: Wednesday, 23 January 2013, 11:28:10
> From: Matthias Nagel >
> To: pgsql-...@postgresql.org 
>
> Hello everybody,
>
> first a big thank you to all who made the range types possible. They are
> great, especially if one runs a database to manage a students' university
> dormitory with a lot of temporal information like rental agreements, room
> allocations, etc. At the moment we are redesigning our database schema for
> PostgreSQL 9.2, because the new range types and especially the "EXCLUSION"
> constraints allow us to put a lot more (business) logic into the database
> schema than before.
>
> But there is one feature missing (or I am too stupid to find it).
>
> Let's say we have some kind of container with a lifetime attribute, i.e.
> something like this:
>
> CREATE TABLE container (
>   id SERIAL PRIMARY KEY,
>   lifetime DATERANGE
> );
>
> Further, there are items that must be part of the container and these
> items have a lifetime, too.
>
> CREATE TABLE item (
>   id SERIAL PRIMARY KEY,
>   container_id INTEGER,
>   lifetime DATERANGE,
>   FOREIGN KEY (container_id) REFERENCES container ( id ),
>   EXCLUDE USING gist ( container_id WITH =, lifetime WITH && )
> );
>
> The foreign key ensures that items are only put into containers that
> really exist, and the exclusion constraint ensures that at most one item
> is a member of the same container at any point in time.
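>
> (Side note: the EXCLUDE ... USING gist above needs the btree_gist
> extension for the integer = operator.) For concreteness, a quick demo of
> the constraint, assuming the first container gets id 1 from its sequence:
>
> INSERT INTO container (lifetime) VALUES ('[2013-01-01,2014-01-01)');
> INSERT INTO item (container_id, lifetime)
>   VALUES (1, '[2013-02-01,2013-03-01)');  -- accepted
> INSERT INTO item (container_id, lifetime)
>   VALUES (1, '[2013-02-15,2013-04-01)');  -- rejected: overlaps the first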
>
> But actually I need a little bit more logic. The additional constraint is
> that items must only be put into those containers whose lifetime covers the
> lifetime of the item. If an item has a lifetime that exceeds the lifetime
> of the container, the item cannot be put into that container. If an item is
> already in a container (with valid lifetimes) and later the container or
> the item is updated such that either lifetime is modified and the constraint
> is no longer fulfilled, this update must fail.
>
> I would like to do something like:
>
> FOREIGN KEY ( container_id, lifetime ) REFERENCES other_table ( id,
> lifetime ) USING gist ( container_id WITH =, lifetime WITH <@ )
>
> (Of course, this is PostgreSQL pseudo-code, but it hopefully makes clear
> what I want.)
>
> So, now my questions:
>
> 1) Does this kind of feature already exist in 9.2? If yes, a link to the
> documentation would be helpful.
>
> 2) If this feature does not exist directly, does anybody have a good idea
> how to mimic the intended behaviour?
>
> 3) If neither 1) nor 2) applies, are there any plans to integrate such a
> feature? I found this discussion
> http://www.postgresql.org/message-id/4f8bb9b0.5090...@darrenduncan.net .
> Does anybody know about the progress?
>
> Having range types and exclusion constraints is nice, as I said in the
> introduction. But if the reverse (a foreign key with inclusion) also
> worked, the range type feature would be really amazing.
>
>
> Best regards, Matthias Nagel
>
>
>
> --
> Matthias Nagel
> Willy-Andreas-Allee 1, Zimmer 506
> 76131 Karlsruhe
>
> Telefon: +49-721-8695-1506
> Mobil: +49-151-15998774
> e-Mail: matthias.h.na...@gmail.com 
> ICQ: 499797758
> Skype: nagmat84


[HACKERS] how to pass data (tuples) to worker processes?

2013-04-07 Thread Tomas Vondra
Hi,

I'm learning how to use the "background worker processes" committed in
9.3. The usage basics are quite nicely illustrated in the worker_spi
extension (kudos to those who designed the feature / extension).

I'm not quite sure how to pass data between the regular backend and a
worker. Implementing the channel (socket/pipe/...) itself is not a big
deal, that's IPC 101, but deciding which data to copy (and how) is.

Say I need to forward a tuple to the worker process - e.g. from a
nodeAgg node, so that the worker can build the hash table. Is there
something (a rule of thumb, a method, ...) that would help me to
identify the pieces of data that need to be copied?

Or do I need to go through the objects and decide what to copy
and how on my own?

regards
Tomas




Re: [HACKERS] PROPOSAL: tracking aggregated numbers from pg_stat_database

2013-04-07 Thread Tomas Vondra
On 6.4.2013 21:51, Tomas Vondra wrote:
> Hi,
> 
> I'm regularly using the pg_stat_database view to analyze various aspects
> of the cluster's behavior. The #1 issue I'm constantly running into is
> that to get a cluster-level view (across all the databases), the data
> needs to be aggregated like this:
> 
> SELECT
> SUM(blks_hit) blks_hit,
> SUM(blks_read) blks_read
> FROM pg_stat_database
> 
> This more or less works in stable environments, but once you start
> dropping databases (think of hosting with a shared DB server) it gets
> unusable, because after DROP DATABASE the database suddenly disappears
> from the sum.
> 
> Therefore I do propose tracking the aggregated stats, similar to the
> pg_stat_bgwriter view. This does not require new messages (thanks to
> reuse of the existing messages), and I expect the overhead to be
> negligible (a few bytes of storage, minimal CPU).
> 
> I think it does not make sense to merge this into pg_stat_bgwriter;
> creating a new view (can't think of a good name though) seems like a
> much better choice to me.

Attached is a first version of the patch, just to illustrate the idea.
It creates a new system view pg_stat_agg_database with aggregated data.
There are no docs, no regression tests etc.
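
With the view in place, the aggregation query from my previous message
reduces to a plain SELECT that is unaffected by DROP DATABASE (column
names as defined in the patch below):

SELECT blks_hit, blks_read FROM pg_stat_agg_database;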

Now I'm wondering whether we should do something similar with database
object (table/index) stats, i.e. maintain aggregated data for those too.
This might seem silly at first, but consider for example a partitioned
table. It's common to have db-level metrics on idx/seq_scans, but when you
drop a partition (e.g. the oldest one), you may get into the same trouble
as with database stats (see my previous post).

So I think it would be handy to keep table/index stats aggregated at the
db level, i.e. there would be one row for each database. I don't think it
makes much sense to aggregate vacuum/analyze info (counts and timestamps),
which leaves 18 BIGINT counters from pg_stat[io]_tables and 10 BIGINT
counters from pg_stat[io]_indexes. That's 224 B for each database, which I
believe is negligible.

OTOH it would be very handy to have this info aggregated per-schema and
per-tablespace, but I'd say we should do that later, in a separate patch.

Opinions?

Tomas
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 57adbf6..28f3ca2 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -636,6 +636,27 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset
 FROM pg_database D;
 
+CREATE VIEW pg_stat_agg_database AS
+SELECT
+pg_stat_get_all_numbackends() AS numbackends,
+pg_stat_get_all_xact_commit() AS xact_commit,
+pg_stat_get_all_xact_rollback() AS xact_rollback,
+pg_stat_get_all_blocks_fetched() -
+pg_stat_get_all_blocks_hit() AS blks_read,
+pg_stat_get_all_blocks_hit() AS blks_hit,
+pg_stat_get_all_tuples_returned() AS tup_returned,
+pg_stat_get_all_tuples_fetched() AS tup_fetched,
+pg_stat_get_all_tuples_inserted() AS tup_inserted,
+pg_stat_get_all_tuples_updated() AS tup_updated,
+pg_stat_get_all_tuples_deleted() AS tup_deleted,
+pg_stat_get_all_conflict_all() AS conflicts,
+pg_stat_get_all_temp_files() AS temp_files,
+pg_stat_get_all_temp_bytes() AS temp_bytes,
+pg_stat_get_all_deadlocks() AS deadlocks,
+pg_stat_get_all_blk_read_time() AS blk_read_time,
+pg_stat_get_all_blk_write_time() AS blk_write_time,
+pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
+
 CREATE VIEW pg_stat_database_conflicts AS
 SELECT
 D.oid AS datid,
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 29d986a..121794d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4478,6 +4478,12 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 	int			i;
 	bool		found;
 
+	/* update aggregated data first */
+	globalStats.n_xact_commit += (PgStat_Counter) (msg->m_xact_commit);
+	globalStats.n_xact_rollback += (PgStat_Counter) (msg->m_xact_rollback);
+	globalStats.n_block_read_time += msg->m_block_read_time;
+	globalStats.n_block_write_time += msg->m_block_write_time;
+
 	dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
 
 	/*
@@ -4561,6 +4567,15 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 		dbentry->n_tuples_deleted += tabmsg->t_counts.t_tuples_deleted;
 		dbentry->n_blocks_fetched += tabmsg->t_counts.t_blocks_fetched;
 		dbentry->n_blocks_hit += tabmsg->t_counts.t_blocks_hit;
+
+		/* update the aggregated counters too */
+		globalStats.n_blocks_fetched += tabmsg->t_counts.t_blocks_fetched;
+		globalStats.n_blocks_hit += tabmsg->t_counts.t_blocks_hit;
+		globalStats.n_tuples_returned += tabmsg->t_counts.t_tuples_returned;
+		globalStats.n_tuples_fetched += 

[HACKERS] Current syslogger filename

2013-04-07 Thread Kerem Kat
Hello,

While going through the TODO items on the wiki I have come across the
following proposal:

Provide a way to query the log collector subprocess to determine the
name of the currently active log file

  Current log files when rotating?
  http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php

Upon reading the related conversations, there doesn't seem to be a
consensus on how to implement this feature. If it is still relevant,
could you suggest a way to pass the filename (or the timestamp from
which the filename is generated) from the syslogger to an inquiring
postgres instance?


Regards,

Kerem




Re: [HACKERS] Slightly insane use of USE_WIDE_UPPER_LOWER in pg_trgm

2013-04-07 Thread Tom Lane
Alexander Korotkov  writes:
> On Sun, Apr 7, 2013 at 10:00 PM, Tom Lane  wrote:
>> Hm, actually couldn't we just simplify to "if (bytelen != charlen)"?

> I think yes, we can :)

OK.  I pushed this as a separate commit so as to highlight the potential
incompatibility in the commit log.  I am not sure it's even worth a
release note though ...

regards, tom lane




Re: [HACKERS] Slightly insane use of USE_WIDE_UPPER_LOWER in pg_trgm

2013-04-07 Thread Alexander Korotkov
On Sun, Apr 7, 2013 at 10:00 PM, Tom Lane  wrote:

> Alexander Korotkov  writes:
> > It's also likely we can change
> >if (pg_database_encoding_max_length() > 1)
> > into something like
> >if (pg_database_encoding_max_length() > 1 && bytelen != charlen)
>
> Hm, actually couldn't we just simplify to "if (bytelen != charlen)"?
>

I think yes, we can :)

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Slightly insane use of USE_WIDE_UPPER_LOWER in pg_trgm

2013-04-07 Thread Tom Lane
Alexander Korotkov  writes:
> It's also likely we can change
>if (pg_database_encoding_max_length() > 1)
> into something like
>if (pg_database_encoding_max_length() > 1 && bytelen != charlen)

Hm, actually couldn't we just simplify to "if (bytelen != charlen)"?

regards, tom lane




Re: [HACKERS] Slightly insane use of USE_WIDE_UPPER_LOWER in pg_trgm

2013-04-07 Thread Alexander Korotkov
On Sun, Apr 7, 2013 at 7:43 PM, Tom Lane  wrote:

> While reviewing the latest incarnation of the regex indexing patch,
> I noticed that make_trigrams() in contrib/pg_trgm/trgm_op.c is coded
> so that if USE_WIDE_UPPER_LOWER is not set, it ignores multibyte
> character boundaries and just makes trigrams from 3-byte substrings.
> This seems slightly insane, not least because there's an Assert there
> that will fail if it's fed any multibyte characters.  I suppose no one
> has actually tried this code with non-ASCII data on machines where
> USE_WIDE_UPPER_LOWER isn't set; at least not with Asserts turned on.
> (Considering that even my favorite dinosaur HPUX machine has got both
> HAVE_WCSTOMBS and HAVE_TOWLOWER, it may well be that there *aren't* any
> such machines anymore.)
>
> So I'm inclined to remove the two #ifdef USE_WIDE_UPPER_LOWER tests
> in trgm_op.c, and just use the multibyte-aware code all the time.
> A downside of this is that if there is indeed anyone out there storing
> non-ASCII trigrams on a machine without USE_WIDE_UPPER_LOWER, their
> indexes would break if they pg_upgrade to 9.3.  OTOH their indexes would
> break anyway if they rebuilt against a more modern libc, or built with
> Asserts on.
>
> If we don't do this then we'll have to complicate the regex indexing
> patch some more, since it's currently imagining that cnt_trigram()
> is always the way to make storable trigrams from raw text, and this
> is just wrong in the existing non-USE_WIDE_UPPER_LOWER code path
>

+1 for removing the #ifdef USE_WIDE_UPPER_LOWER tests. Even if it happens
to work somewhere with non-ASCII data without USE_WIDE_UPPER_LOWER, the
logic is buggy anyway and produces invalid results.

It's also likely we can change
   if (pg_database_encoding_max_length() > 1)
into something like
   if (pg_database_encoding_max_length() > 1 && bytelen != charlen)

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Process title for autovac

2013-04-07 Thread Alvaro Herrera
Jeff Janes wrote:

> Is this functionality something we want?  If so should it include explicit
> vacuum as well as autovac?

Yes.  No.

> Any opinion about where in the code base it
> properly belongs (which obviously depends on whether it should cover manual
> vacuum as well)?  And does the string need to distinguish between an
> autovac and an autoanalyze?

autovacuum_do_vac_analyze() is probably the place to add it.  I think we
should include the wraparound, dovacuum and doanalyze flags in there
somehow, yes.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Slightly insane use of USE_WIDE_UPPER_LOWER in pg_trgm

2013-04-07 Thread Tom Lane
While reviewing the latest incarnation of the regex indexing patch,
I noticed that make_trigrams() in contrib/pg_trgm/trgm_op.c is coded
so that if USE_WIDE_UPPER_LOWER is not set, it ignores multibyte
character boundaries and just makes trigrams from 3-byte substrings.
This seems slightly insane, not least because there's an Assert there
that will fail if it's fed any multibyte characters.  I suppose no one
has actually tried this code with non-ASCII data on machines where
USE_WIDE_UPPER_LOWER isn't set; at least not with Asserts turned on.
(Considering that even my favorite dinosaur HPUX machine has got both
HAVE_WCSTOMBS and HAVE_TOWLOWER, it may well be that there *aren't* any
such machines anymore.)
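
For illustration, the character-boundary behaviour is easy to see from
SQL with pg_trgm's show_trgm() in a UTF-8 database (just a demo, not
part of the patch):

  CREATE EXTENSION IF NOT EXISTS pg_trgm;
  -- 'é' occupies two bytes in UTF-8; the multibyte-aware path treats it
  -- as one character, where blind 3-byte windows would split it.
  SELECT show_trgm('héllo');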

So I'm inclined to remove the two #ifdef USE_WIDE_UPPER_LOWER tests
in trgm_op.c, and just use the multibyte-aware code all the time.
A downside of this is that if there is indeed anyone out there storing
non-ASCII trigrams on a machine without USE_WIDE_UPPER_LOWER, their
indexes would break if they pg_upgrade to 9.3.  OTOH their indexes would
break anyway if they rebuilt against a more modern libc, or built with
Asserts on.

If we don't do this then we'll have to complicate the regex indexing
patch some more, since it's currently imagining that cnt_trigram()
is always the way to make storable trigrams from raw text, and this
is just wrong in the existing non-USE_WIDE_UPPER_LOWER code path.

Comments?

regards, tom lane




Re: [HACKERS] doc hdparm also support SATA

2013-04-07 Thread Craig Ringer
On 04/07/2013 02:29 PM, Jov wrote:
> from the pg doc:
> http://www.postgresql.org/docs/devel/static/wal-reliability.html : On Linux,
> IDE drives can be queried using hdparm -I; write caching is enabled if
> there is a * next to Write cache. hdparm -W 0 can be used to turn off
> write caching. SCSI drives can be queried using sdparm. Use sdparm
> --get=WCE to check whether the write cache is enabled and sdparm
> --clear=WCE to disable it.
The previous paragraph does say "Consumer-grade IDE and SATA drives are
particularly likely to have write-back caches that will not survive a
power failure."

I guess it's reasonable to replace "IDE" with "ATA/SATA" though it's
pretty minor. It'd be nice to confirm that the method for disabling the
write cache on fbsd is still correct and applies to SATA drives.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-07 Thread Jaime Casanova
On Sat, Apr 6, 2013 at 1:36 PM, Jeff Janes  wrote:
> On Fri, Apr 5, 2013 at 6:09 AM, Andres Freund 
> wrote:
>
>>
>> How does the attached version look? I verified that it survives
>> recovery, but not more.
>>
>> Jeff, any chance you can run this for a round with your suite?
>
>
>
> I've run it for a while now and have found no problems.
>

FWIW, I have run installcheck (serial and parallel) and the isolation
tests, also combinations of those (one installcheck, one isolation run)
at the same time, while executing VACUUM FULL, REINDEX DATABASE and
manual CHECKPOINTs...

I also checked that the bgwriter was doing some work.

I did all of this on a master node in a cluster with a standby and a
cascading standby that were later promoted... and I had no problems with
checksums at all, so I would say that the combination of Jeff's and
Andres' patches fixed the problems we have seen until now.

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
Phone: +593 4 5107566 Cell: +593 987171157

