Re: [HACKERS] PoC: Partial sort

2014-02-19 Thread Alexander Korotkov
On Thu, Feb 13, 2014 at 1:54 AM, Marti Raudsepp  wrote:

> I think the 1st patch now has a bug in initial_cost_mergejoin; you
> still pass the "presorted_keys" argument to cost_sort, making it
> calculate a partial sort cost, but generated plans never use partial
> sort. I think 0 should be passed instead. Patch attached, needs to be
> applied on top of partial-sort-basic-1 and then reverse-applied on
> partial-sort-merge-1.
>

It doesn't look that way to me. The merge join path doesn't specifically look
for a partial sort, but if a path with some presorted pathkeys happens to be
selected, then a partial sort will be used; see the create_mergejoin_plan
function. So I think this cost_sort call is consistent with
create_mergejoin_plan. If we don't want partial sort to be used in such rare
cases, then we should remove it from both places. However, I doubt it adds any
measurable overhead, so we can leave it as is.

--
With best regards,
Alexander Korotkov.


[HACKERS] Selecting large tables gets killed

2014-02-19 Thread Ashutosh Bapat
Hi All,
Here is a strange behaviour on the master branch, with HEAD at

commit d3c4c471553265e7517be24bae64b81967f6df40
Author: Peter Eisentraut 
Date:   Mon Feb 10 21:47:19 2014 -0500

The OS is
[ashutosh@ubuntu repro]uname -a
Linux ubuntu 3.2.0-59-generic #90-Ubuntu SMP Tue Jan 7 22:43:51 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux

This is a VM hosted on Mac-OS X 10.7.5

Here's the SQL script

[ashutosh@ubuntu repro]cat big_select_killed.sql
drop table big_tab;
create table big_tab (val int, val2 int, str varchar);

insert into big_tab select x, x, lpad('string', 100, x::text)
from generate_series(1, 1000) x;

select * from big_tab;

The last select causes the "Killed" message.

[ashutosh@ubuntu repro]psql -d postgres -f big_select_killed.sql
DROP TABLE
CREATE TABLE
INSERT 0 1000
Killed

There is a message in the server log:
FATAL:  connection to client lost
STATEMENT:  select * from big_tab;

Any SELECT that returns all the rows gets psql killed, but SELECT count(*)
does not.

[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.

postgres=# select count(*) from big_tab;
  count
--
 1000
(1 row)

postgres=# select * from big_tab;
Killed

[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type "help" for help.

Below is the buffer cache size and the relation size (if anyone cares)
postgres=# show shared_buffers;
 shared_buffers

 128MB
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass);
 pg_relation_size
--
   1412415488
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass)/1024/1024; -- IN
MBs to be simple
 ?column?
--
 1346
(1 row)

There are no changes to the default configuration. I am using Unix sockets:
[ashutosh@ubuntu repro]ls /tmp/.s.PGSQL.5432*
/tmp/.s.PGSQL.5432  /tmp/.s.PGSQL.5432.lock

Looks like a bug in psql to me. Does anybody else see this behaviour?
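One thing I want to rule out (just my guess, not verified) is psql itself
buffering the whole result set in client memory and being killed by the OOM
killer. psql's FETCH_COUNT variable makes it fetch the result in batches
through a cursor instead, e.g.

postgres=# \set FETCH_COUNT 1000
postgres=# select * from big_tab;

If that works, the problem is client-side memory rather than the server.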

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Kyotaro HORIGUCHI
Hi,

At Wed, 19 Feb 2014 16:17:05 +0900, Shigeru Hanada wrote
> 2014-02-18 19:29 GMT+09:00 Kyotaro HORIGUCHI 
> :
> > Could you think of any use cases in which we are happy with ALTER
> > TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
> > always comes with some changes to the data source, so implicit
> > invocation of such commands should be turned off by default.
> 
> Imagine a case where the foreign data source has attributes (A, B, C, D)
> but the foreign tables and their parent were defined as (A, B, C).  If the
> user wants to use D as well, ALTER TABLE parent ADD COLUMN D type
> would be useful (rather necessary?) to keep consistency.

Hmm. That seems to me to be a mis-configuration issue in the first
place. However, my concern - as in my previous message - is ALTER'ing
foreign table definitions without any notice to operators, and the
irregular or ad-hoc logic needed to check the applicability of ALTER
actions.
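To restate the scenario you describe concretely (the table and column
names are invented):

-- the remote source has columns (a, b, c, d); locally only (a, b, c) exist
ALTER TABLE parent ADD COLUMN d integer;
-- with inheritance recursion this also adds d to every (foreign) child,
-- keeping the local definitions in line with the remote side

That is exactly the kind of implicit change to foreign table definitions
that concerns me.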

> Changing a data type to a compatible one (e.g., int to numeric,
> varchar(n) to text), or adding a CHECK/NOT NULL constraint, would also
> be possible.

I see, thank you. Changing data types is surely valuable, but it also
seems difficult to check for validity :(

Anyway, I have given this issue a second thought. Please have a look
at that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Kyotaro HORIGUCHI
Hello, 

> It is just my personal opinion, but I think it would be convenient for
> users to alter inheritance trees that contain foreign tables the same
> way as inheritance trees that don't contain any foreign tables,
> without making the user conscious of whether the inheritance tree
> contains foreign tables or not.  Imagine we have an inheritance tree that
> contains only plain tables and then add a foreign table as a child of
> the inheritance tree.  Without that flexibility, we would need to change
> the way of altering the structure of the inheritance tree (e.g., ADD
> CONSTRAINT) to a totally different one, immediately when adding the
> foreign table.  I don't think that would be easy to use.

I personally don't see a significant advantage in such flexibility. My
concerns here are only two points: unanticipated application of changes,
and maintainability.  I have given these issues some more consideration.

For the first issue, I think it could be enough to give feedback to
operators:

=# ALTER TABLE parent ADD CHECK (tempmin < tempmax),
  ALTER tempmin SET NOT NULL, 
  ALTER tempmin SET DEFAULT 0;
NOTICE: Child foreign table child01 is affected.
NOTICE: Child foreign table child02 is affected.
NOTICE: Child foreign table child03 rejected 'alter tempmin set default'.

What do you think about this? It looks a bit too noisy to me,
though...

As for the second issue - though I don't have a deep understanding of how
ALTER TABLE works - the complexity would be reduced if the acceptance
check for each ALTER "action" were done on the foreign server or data
wrapper side, not in the core of ALTER TABLE. That would also help with
emitting error messages like the above. However, (NO) INHERIT looks a
little different...


> > http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html
> >
> >> Consistency with the foreign server is not checked when a
> >> column is added or removed with ADD COLUMN or DROP COLUMN, a
> >> NOT NULL constraint is added, or a column type is changed with
> >> SET DATA TYPE. It is the user's responsibility to ensure that
> >> the table definition matches the remote side.
> >
> > So I believe implicit and automatic application of any constraint
> > to foreign children is considerably dangerous.
> 
> We spent a lot of time discussing this issue, and the consensus is
> that it's users' fault if there are some tuples on the remote side
> violating a given constraint, as mentioned in the documentation.

What worries me is not that, but that the changes and possible
inconsistency would take place behind operators' backs. And this looks
to me like it could cause such inconsistencies.

> >> * [NO] INHERIT parent_table
> >
> > Is this usable for foreign children in an inheritance tree?  NO INHERIT
> > removes all foreign children, but INHERIT is a no-op.
> 
> I didn't express clearly.  Sorry for that.  Let me explain about it.
> 
> * ALTER FOREIGN TABLE target_table *INHERIT* parent_table: Add the
> * target table as a new child of the parent table.
> * ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: Remove the
> * target table from the list of children of the parent table.

I got it, thank you. By itself it seems to be no problem, but it also
doesn't seem to be a matter for 'ALTER TABLE'. Could you tell me how this
is related to 'ALTER TABLE'?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila wrote:

> On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
>  wrote:
> > On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane  wrote:
> >> > I want to propose a new feature called "priority table" or "cache
> >> > table".
> >> > This is same as regular table except the pages of these tables are
> >> > having
> >> > high priority than normal tables. These tables are very useful, where
> a
> >> > faster query processing on some particular tables is expected.
> >>
> >> Why exactly does the existing LRU behavior of shared buffers not do
> >> what you need?
> >
> >
> > Lets assume a database having 3 tables, which are accessed regularly. The
> > user is expecting a faster query results on one table.
> > Because of LRU behavior which is not happening some times.
>
> I think this will not be a problem for regularly accessed tables(pages),
> as per current algorithm they will get more priority before getting
> flushed out of shared buffer cache.
> Have you come across any such case where regularly accessed pages
> get lower priority than non-regularly accessed pages?
>

Because of the other regularly accessed tables, the table that is expected
to give faster results sometimes gets delayed.


> However it might be required for cases where user wants to control
> such behaviour and pass such hints through table level option or some
> other way to indicate that he wants more priority for certain tables
> irrespective
> of their usage w.r.t other tables.
>
> Now I think here important thing to find out is how much helpful it is for
> users or why do they want to control such behaviour even when Database
> already takes care of such thing based on access pattern.
>

Yes, it is useful in cases where the application always expects faster
results, whether the table is used regularly or not.

Regards,
Hari Babu
Fujitsu Australia


[HACKERS] Re: BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-19 Thread Noah Misch
On Wed, Feb 19, 2014 at 08:22:13PM -0500, Tom Lane wrote:
> The more I looked into mbutils.c, the less happy I got.  The attached
> proposed patch takes care of the missing-verification hole in
> pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid
> of what I believe to be obsolete provisions in pg_do_encoding_conversion
> to "work" if called outside a transaction --- if you consider it working
> to completely fail to honor its API contract.  That should no longer be
> necessary now that we perform client<->server encoding conversions via
> perform_default_encoding_conversion rather than here.

I like these changes.  In particular, coping with the absence of a conversion
function by calling ereport(LOG) and returning the source string was wrong for
nearly every caller, but you'd need to try an encoding like MULE_INTERNAL to
notice the problem.  Good riddance.
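(To make that concrete -- my own illustration, assuming the SQL-level
convert() function goes through pg_do_encoding_conversion(), which I believe
it does -- a call like

    SELECT convert('\xe9'::bytea, 'SQL_ASCII', 'UTF8');

used to hand back \xe9 unchanged, even though that byte sequence is not valid
UTF8; with the patch it should now fail encoding verification instead.)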

> How much of this is back-patch material, do you think?

None of it.  While many of the failures to validate against a character
encoding are clear bugs, applications hum along in spite of such bugs and
break when we tighten the checks.  I don't see a concern to override that
here.  Folks who want the tighter checking have some workarounds available.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Amit Kapila
On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
 wrote:
> On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane  wrote:
>> > I want to propose a new feature called "priority table" or "cache
>> > table".
>> > This is same as regular table except the pages of these tables are
>> > having
>> > high priority than normal tables. These tables are very useful, where a
>> > faster query processing on some particular tables is expected.
>>
>> Why exactly does the existing LRU behavior of shared buffers not do
>> what you need?
>
>
> Lets assume a database having 3 tables, which are accessed regularly. The
> user is expecting a faster query results on one table.
> Because of LRU behavior which is not happening some times.

I think this will not be a problem for regularly accessed tables (pages);
as per the current algorithm they will get more priority before being
flushed out of the shared buffer cache.
Have you come across any case where regularly accessed pages
get lower priority than non-regularly accessed pages?

However, it might be required for cases where the user wants to control
such behaviour and pass hints, through a table-level option or some
other way, to indicate that certain tables should get more priority
irrespective of their usage relative to other tables.

Now I think the important thing to find out here is how helpful this is for
users, or why they would want to control such behaviour even when the
database already takes care of it based on access patterns.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2014 at 4:49 PM, Michael Paquier
 wrote:
>> +1 for back-patching.
> Back-patching would be interesting for existing applications, but -1
> as it is a new feature :)

I think that it rises to the level of an omission in 9.3 that now
requires correction. Many of our users couldn't run pg_controldata
even if they'd heard of it...


-- 
Peter Geoghegan




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue  writes:
> Seems EXPORTS line in exports.txt should be removed.

Done.

regards, tom lane




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Hiroshi Inoue
(2014/02/20 10:32), Tom Lane wrote:
> Hiroshi Inoue  writes:
>> Attached is a patch to remove dllwrap from pgevent/Makefile.
> 
> Actually, it looks like this patch doesn't work at all:

Strangely enough it works here though I see double EXPORTS lines
in libpgeventdll.def.

> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53
> 
> Did I fat-finger the commit somehow?  I made a couple of cosmetic
> changes, or at least I thought they were cosmetic.

Seems EXPORTS line in exports.txt should be removed.

regards,
Hiroshi Inoue






Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Alvaro Herrera
Tom Lane escribió:
> Hiroshi Inoue  writes:
> > Attached is a patch to remove dllwrap from pgevent/Makefile.
> 
> Actually, it looks like this patch doesn't work at all:
> 
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53
> 
> Did I fat-finger the commit somehow?  I made a couple of cosmetic
> changes, or at least I thought they were cosmetic.

The format of the file is completely unlike that of the other *dll.def
files we use elsewhere.  AFAICT each line is

symbolname@some_number

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue  writes:
> Attached is a patch to remove dllwrap from pgevent/Makefile.

Actually, it looks like this patch doesn't work at all:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53

Did I fat-finger the commit somehow?  I made a couple of cosmetic
changes, or at least I thought they were cosmetic.

regards, tom lane




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund  wrote:
> On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
>> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>>  wrote:
>> >> Yes, that's a good precedent in multiple ways.
>> > Here are updated patches to use pg_lsn instead of pglsn...
>>
>> OK, so I think this stuff is all committed now, with assorted changes.
>>  Thanks for your work on this.
>
> cool, thanks you two.
>
> I wonder if pg_stat_replication shouldn't be updated to use it as well?
> SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
> that as names that are possible candidates for conversion.
I was sure I had forgotten some views or functions in the previous
patch... Please find attached a patch making pg_stat_replication use
pg_lsn instead of the existing text fields.
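As a small illustration of the gain (assuming the attached patch is applied),
the LSN columns can then be compared and subtracted natively instead of being
parsed as text, for example:

SELECT application_name, sent_location, replay_location,
       sent_location - replay_location AS replay_lag_bytes
FROM pg_stat_replication;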
Regards,
-- 
Michael
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a37e6b6..370857a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1490,24 +1490,24 @@ postgres: user database host 
 
  sent_location
- text
+ pg_lsn
  Last transaction log position sent on this connection
 
 
  write_location
- text
+ pg_lsn
  Last transaction log position written to disk by this standby
   server
 
 
  flush_location
- text
+ pg_lsn
  Last transaction log position flushed to disk by this standby
   server
 
 
  replay_location
- text
+ pg_lsn
  Last transaction log position replayed into the database on this
   standby server
 
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 06b22e2..048367a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -67,6 +67,7 @@
 #include "utils/builtins.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
+#include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/resowner.h"
 #include "utils/timeout.h"
@@ -2137,7 +2138,6 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 	{
 		/* use volatile pointer to prevent code rearrangement */
 		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
-		char		location[MAXFNAMELEN];
 		XLogRecPtr	sentPtr;
 		XLogRecPtr	write;
 		XLogRecPtr	flush;
@@ -2171,28 +2171,19 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 		else
 		{
 			values[1] = CStringGetTextDatum(WalSndGetStateString(state));
-
-			snprintf(location, sizeof(location), "%X/%X",
-	 (uint32) (sentPtr >> 32), (uint32) sentPtr);
-			values[2] = CStringGetTextDatum(location);
+			values[2] = LSNGetDatum(sentPtr);
 
 			if (write == 0)
 nulls[3] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-	 (uint32) (write >> 32), (uint32) write);
-			values[3] = CStringGetTextDatum(location);
+			values[3] = LSNGetDatum(write);
 
 			if (flush == 0)
 nulls[4] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-	 (uint32) (flush >> 32), (uint32) flush);
-			values[4] = CStringGetTextDatum(location);
+			values[4] = LSNGetDatum(flush);
 
 			if (apply == 0)
 nulls[5] = true;
-			snprintf(location, sizeof(location), "%X/%X",
-	 (uint32) (apply >> 32), (uint32) apply);
-			values[5] = CStringGetTextDatum(location);
+			values[5] = LSNGetDatum(apply);
 
 			values[6] = Int32GetDatum(sync_priority[i]);
 
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 11c1e1a..a2cc19f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2634,7 +2634,7 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s 1 0 2249 "23" "{23,26,23,26,25,25,25,16,1184,1184,1184,1184,869,25,23}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,waiting,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port}" _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
-DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249 "" "{23,25,25,25,25,25,23,25}" "{o,o,o,o,o,o,o,o}" "{pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
+DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249 "" "{23,25,3220,3220,3220,3220,23,25}" "{o,o,o,o,o,o,o,o}" "{pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state}" _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active replication");
 DATA(insert OID = 2026 (  pg_backend_pidPGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 23 "" _null_ _null_ _null_ _null_ pg_backend_pid _null_ _null_ _

Re: [HACKERS] [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-19 Thread Tom Lane
I wrote:
>> The minimum-refactoring solution to this would be to tweak
>> pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but
>> the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing.

>> I'm not sure if this would break anything we need to have work,
>> though.  Thoughts?  Do we want to back-patch such a change?

> I looked through all the callers of pg_do_encoding_conversion(), and
> AFAICS this change is a good idea.  There are a whole bunch of places
> that use pg_do_encoding_conversion() to convert from the database encoding
> to encoding X (most usually UTF8), and right now if you do that in a
> SQL_ASCII database you have no assurance whatever that what is produced
> is actually valid in encoding X.  I think we need to close that loophole.

The more I looked into mbutils.c, the less happy I got.  The attached
proposed patch takes care of the missing-verification hole in
pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid
of what I believe to be obsolete provisions in pg_do_encoding_conversion
to "work" if called outside a transaction --- if you consider it working
to completely fail to honor its API contract.  That should no longer be
necessary now that we perform client<->server encoding conversions via
perform_default_encoding_conversion rather than here.

For testing I inserted an "Assert(IsTransactionState());" at the top of
pg_do_encoding_conversion(), and couldn't trigger it, so I'm fairly sure
this change is OK.  Still, back-patching it might not be a good thing.
On the other hand, if there is still any code path that can get here
outside a transaction, we probably ought to find out rather than allow
completely bogus data to be returned.

I also made some more-cosmetic improvements, notably removing a bunch
of Asserts that are certainly dead code because the relevant variables
are never NULL.

I've not done anything yet about simplifying unnecessary calls of
pg_do_encoding_conversion into pg_server_to_any/pg_any_to_server.

How much of this is back-patch material, do you think?

regards, tom lane

diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 08440e9..7f43cae 100644
*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***
*** 1,10 
! /*
!  * This file contains public functions for conversion between
!  * client encoding and server (database) encoding.
   *
!  * Tatsuo Ishii
   *
!  * src/backend/utils/mb/mbutils.c
   */
  #include "postgres.h"
  
--- 1,36 
! /*-
   *
!  * mbutils.c
!  *	  This file contains functions for encoding conversion.
   *
!  * The string-conversion functions in this file share some API quirks.
!  * Note the following:
!  *
!  * The functions return a palloc'd, null-terminated string if conversion
!  * is required.  However, if no conversion is performed, the given source
!  * string pointer is returned as-is.
!  *
!  * Although the presence of a length argument means that callers can pass
!  * non-null-terminated strings, care is required because the same string
!  * will be passed back if no conversion occurs.  Such callers *must* check
!  * whether result == src and handle that case differently.
!  *
!  * If the source and destination encodings are the same, the source string
!  * is returned without any verification; it's assumed to be valid data.
!  * If that might not be the case, the caller is responsible for validating
!  * the string using a separate call to pg_verify_mbstr().  Whenever the
!  * source and destination encodings are different, the functions ensure that
!  * the result is validly encoded according to the destination encoding.
!  *
!  *
!  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
!  * Portions Copyright (c) 1994, Regents of the University of California
!  *
!  *
!  * IDENTIFICATION
!  *	  src/backend/utils/mb/mbutils.c
!  *
!  *-
   */
  #include "postgres.h"
  
*** InitializeClientEncoding(void)
*** 290,296 
  int
  pg_get_client_encoding(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding->encoding;
  }
  
--- 316,321 
*** pg_get_client_encoding(void)
*** 300,328 
  const char *
  pg_get_client_encoding_name(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding->name;
  }
  
  /*
!  * Apply encoding conversion on src and return it. The encoding
!  * conversion function is chosen from the pg_conversion system catalog
!  * marked as "default". If it is not found in the schema search path,
!  * it's taken from pg_catalog schema. If it even is not in the schema,
!  * warn and return src.
!  *
!  * If conversion occurs, a palloc'd null-terminated string is returned.
!  * In the case of no conversion, src is returned.
!  *
!  * CAUTION: although the presence of a length 

Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 9:43 AM, Michael Paquier
 wrote:
> On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas  wrote:
>> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>>  wrote:
>>> On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut  wrote:
 On 2/5/14, 1:31 PM, Robert Haas wrote:
> On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut  wrote:
>> Perhaps this type should be called pglsn, since it's an
>> implementation-specific detail and not a universal concept like int,
>> point, or uuid.
>
> If we're going to do that, I suggest pg_lsn rather than pglsn.  We
> already have pg_node_tree, so using underscores for separation would
> be more consistent.

 Yes, that's a good precedent in multiple ways.
>>> Here are updated patches to use pg_lsn instead of pglsn...
>>
>> OK, so I think this stuff is all committed now, with assorted changes.
>>  Thanks for your work on this.
> Thanks!
> Oops, it looks like I am coming after the battle (time difference does
> not help). I'll be more careful to test such patches on 32b platforms
> as well in the future.
After re-reading the code, I found two incorrect comments in the new
regression tests. Patch fixing them is attached.
Thanks,
-- 
Michael
diff --git a/src/backend/utils/adt/pg_lsn.c b/src/backend/utils/adt/pg_lsn.c
index e2b528a..fae12e1 100644
--- a/src/backend/utils/adt/pg_lsn.c
+++ b/src/backend/utils/adt/pg_lsn.c
@@ -19,7 +19,7 @@
 #include "utils/pg_lsn.h"
 
 #define MAXPG_LSNLEN			17
-#define MAXPG_LSNCOMPONENT	8
+#define MAXPG_LSNCOMPONENT		8
 
 /*--
  * Formatting and conversion routines.
diff --git a/src/test/regress/expected/pg_lsn.out b/src/test/regress/expected/pg_lsn.out
index 01d2983..504768c 100644
--- a/src/test/regress/expected/pg_lsn.out
+++ b/src/test/regress/expected/pg_lsn.out
@@ -52,13 +52,13 @@ SELECT '0/16AE7F8' > pg_lsn '0/16AE7F7';
  t
 (1 row)
 
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
  ?column? 
 --
-1
 (1 row)
 
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;
  ?column? 
 --
 1
diff --git a/src/test/regress/sql/pg_lsn.sql b/src/test/regress/sql/pg_lsn.sql
index dddafb3..1634d37 100644
--- a/src/test/regress/sql/pg_lsn.sql
+++ b/src/test/regress/sql/pg_lsn.sql
@@ -21,5 +21,5 @@ SELECT '0/16AE7F8' = '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8'::pg_lsn != '0/16AE7F7';
 SELECT '0/16AE7F7' < '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8' > pg_lsn '0/16AE7F7';
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;



Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane  wrote:

> Haribabu Kommi  writes:
> > I want to propose a new feature called "priority table" or "cache table".
> > This is same as regular table except the pages of these tables are having
> > high priority than normal tables. These tables are very useful, where a
> > faster query processing on some particular tables is expected.
>
> Why exactly does the existing LRU behavior of shared buffers not do
> what you need?
>

Let's assume a database has 3 tables which are accessed regularly, and the
user expects faster query results on one of them. Because of the LRU
behaviour, that sometimes does not happen. So if we just separate that
table's pages into another buffer pool, then all the pages of that table
reside in memory and queries get faster processing.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 1:01 AM, David Fetter  wrote:
> On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote:
>> Heikki Linnakangas wrote:
>> > Add a GUC to report whether data page checksums are enabled.
>>
>> Is there are reason this wasn't back-patched to 9.3?  I think it should
>> be.
>
> +1 for back-patching.
Back-patching would be interesting for existing applications, but -1
as it is a new feature :)
-- 
Michael




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas  wrote:
> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>  wrote:
>> On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut  wrote:
>>> On 2/5/14, 1:31 PM, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut  wrote:
> Perhaps this type should be called pglsn, since it's an
> implementation-specific detail and not a universal concept like int,
> point, or uuid.

 If we're going to do that, I suggest pg_lsn rather than pglsn.  We
 already have pg_node_tree, so using underscores for separation would
 be more consistent.
>>>
>>> Yes, that's a good precedent in multiple ways.
>> Here are updated patches to use pg_lsn instead of pglsn...
>
> OK, so I think this stuff is all committed now, with assorted changes.
>  Thanks for your work on this.
Thanks!
Oops, it looks like I am coming after the battle (time difference does
not help). I'll be more careful to test such patches on 32b platforms
as well in the future.
Regards,
-- 
Michael




Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Tom Lane
Haribabu Kommi  writes:
> I want to propose a new feature called "priority table" or "cache table".
> This is same as regular table except the pages of these tables are having
> high priority than normal tables. These tables are very useful, where a
> faster query processing on some particular tables is expected.

Why exactly does the existing LRU behavior of shared buffers not do
what you need?

I am really dubious that letting DBAs manage buffers is going to be
an improvement over automatic management.

regards, tom lane




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue  writes:
>> (2014/02/12 3:03), Tom Lane wrote:
>>> Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile.
>>> Do we need that either?

> Sorry for the late reply.
> Attached is a patch to remove dllwrap from pgevent/Makefile.

Pushed, thanks.

regards, tom lane




[HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
Hi,

I want to propose a new feature called "priority table" or "cache table".
This is the same as a regular table, except that the pages of these tables
have higher priority than those of normal tables. These tables are very
useful where faster query processing on some particular tables is expected.

The same faster query processing can be achieved by placing the tables in a
tablespace on a RAM disk. In that case, however, there is a risk of data loss
on system shutdown; to avoid it, continuous backup of that tablespace and of
the WAL files is required. The priority table feature solves these problems
while providing similar functionality.

The user needs to decide carefully which tables require faster access and can
be declared as priority tables; these tables should also be small, in both
number of columns and size.


New syntax:

create [priority] Table ...;

or

Create Table .. [ buffer_pool = priority | default ];

That is, add a new storage parameter, buffer_pool, to specify the type of
buffer pool this table uses.

The same can be extended to indexes as well.
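A usage sketch of syntax-2, just to illustrate how I imagine it (the table
and column names are made up):

create table stock_price (symbol text, price numeric)
  with (buffer_pool = priority);
create index on stock_price (symbol) with (buffer_pool = priority);

Everything else behaves as for a regular table or index; only buffer
replacement treats these pages preferentially.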


Solution -1:

This solution may not be a proper one, but it is simple. When placing pages
of these tables into the buffer pool, the usage count is set to double the
maximum buffer usage count, instead of 1 as for normal tables. Because of
this there is less chance of these pages being moved out of the buffer pool.
Queries operating on these tables will be faster because of less I/O. If the
tables are not used for a long time, then only the first query on the table
will be slower and the rest of the queries will be faster.

Just for testing, a new bool member can be added to the RelFileNode structure
to indicate whether the table is a priority table or not. Using this, the
usage count can be modified while loading the page.

The pg_buffercache output of a priority table:

postgres=# select * from pg_buffercache where relfilenode=16385;
 bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount
----------+-------------+---------------+-------------+---------------+----------------+---------+------------
      270 |       16385 |          1663 |       12831 |             0 |              0 | t       |         10


Solution - 2:

Keep an extra flag in the buffer header to indicate whether the buffer is
used for a priority table or not. When replacing a buffer used for a priority
table, some extra steps need to be taken, such as:
1. Only another page of a priority table can replace this priority page.
2. A normal table page can replace it only after at least two complete
clock-sweep cycles.

In this case the priority buffers stay in memory for a long time, similar to
solution-1, but this is not always guaranteed.


Solution - 3:

Create an another buffer pool called "priority buffer pool" similar to
shared buffer pool to place the priority table pages. A new guc parameter
called "priority_buffers" can be added to the get the priority buffer pool
size from the user. The Maximum limit of these buffers can be kept smaller
value to make use of it properly.

As an extra care, whenever any page needs to move out of the priority
buffer pool a warning is issued, so that user can check whether the
configured the priority_buffers size is small or the priority tables are
grown too much as not expected?

In this case all the pages are always loaded into memory thus the queries
gets the faster processing.
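Whichever solution is chosen, the effect can be observed with the existing
pg_buffercache extension. For example, a standard query such as

select c.relname, count(*) as buffers
from pg_buffercache b
join pg_class c on b.relfilenode = pg_relation_filenode(c.oid)
where b.reldatabase in (0, (select oid from pg_database
                            where datname = current_database()))
group by c.relname
order by 2 desc limit 10;

shows how many buffers each relation currently occupies, so it is easy to
verify whether the priority tables really stay resident.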

IBM DB2 has the facility to create additional buffer pools and pin specific
tables and indexes into them. Oracle also has a facility to specify the
buffer pool option as keep or recycle.

I prefer syntax-2 and solution-3. Please provide your
suggestions/improvements.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Andres Freund
On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>  wrote:
> >> Yes, that's a good precedent in multiple ways.
> > Here are updated patches to use pg_lsn instead of pglsn...
> 
> OK, so I think this stuff is all committed now, with assorted changes.
>  Thanks for your work on this.

cool, thanks you two.

I wonder if pg_stat_replication shouldn't be updated to use it as well?
SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
that as names that are possible candidates for conversion.
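A slightly more targeted variant of that, in case it is useful (just joining
pg_class/pg_namespace to see which catalog relations those attributes belong
to and what type they currently have):

SELECT c.relname, a.attname,
       format_type(a.atttypid, a.atttypmod) AS current_type
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE a.attname ~ '(location|lsn)'
  AND n.nspname = 'pg_catalog'
ORDER BY 1, 2;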

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Emre Hasegeli  writes:
> [ cites bug #5705 ]

Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
opclasses are.  There was discussion at the time of just ripping them
out despite the compatibility hit.  We didn't do it, but if we had
then life would be simpler now.

Perhaps it would be acceptable to drop the btree_gist implementation
and teach pg_upgrade to refuse to upgrade if the old database contains
any such indexes.  I'm not sure that solves the problem, though, because
I think pg_upgrade will still fail if the opclass is in the old database
even though unused (because you'll get the complaint about multiple
default opclasses anyway).  The only simple way to avoid that is to not
have btree_gist loaded at all in the old DB, and I doubt that's an
acceptable upgrade restriction.  It would require dropping indexes of
the other types supported by btree_gist, which would be a pretty hard
sell since those aren't broken.

Really what we'd need here is for btree_gist to be upgraded to a version
that doesn't define gist_inet_ops (or at least doesn't mark it as default)
*before* pg_upgrading to a server version containing the proposed new
implementation.  Not sure how workable that is.  Could we get away with
having pg_upgrade unset the default flag in the old database?
(Ick, but there are no very good alternatives here ...)
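(A rough sketch of what I mean, assuming the opclasses in question are
btree_gist's gist_inet_ops and gist_cidr_ops -- this is only an illustration,
not something pg_upgrade does today:

UPDATE pg_catalog.pg_opclass SET opcdefault = false
WHERE opcname IN ('gist_inet_ops', 'gist_cidr_ops')
  AND opcmethod = (SELECT oid FROM pg_catalog.pg_am WHERE amname = 'gist');

run as superuser against the old cluster before the new opclass appears.)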

BTW, I'd not been paying any attention to this thread, but now that
I scan through it, I think the proposal to change the longstanding
names of the inet operators has zero chance of being accepted.
Consistency with the names chosen for range operators is a mighty
weak argument compared to backwards compatibility.

regards, tom lane




Re: [HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Heikki Linnakangas

On 02/19/2014 11:15 PM, Neil Thombre wrote:

And that is where I have a question. I noticed that in pg_standby.c when we
detect the word "fast" in the trigger file we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not "upsetting" the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command.  So, what is the reason for truncating the "fast"
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we "upset" the server if we don't?


At end-of-recovery, the server will fetch again the last WAL file that 
was replayed. If it can no longer find it, because restore_command now 
returns an error even though it succeeded for the same file a few seconds 
earlier, it will throw an error and refuse to start up.


That's the way it used to be until 9.2, anyway. In 9.2, the behavior was 
changed, so that the server keeps all the files restored from archive, 
in pg_xlog, so that it can access them again. I haven't tried, but it's 
possible that the truncation is no longer necessary. Try it, with 9.1 
and 9.3, and see what happens.


- Heikki




Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Emre Hasegeli
2014-02-19 16:52, Tom Lane :
> Not at all, AFAICS.  If it were okay to decide that some formerly-default
> opclass is no longer default, then having such a command would be better
> than manually manipulating pg_opclass.opcdefault --- but extension upgrade
> scripts could certainly do the latter if they had to.  The problem here
> is whether it's upgrade-safe to make such a change at all; having
> syntactic sugar doesn't make it safer.

It is not hard to support the != operator in the new operator class. That
would make it functionally compatible with the ones in btree_gist. That way,
changing the default would not break pg_dump dumps; it would only upgrade
the index to the new one.
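My understanding of the pg_dump side, for what it is worth: the operator
class is only spelled out when it is not the default for the column type, so
an index created on a made-up table like

CREATE TABLE hosts (addr inet);
CREATE INDEX hosts_addr_idx ON hosts USING gist (addr);

is dumped without naming gist_inet_ops, and on restore it is simply built
with whatever operator class is the default at that time.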

pg_upgrade dumps the current objects in the extension. It fails to restore
them if the new operator class exists as the default. I do not really
understand how pg_upgrade handles extension upgrades. I do not have a
solution to this.

It would be sad if we could not make the new operator class the default
because of the btree_gist implementation, which is not useful for the inet
data types. You wrote about it on 2010-10-11 [1]:

> Well, actually the btree_gist implementation for inet is a completely
> broken piece of junk: it thinks that convert_network_to_scalar is 100%
> trustworthy and can be used as a substitute for the real comparison
> functions, which isn't even approximately true.  I'm not sure why
> Teodor implemented it like that instead of using the type's real
> comparison functions, but it's pretty much useless if you want the
> same sort order that the type itself defines.

[1] http://www.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us




[HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Neil Thombre
I was trying to understand  (and then perhaps mimic) how pg_standby does a
fast failover.

My current understanding is that when a secondary db is in standby mode, it
will exhaust all the archive log to be replayed from the primary and then
start streaming. It is at this point that xlog.c checks for the existence
of a trigger file to promote the secondary. This has been a cause of some
irritation for some of our customers, who do not really care about catching
up all the way. I want to achieve the exact semantics of pg_standby's fast
failover option.

I manipulated the restore command to return 'failure' when the word "fast"
is present in the trigger file (see below), hoping that when I want a
secondary database to come out fast, I can just echo the word "fast" into
the trigger file thereby simulating pg_standby's fast failover behavior.
However, that did not work. Technically, I did not truncate the trigger file
the way pg_standby does.

 = ! fgrep -qsi fast  && 


And that is where I have a question. I noticed that in pg_standby.c when we
detect the word "fast" in the trigger file we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not "upsetting" the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command.  So, what is the reason for truncating the "fast"
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we "upset" the server if we don't?


Assuming this question is answered and I get a better understanding, I have
a follow up question. If  truncation is indeed necessary, can I simulate
the truncation by manipulating restore_command and achieve the same effect
as a fast failover in pg_standby?



Thanks in advance for the help.

Neil


Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Tom Lane
Alvaro Herrera  writes:
> My conclusion here is that the "time with time zone" datatype is broken
> in itself, because of this kind of ambiguity.

That's the conclusion that's been arrived at by pretty much everybody
who's looked at it with any care.

> Maybe we should just
> avoid offering more functionality on top of it, that is get rid of
> make_timetz() in this patch?

+1.  We don't need to encourage people to use that type.

regards, tom lane




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
Dne 19. 2. 2014 21:20 "Alvaro Herrera"  napsal(a):
>
> Pavel Stehule escribió:
>
> > I though about it, and now I am thinking so timezone in format
> > 'Europe/Prague' is together with time ambiguous
> >
> > We can do it, but we have to expect so calculation will be related to
> > current date - and I am not sure if it is correct, because someone can
> > write some like
> >
> > make_date(x,x,x) + make_timetz(..) - and result will be damaged.
>
> Hmm, I see your point --- the make_timetz() call would use today's
> timezone displacement, which might be different from the one used in the
> make_date() result.  That would result in a botched timestamptz
> sometimes, but it might escape testing because it's subtle and depends
> on the input data.
>
> However, your proposal is to use an abbreviation timezone, thereby
> forcing the user to select the correct timezone i.e. the one that
> matches the make_date() arguments.  I'm not sure this is much of an
> improvement, because then the user is faced with the difficult problem
> of figuring out the correct abbreviation in the first place.
>
> I think there is little we can do to solve the problem at this level; it
> seems to me that the right solution here is to instruct users to use
> make_date() only in conjunction with make_time(), that is, produce a
> timezone-less timestamp; and then apply a AT TIME ZONE operator to the
> result.  That could take a full timezone name, and that would always
> work correctly.
>
> My conclusion here is that the "time with time zone" datatype is broken
> in itself, because of this kind of ambiguity.  Maybe we should just
> avoid offering more functionality on top of it, that is get rid of
> make_timetz() in this patch?
>

+1

Pavel
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services


Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Alvaro Herrera
Pavel Stehule escribió:

> I though about it, and now I am thinking so timezone in format
> 'Europe/Prague' is together with time ambiguous
> 
> We can do it, but we have to expect so calculation will be related to
> current date - and I am not sure if it is correct, because someone can
> write some like
> 
> make_date(x,x,x) + make_timetz(..) - and result will be damaged.

Hmm, I see your point --- the make_timetz() call would use today's
timezone displacement, which might be different from the one used in the
make_date() result.  That would result in a botched timestamptz
sometimes, but it might escape testing because it's subtle and depends
on the input data.

However, your proposal is to use an abbreviation timezone, thereby
forcing the user to select the correct timezone i.e. the one that
matches the make_date() arguments.  I'm not sure this is much of an
improvement, because then the user is faced with the difficult problem
of figuring out the correct abbreviation in the first place.

I think there is little we can do to solve the problem at this level; it
seems to me that the right solution here is to instruct users to use
make_date() only in conjunction with make_time(), that is, produce a
timezone-less timestamp; and then apply a AT TIME ZONE operator to the
result.  That could take a full timezone name, and that would always
work correctly.
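That is, something along these lines (illustrative values only):

SELECT (make_date(2014, 2, 19) + make_time(10, 10, 10))
         AT TIME ZONE 'Europe/Prague';

The date + time addition yields a plain timestamp, and AT TIME ZONE then
interprets it in the named zone, taking the correct offset for that date.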

My conclusion here is that the "time with time zone" datatype is broken
in itself, because of this kind of ambiguity.  Maybe we should just
avoid offering more functionality on top of it, that is get rid of
make_timetz() in this patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
2014-02-19 19:01 GMT+01:00 Alvaro Herrera :

> Pavel Stehule escribió:
>
> > > 7) Why do the functions accept only the timezone abbreviation, not the
> > >full name? I find it rather confusing, because the 'timezone' option
> > >uses the full name, and we're using this as the default. But doing
> > >'show timestamp' and using the returned value fails. Is it possible
> > >to fix this somehow?
> >
> > Only an abbreviation is allowed for the timetz type. Timestamp can work with
> > full time zone names. The rules (behaviour) should be the same as for the
> > input functions of the timestamptz and timetz types.
> >
> > postgres=# select '10:10:10 CET'::timetz;
> >timetz
> > ─
> >  10:10:10+01
> > (1 row)
> >
> > postgres=# select '10:10:10 Europe/Prague'::timetz;
> > ERROR:  invalid input syntax for type time with time zone: "10:10:10
> > Europe/Prague"
> > LINE 1: select '10:10:10 Europe/Prague'::timetz;
> >^
> >
> > This limit comes from the limits of the routines used.
>
> I think this is a strange limitation, and perhaps it should be fixed
> rather than inflicting the limitation on the new function.
>

I thought about it, and now I think that a time zone given in the format
'Europe/Prague' is, combined with a time alone, ambiguous.

We can do it, but we have to accept that the calculation will be related to
the current date - and I am not sure that is correct, because someone can
write something like

make_date(x,x,x) + make_timetz(..) - and the result will be damaged.



>
> I tweaked your patch a bit, attached; other than defining what to do
> about full TZ names in timetz, this seems ready to commit.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Euler Taveira
On 18-02-2014 06:33, Andres Freund wrote:
> I really hope there will be nicer ones by the time 9.4 is
> released. Euler did send in a json plugin
> http://archives.postgresql.org/message-id/52A5BFAE.1040209%2540timbira.com.br
> , but there hasn't too much feedback yet. It's hard to start discussing
> something that needs a couple of patches to pg before you can develop
> your own patch...
> 
BTW, I've updated that code to reflect the recent changes in the API and
published it at [1]. This version is based on Andres' branch
xlog-decoding-rebasing-remapping. I'll continue to polish this code.


Regards,


[1] https://github.com/eulerto/wal2json


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
2014-02-19 19:01 GMT+01:00 Alvaro Herrera :

> Pavel Stehule escribió:
>
> > > 7) Why do the functions accept only the timezone abbreviation, not the
> > >full name? I find it rather confusing, because the 'timezone' option
> > >uses the full name, and we're using this as the default. But doing
> > >'show timestamp' and using the returned value fails. Is it possible
> > >to fix this somehow?
> >
> > Only an abbreviation is allowed for the timetz type. Timestamp can work with
> > full time zone names. The rules (behaviour) should be the same as for the
> > input functions of the timestamptz and timetz types.
> >
> > postgres=# select '10:10:10 CET'::timetz;
> >timetz
> > ─
> >  10:10:10+01
> > (1 row)
> >
> > postgres=# select '10:10:10 Europe/Prague'::timetz;
> > ERROR:  invalid input syntax for type time with time zone: "10:10:10
> > Europe/Prague"
> > LINE 1: select '10:10:10 Europe/Prague'::timetz;
> >^
> >
> > This limit comes from the limits of the routines used.
>
> I think this is a strange limitation, and perhaps it should be fixed
> rather than inflicting the limitation on the new function.
>
> I tweaked your patch a bit, attached; other than defining what to do
> about full TZ names in timetz, this seems ready to commit.
>

I have no objection - thank you

Pavel


>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] PoC: Partial sort

2014-02-19 Thread Marti Raudsepp
On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp  wrote:
> With partial-sort-basic-1 and this fix on the same test suite, the
> planner overhead is now a more manageable 0.5% to 1.3%; one test is
> faster by 0.5%.

Ping, Robert or anyone, does this overhead seem bearable or is that
still too much?

Do these numbers look conclusive enough or should I run more tests?

> I think the 1st patch now has a bug in initial_cost_mergejoin; you
> still pass the "presorted_keys" argument to cost_sort, making it
> calculate a partial sort cost

Ping, Alexander?

Regards,
Marti




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Tue, Feb 18, 2014 at 4:33 AM, Andres Freund  wrote:
> On 2014-02-17 21:35:23 -0500, Robert Haas wrote:
>> What
>> I don't understand is why we're not taking the test_decoding module,
>> polishing it up a little to produce some nice, easily
>> machine-parseable output, calling it basic_decoding, and shipping
>> that.  Then people who want something else can build it, but people
>> who are happy with something basic will already have it.
>
> Because every project is going to need their own plugin
> *anyway*. Londiste and Slony are surely going to ignore changes to
> relations they don't need, querying their own metadata. They will want
> compatibility with their earlier formats as far as possible. Sometime not
> too far away they will want to optionally support binary output because
> it's so much faster.
> There's just not much chance that either of these will be able to agree
> on a format short term.

Ah, so part of what you're expecting the output plugin to do is
filtering.  I can certainly see where there might be considerable
variation between solutions in that area - but I think that's separate
from the question of formatting per se.  Although I think we should
have an in-core output plugin with filtering capabilities eventually,
I'm happy to define that as out of scope for 9.4.  But isn't there a
way that we can ship something that will do for people who want to
just see the database's entire change stream float by?

TBH, as compared to what you've got now, I think this mostly boils
down to a question of quoting and escaping.  I'm not really concerned
with whether we ship something that's perfectly efficient, or that has
filtering capabilities, or that has a lot of fancy bells and whistles.
 What I *am* concerned about is that if the user updates a text field
that contains characters like " or ' or : or [ or ] or , that somebody
might be using as delimiters in the output format, that a program can
still parse that output format and reliably determine what the actual
change was.  I don't care all that much whether we use JSON or CSV or
something custom, but the data that gets spit out should not have
SQL-injection-like vulnerabilities.
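
To make that concrete, here is a minimal sketch (not test_decoding's actual
code; the helper name and the exact quoting rules are invented) of how an
output plugin that builds its records in a StringInfo could quote text
values so embedded delimiters can't be confused with the format's own
structure:

#include "postgres.h"
#include "lib/stringinfo.h"

/*
 * Append a text value to the output buffer surrounded by double quotes,
 * backslash-escaping any embedded quote or backslash so a consumer can
 * always find the end of the value.
 */
static void
append_quoted_text(StringInfo out, const char *val)
{
	const char *p;

	appendStringInfoChar(out, '"');
	for (p = val; *p != '\0'; p++)
	{
		if (*p == '"' || *p == '\\')
			appendStringInfoChar(out, '\\');
		appendStringInfoChar(out, *p);
	}
	appendStringInfoChar(out, '"');
}

Whether the framing is JSON, CSV or something custom matters less than
having an escaping rule like this applied consistently.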

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Draft release notes up for review

2014-02-19 Thread Marti Raudsepp
On Sun, Feb 16, 2014 at 10:41 PM, Tom Lane  wrote:
> Any comments before I start transposing them into the back branches?

Sorry I'm late.

> Shore up GRANT ... WITH ADMIN OPTION restrictions (Noah Misch)

I'm not familiar with the phrase "Shore up"; I think it should use
more precise language: are the privilege checks getting more strict or
less strict?



Wow, there are quite a lot of items this time. Have you considered
grouping the items by their impact, for example security/data
corruption/crash/correctness/other? I think that would make it easier
for readers to find items they're interested in. Most changes seem
pretty straightforward to categorize; there are always outliers, but
even if a few items are miscategorized, that's an improvement over
what we have now. Of course someone has to be willing to do that work.

If this warrants more discussion, I can draft out a proposal in a new topic.

Regards,
Marti




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Tue, Feb 18, 2014 at 4:07 AM, Andres Freund  wrote:
>> 2. I think the snapshot-export code is fundamentally misdesigned.  As
>> I said before, the idea that we're going to export one single snapshot
>> at one particular point in time strikes me as extremely short-sighted.
>
> I don't think so. It's precisely what you need to implement a simple
> replication solution. Yes, there are usecases that could benefit from
> more possibilities, but that's always the case.
>
>>  For example, consider one-to-many replication where clients may join
>> or depart the replication group at any time.  Whenever somebody joins,
>> we just want a <snapshot, LSN> pair such that they can apply all
>> changes after the LSN except for XIDs that would have been visible to
>> the snapshot.
>
> And? They need to create individual replication slots, which each will
> get a snapshot.

So we have to wait for startup N times, and transmit the change stream
N times, instead of once?  Blech.

>> And in fact, we don't even need any special machinery
>> for that; the client can just make a connection and *take a snapshot*
>> once decoding is initialized enough.
>
> No, they can't. Two reasons: For one the commit order between snapshots
> and WAL isn't necessarily the same.

So what?

> For another, clients now need logic
> to detect whether a transaction's contents have already been applied or
> not yet; that's nontrivial.

My point is, I think we should be trying to *make* that trivial,
rather than doing this.

>> 3. As this feature is proposed, the only plugin we'll ship with 9.4 is
>> a test_decoding plugin which, as its own documentation says, "doesn't
>> do anything especially useful."  What exactly do we gain by forcing
>> users who want to make use of these new capabilities to write C code?
>
> It gains us an output plugin in which we can easily demonstrate
> features so they can be tested in the regression tests, which I find to
> be rather important.
> Just like e.g. the test_shm_mq stuff doesn't do anything really useful.

It definitely doesn't, but this patch is a lot closer to being done
than parallel query is, so I'm not sure it's a fair comparison.

>> The test_decoding plugin doesn't seem tremendously
>> much simpler than something that someone could actually use, so why
>> not make that the goal?
>
> For one, it being a designated toy plugin allows us to easily change it,
> to showcase/test new features. For another, I still don't agree that
> it's easy to agree on an output format. I think we should include some
> that matured into 9.5.

I regret that more effort has not been made in that area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Alvaro Herrera
Pavel Stehule escribió:

> > 7) Why do the functions accept only the timezone abbreviation, not the
> >full name? I find it rather confusing, because the 'timezone' option
> >uses the full name, and we're using this as the default. But doing
> >'show timestamp' and using the returned value fails. Is it possible
> >to fix this somehow?
> 
> Only an abbreviation is allowed for the timetz type. Timestamp can work
> with full time zone names. The rules (behavior) should be the same as for
> the input functions of the types timestamptz and timetz.
> 
> postgres=# select '10:10:10 CET'::timetz;
>timetz
> ─
>  10:10:10+01
> (1 row)
> 
> postgres=# select '10:10:10 Europe/Prague'::timetz;
> ERROR:  invalid input syntax for type time with time zone: "10:10:10
> Europe/Prague"
> LINE 1: select '10:10:10 Europe/Prague'::timetz;
>^
> 
> This limit is due to the limits of the routines used.

I think this is a strange limitation, and perhaps it should be fixed
rather than inflicting the limitation on the new function.

I tweaked your patch a bit, attached; other than defining what to do
about full TZ names in timetz, this seems ready to commit.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be548d7..69eb45f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6725,6 +6725,32 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');

 
  
+  make_interval
+ 
+ 
+  
+   make_interval(years int DEFAULT 0,
+   months int DEFAULT 0,
+   weeks int DEFAULT 0,
+   days int DEFAULT 0,
+   hours int DEFAULT 0,
+   mins int DEFAULT 0,
+   secs double precision DEFAULT 0.0)
+  
+ 
+
+interval
+
+ Create interval from years, months, weeks, days, hours, minutes and
+ seconds fields
+
+make_interval(days := 10)
+10 days
+   
+
+   
+
+ 
   make_time
  
  
@@ -6746,6 +6772,81 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');

 
  
+  make_timetz
+ 
+ 
+  
+   make_timetz(hour int,
+   min int,
+   sec double precision,
+timezone text )
+  
+ 
+
+time with time zone
+
+ Create time with time zone from hour, minute and seconds fields. When
+ timezone is not specified, then current time zone
+ is used.
+
+make_timetz(8, 15, 23.5)
+08:15:23.5+01
+   
+
+   
+
+ 
+  make_timestamp
+ 
+ 
+  
+   make_timestamp(year int,
+   month int,
+   day int,
+   hour int,
+   min int,
+   sec double precision)
+  
+ 
+
+timestamp
+
+ Create timestamp from year, month, day, hour, minute and seconds fields
+
+make_timestamp(2013, 7, 15, 8, 15, 23.5)
+2013-07-15 08:15:23.5
+   
+
+   
+
+ 
+  make_timestamptz
+ 
+ 
+  
+   make_timestamptz(year int,
+   month int,
+   day int,
+   hour int,
+   min int,
+   sec double precision,
+timezone text )
+  
+ 
+
+timestamp with time zone
+
+ Create timestamp with time zone from year, month, day, hour, minute
+ and seconds fields. When timezone is not specified,
+ then current time zone is used.
+
+make_timestamptz(2013, 7, 15, 8, 15, 23.5)
+2013-07-15 08:15:23.5+01
+   
+
+   
+
+ 
   now
  
  now()
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f02efec..d0852f4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -819,3 +819,9 @@ CREATE OR REPLACE FUNCTION
 CREATE OR REPLACE FUNCTION
   json_populate_recordset(base anyelement, from_json json, use_json_as_text boolean DEFAULT false)
   RETURNS SETOF anyelement LANGUAGE internal STABLE ROWS 100  AS 'json_populate_recordset';
+
+CREATE OR REPLACE FUNCTION
+  make_interval(years int4 DEFAULT 0, months int4 DEFAULT 0, weeks int4 DEFAULT 0,
+days int4 DEFAULT 0, hours int4 DEFAULT 0, mins int4 DEFAULT 0,
+secs double precision DEFAULT 0.0)
+  RETURNS interval STRICT IMMUTABLE LANGUAGE internal AS 'make_interval';
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 06cc0cd..ba801e4 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -1106,6 +1106,7 @@ time_in(PG_FUNCTION_ARGS)
 static int
 tm2time(struct p

Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Sun, Feb 16, 2014 at 12:12 PM, Andres Freund  wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund  
>> wrote:
>> > [ new patches ]
>>
>> 0001 already needs minor
>>
>> + * copied stuff from tuptoaster.c. Perhaps there should be toast_internal.h?
>>
>> Yes, please.  If you can submit a separate patch creating this file
>> and relocating this stuff there, I will commit it.
>
> I started to work on that, but I am not sure we actually need it
> anymore. tuptoaster.h isn't included in that many places, so perhaps we
> should just add it there?

That seems fine to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Sat, Feb 15, 2014 at 6:59 PM, Andres Freund  wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund  
>> wrote:
>> > [ new patches ]
>>
>> 0001 already needs minor
>
> Hm?
>
> If there are conflicts, I'll push/send a rebased tomorrow or monday.

As you guessed, the missing word was "rebasing".  It's a trivial
conflict though, so please don't feel the need to repost just for
that.

>> +* the transaction's invalidations. This currently won't be needed if
>> +* we're just skipping over the transaction, since that currently 
>> only
>> +* happens when starting decoding, after we invalidated all caches, 
>> but
>> +* transactions in other databases might have touched shared 
>> relations.
>>
>> I'm having trouble understanding what this means, especially the part
>> after the "but".
>
> Let me rephrase, maybe that will help:
>
> This currently won't be needed if we're just skipping over the
> transaction because currently we only do so during startup, to get to
> the first transaction the client needs. As we have reset the catalog
> caches before starting to read WAL and we haven't yet touched any
> catalogs, there can't be anything to invalidate.
>
> But if we're "forgetting" this commit because it happened in
> another database, the invalidations might be important, because they
> could be for shared catalogs and we might have loaded data into the
> relevant syscaches.
>
> Better?

Much!  Please include that text, or something like it.

>> +   if (IsTransactionState() &&
>> +   GetTopTransactionIdIfAny() != InvalidTransactionId)
>> +   ereport(ERROR,
>> +   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
>> +errmsg("cannot perform changeset
>> extraction in transaction that has performed writes")));
>>
>> This is sort of an awkward restriction, as it makes it hard to compose
>> this feature with others.  What underlies the restriction, can we get
>> rid of it, and if not can we include a comment here explaining why it
>> exists?
>
> Well, you can write the result into a table if you're halfway
> careful... :)
>
> I think it should be fairly easy to relax the restriction to creating a
> slot, but not getting data from it. Do you think that would be
> sufficient?

That would be a big improvement, for sure, and might be entirely sufficient.

>> +   /* register output plugin name with slot */
>> +   strncpy(NameStr(slot->data.plugin), plugin,
>> +   NAMEDATALEN);
>> +   NameStr(slot->data.plugin)[NAMEDATALEN - 1] = '\0';
>>
>> If it's safe to do this without a lock, I don't know why.
>
> Well, the worst that could happen is that somebody else doing a SELECT *
> FROM pg_replication_slot gets an incomplete plugin name... But we
> certainly can hold the spinlock during it if you think that's better.

Isn't the worst thing that can happen that they copy garbage out of
the buffer, because the new name is longer than the old and only
partially written?
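
If we want to close that hole, something like this sketch seems cheap
enough; it assumes the slot struct exposes a spinlock member named mutex,
as in the patch, and the helper name is invented:

#include "postgres.h"
#include "replication/slot.h"
#include "storage/spin.h"

/*
 * Copy the output plugin name into the slot while holding the slot's
 * spinlock, so a concurrent reader can never see a partially written name.
 */
static void
slot_set_plugin_name(ReplicationSlot *slot, const char *plugin)
{
	SpinLockAcquire(&slot->mutex);
	strlcpy(NameStr(slot->data.plugin), plugin, NAMEDATALEN);
	SpinLockRelease(&slot->mutex);
}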

>> More broadly, I wonder why this is_init stuff is in here at all.
>> Maybe the stuff that happens in the is_init case should be done in the
>> caller, or another helper function.
>
> The reason I moved it in there is that after the walsender patch there
> are two callers and the stuff is sufficiently complex that I having it
> twice lead to bugs.
> The reason it's currenlty the same function is that sufficiently much of
> the code would have to be shared that I found it it ugly to split. I'll
> have a look whether I can figure something out.

I was thinking that the is_init portion could perhaps be done first,
in a *previous* function call, and adjusted in such a way that the
non-is-init path can be used for both cases here.

>> I don't think this is a very good idea.  The problem with doing things
>> during error recovery that can themselves fail is that you'll lose the
>> original error, which is not cool, and maybe even blow out the error
>> stack.  Many people have confuse PG_TRY()/PG_CATCH() with an
>> exception-handling system, but it's not.  One way to fix this is to
>> put some of the initialization logic in ReplicationSlotCreate() just
>> prior to calling CreateSlotOnDisk().  If the work that needs to be
>> done is too complex or protracted to be done there, then I think that
>> it should be pulled out of the act of creating the replication slot
>> and made to happen as part of first use, or as a separate operation
>> like PrepareForLogicalDecoding.
>
> I think what should be done here is adding a drop_on_release flag. As
> soon as everything important is done, it gets unset.

That might be more elegant, but I don't think it really fixes
anything, because backing stuff out from on disk can fail.  AIUI, your
whole concern here is that you don't want the slot creation to fail
halfway through and leave behind the slot, but wh

Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 wrote:
> On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut  wrote:
>> On 2/5/14, 1:31 PM, Robert Haas wrote:
>>> On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut  wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.
>>>
>>> If we're going to do that, I suggest pg_lsn rather than pglsn.  We
>>> already have pg_node_tree, so using underscores for separation would
>>> be more consistent.
>>
>> Yes, that's a good precedent in multiple ways.
> Here are updated patches to use pg_lsn instead of pglsn...

OK, so I think this stuff is all committed now, with assorted changes.
 Thanks for your work on this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [bug fix] "pg_ctl stop" times out when it should respond quickly

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 11:29 AM, Alvaro Herrera
 wrote:
> The pg_regress part is ugly.  However, pg_regress is doing something
> unusual when starting postmaster itself, so the ugly coding to stop it
> seems to match.  If we wanted to avoid the ugliness here, the right fix
> would be to use pg_ctl to start postmaster as well as to stop it.

I wonder if this would change the behavior in cases where we hit ^C
during the regression tests.  Right now I think that kills the
postmaster as well as pg_regress, but if we used pg_ctl, it might not,
because pg_regress uses fork()+exec(), but pg_ctl uses system() to
launch a shell which is in turn instructed to background the
postmaster.
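
Simplified to the point in question (this is not the actual pg_regress or
pg_ctl code), the two launch styles look roughly like this; whether the
terminal's SIGINT still reaches the backgrounded server in the second case
is exactly what I'm unsure about:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* pg_regress style: fork()+exec() the server directly as a child process. */
static pid_t
launch_direct(const char *postgres_path)
{
	pid_t		pid = fork();

	if (pid == 0)
	{
		execl(postgres_path, postgres_path, (char *) NULL);
		_exit(1);				/* only reached if exec fails */
	}
	return pid;
}

/* pg_ctl style: run a shell via system() and have it background the server. */
static void
launch_via_shell(const char *postgres_path)
{
	char		cmd[1024];

	snprintf(cmd, sizeof(cmd), "\"%s\" >/dev/null 2>&1 &", postgres_path);
	(void) system(cmd);
}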

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Description for pg_replslot in docs

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 10:50 PM, Michael Paquier
 wrote:
> On Tue, Feb 18, 2014 at 12:43 PM, Amit Kapila  wrote:
>> Description for contents of PGDATA is mentioned at
>> following page in docs:
>> http://www.postgresql.org/docs/devel/static/storage-file-layout.html
>>
>> Isn't it better to have description of pg_replslot in the same
>> place?
> Definitely. +1.

Yep, good catch.  Fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 11:03 AM, Greg Stark  wrote:
> On Wed, Feb 19, 2014 at 3:11 PM, Robert Haas  wrote:
>> Hopefully the commit I just pushed will fix it.  It now works on my
>> machine with and without --disable-float8-byval.
>
> It builds and passes here on 32bits

The buildfarm seems happy so far, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Greg Stark
On Wed, Feb 19, 2014 at 3:11 PM, Robert Haas  wrote:
> Hopefully the commit I just pushed will fix it.  It now works on my
> machine with and without --disable-float8-byval.

It builds and passes here on 32bits


-- 
greg




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark  wrote:
> On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs  wrote:
>
>> Agreed; that was the original plan, but implementation delays
>> prevented the whole vision/discussion/implementation. Requirements
>> from various areas include WAL rate limiting for replication, I/O rate
>> limiting, hard CPU and I/O limits for security and mixed workload
>> coexistence.
>>
>> I'd still like to get something on this in 9.4 that alleviates the
>> replication issues, leaving wider changes for later releases.
>
> My first reaction was that we should just have generic I/O resource
> throttling. I was only convinced this was a reasonable idea by the
> replication use case. It would help me to understand the specific
> situations where replication breaks down due to WAL bandwidth
> starvation. Heroku has had some problems with slaves falling behind,
> though the immediate problem that causes is the slave filling up its
> disk, which we could solve more directly by switching to archive mode
> rather than slowing down the master.
>
> But I would suggest you focus on a specific use case that's
> problematic so we can judge better if the implementation is really
> fixing it.
>
>> The vacuum_* parameters don't allow any control over WAL production,
>> which is often the limiting factor. I could, for example, introduce a
>> new parameter for vacuum_cost_delay that provides a weighting for each
>> new BLCKSZ chunk of WAL, then rename all parameters to a more general
>> form. Or I could forget that and just press ahead with the patch as
>> is, providing a cleaner interface in next release.
>>
>>> It's also interesting to wonder about the relationship to
>>> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
>>> that that's *cheap* (1 test and branch) as long as nothing is pending.
>>> I don't want to see a bunch of arithmetic added to it.
>>
>> Good point.
>
> I think it should be possible to actually merge it into
> CHECK_FOR_INTERRUPTS. Have a single global flag
> io_done_since_check_for_interrupts which is set to 0 after each
> CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
> CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
> in the normal case.
>
> In fact you could do all the arithmetic when you do the wal write.
> Only set the flag if the bandwidth consumed is above the budget. Then
> the flag should only ever be set when you're about to sleep.
>
> I would dearly love to see generic I/O bandwidth limits, so it would
> be nice to see a nicely general pattern here that could be extended
> even if we only target wal this release.
>
> I'm going to read the existing patch now, do you think it's ready to
> go or did you want to do more work based on the feedback?

Well, *I* don't think this is ready to go.  A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread David Fetter
On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> > Add a GUC to report whether data page checksums are enabled.
> 
> Is there a reason this wasn't back-patched to 9.3?  I think it should
> be.

+1 for back-patching.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Bernd Helmle



--On 18. Februar 2014 22:23:59 +0200 Heikki Linnakangas  
wrote:



I considered it a new feature, so not back-patching was the default. If
you want to back-patch it, I won't object.


That was my original feeling, too, but +1 for backpatching.

--
Thanks

Bernd




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 10:03 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund  
>> wrote:
>>> GET_8_BYTES only exists for 64bit systems.
>
>> Right, I got that far.  So it looks like float8, int8, timestamp,
>> timestamptz, and money all have behavior contingent on
>> USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
>> since we've already marched pretty far down that path I suppose we
>> should keep marching.
>
> You need somebody to help you with getting that working on 32-bit
> platforms?  Because it needs to get fixed, or reverted, PDQ.

Hopefully the commit I just pushed will fix it.  It now works on my
machine with and without --disable-float8-byval.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Tom Lane
Robert Haas  writes:
> On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund  wrote:
>> GET_8_BYTES only exists for 64bit systems.

> Right, I got that far.  So it looks like float8, int8, timestamp,
> timestamptz, and money all have behavior contingent on
> USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
> since we've already marched pretty far down that path I suppose we
> should keep marching.

You need somebody to help you with getting that working on 32-bit
platforms?  Because it needs to get fixed, or reverted, PDQ.

regards, tom lane




Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane  wrote:
>> We should probably expend some thought on a general approach to
>> replacing the default opclass for a datatype, because I'm sure this
>> will come up again.  Right now I don't see a feasible way.

> It seems odd for "default" to be a property of the opclass rather than
> a separate knob.  What comes to mind is something like:
> ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;
> Not sure how much that helps.

Not at all, AFAICS.  If it were okay to decide that some formerly-default
opclass is no longer default, then having such a command would be better
than manually manipulating pg_opclass.opcdefault --- but extension upgrade
scripts could certainly do the latter if they had to.  The problem here
is whether it's upgrade-safe to make such a change at all; having
syntactic sugar doesn't make it safer.

regards, tom lane




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-19 Thread MauMau

From: "Jeff Janes" 

I thought that this was the point I was making, not the point I was
missing.  You have the same hard drives you had before, but now due to a
software improvement you are cramming 5 times more stuff through them.
Yeah, you will get bigger latency spikes.  Why wouldn't you?  You are now
beating the snot out of your hard drives, whereas before you were not.

If you need 10,000 TPS, then you need to upgrade to 9.4.  If you need it
with low maximum latency as well, then you probably need to get better IO
hardware as well (maybe not--maybe more tuning could help).  With 9.3 you
didn't need better IO hardware, because you weren't capable of maxing out
what you already had.  With 9.4 you can max it out, and this is a good
thing.

If you need 10,000 TPS but only 2000 TPS are completing under 9.3, then
what is happening to the other 8000 TPS? Whatever is happening to them, it
must be worse than a latency spike.

On the other hand, if you don't need 10,000 TPS, then measuring max
latency at 10,000 TPS is the wrong thing to measure.


Thank you, I've probably got the point --- you mean the hard disk for WAL is 
the bottleneck.  But I still wonder a bit why the latency spike became so 
much bigger even with # of clients fewer than # of CPU cores.  I suppose the 
requests get processed more smoothly when the number of simultaneous 
requests is small.  Anyway, I want to believe the latency spike would become 
significantly smaller on an SSD.


Regards
MauMau






Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund  wrote:
> On 2014-02-19 09:24:03 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
>>  wrote:
>> > On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
>> >  wrote:
>> >> Here are updated patches to use pg_lsn instead of pglsn...
>> > Should I register this patch somewhere to avoid having it lost in the void?
>> > Regards,
>>
>> Well, I committed this, but the buildfarm's deeply unhappy with it.
>> Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
>> some platforms...  and I'm not sure what to do about that, right
>> off-hand.
>
> The relevant bit probably is:
>
> pg_lsn.c: In function 'pg_lsn_out':
> pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' 
> [-Wimplicit-function-declaration]
>
> GET_8_BYTES only exists for 64bit systems.

Right, I got that far.  So it looks like float8, int8, timestamp,
timestamptz, and money all have behavior contingent on
USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
since we've already marched pretty far down that path I suppose we
should keep marching.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Andres Freund
On 2014-02-19 09:24:03 -0500, Robert Haas wrote:
> On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
>  wrote:
> > On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
> >  wrote:
> >> Here are updated patches to use pg_lsn instead of pglsn...
> > Should I register this patch somewhere to avoid having it lost in the void?
> > Regards,
> 
> Well, I committed this, but the buildfarm's deeply unhappy with it.
> Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
> some platforms...  and I'm not sure what to do about that, right
> off-hand.

The relevant bit probably is:

pg_lsn.c: In function 'pg_lsn_out':
pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' 
[-Wimplicit-function-declaration]

GET_8_BYTES only exists for 64bit systems.
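
A sketch of the portable route (macro names are illustrative, not
necessarily what should be committed) is to build the pg_lsn accessors on
DatumGetInt64()/Int64GetDatum(), which already cope with both pass-by-value
and pass-by-reference 8-byte datums, instead of GET_8_BYTES():

#include "postgres.h"
#include "access/xlogdefs.h"
#include "fmgr.h"

/*
 * DatumGetInt64/Int64GetDatum work on both 64-bit builds and 32-bit
 * (--disable-float8-byval) builds, unlike GET_8_BYTES/SET_8_BYTES.
 */
#define DatumGetLSN(X)		((XLogRecPtr) DatumGetInt64(X))
#define LSNGetDatum(X)		(Int64GetDatum((int64) (X)))

#define PG_GETARG_LSN(n)	DatumGetLSN(PG_GETARG_DATUM(n))
#define PG_RETURN_LSN(x)	return LSNGetDatum(x)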

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
 wrote:
> On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
>  wrote:
>> Here are updated patches to use pg_lsn instead of pglsn...
> Should I register this patch somewhere to avoid having it lost in the void?
> Regards,

Well, I committed this, but the buildfarm's deeply unhappy with it.
Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
some platforms...  and I'm not sure what to do about that, right
off-hand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Greg Stark
On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs  wrote:

> Agreed; that was the original plan, but implementation delays
> prevented the whole vision/discussion/implementation. Requirements
> from various areas include WAL rate limiting for replication, I/O rate
> limiting, hard CPU and I/O limits for security and mixed workload
> coexistence.
>
> I'd still like to get something on this in 9.4 that alleviates the
> replication issues, leaving wider changes for later releases.

My first reaction was that we should just have generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its
disk, which we could solve more directly by switching to archive mode
rather than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

> The vacuum_* parameters don't allow any control over WAL production,
> which is often the limiting factor. I could, for example, introduce a
> new parameter for vacuum_cost_delay that provides a weighting for each
> new BLCKSZ chunk of WAL, then rename all parameters to a more general
> form. Or I could forget that and just press ahead with the patch as
> is, providing a cleaner interface in next release.
>
>> It's also interesting to wonder about the relationship to
>> CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
>> that that's *cheap* (1 test and branch) as long as nothing is pending.
>> I don't want to see a bunch of arithmetic added to it.
>
> Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.

In fact you could do all the arithmetic when you do the wal write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.
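
Roughly, the shape I have in mind is something like this (all names
invented for illustration):

#include "postgres.h"
#include "miscadmin.h"

/*
 * Set by the WAL write path only when this backend has gone over its WAL
 * budget, so the flag is normally clear and the hot path below stays at
 * two tests and branches.
 */
extern volatile bool WalBudgetExceeded;

/* Hypothetical routine that sleeps long enough to pay the budget back. */
extern void WalThrottleSleep(void);

#define CHECK_FOR_INTERRUPTS_AND_WAL_BUDGET() \
	do { \
		CHECK_FOR_INTERRUPTS(); \
		if (WalBudgetExceeded) \
		{ \
			WalBudgetExceeded = false; \
			WalThrottleSleep(); \
		} \
	} while (0)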

I would dearly love to see generic I/O bandwidth limits, so it would
be nice to see a nicely general pattern here that could be extended
even if we only target wal this release.

I'm going to read the existing patch now, do you think it's ready to
go or did you want to do more work based on the feedback?




Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane  wrote:
> Emre Hasegeli  writes:
>> How about only removing the inet and the cidr operator classes
>> from btree_gist. btree-gist-drop-inet-v2.patch does that.
>
> I'm not sure which part of "no" you didn't understand, but to be
> clear: you don't get to break existing installations.
>
> Assuming that this opclass is sufficiently better than the existing one,
> it would sure be nice if it could become the default; but I've not seen
> any proposal in this thread that would allow that without serious upgrade
> problems.  I think the realistic alternatives so far are (1) new opclass
> is not the default, or (2) this patch gets rejected.
>
> We should probably expend some thought on a general approach to
> replacing the default opclass for a datatype, because I'm sure this
> will come up again.  Right now I don't see a feasible way.

It seems odd for "default" to be a property of the opclass rather than
a separate knob.  What comes to mind is something like:

ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;

Not sure how much that helps.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Etsuro Fujita

(2014/02/19 12:12), Kyotaro HORIGUCHI wrote:

At Tue, 18 Feb 2014 19:24:50 +0900, "Etsuro Fujita" wrote

From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]



I'm not sure about allowing ALTER TABLE against a parent table to affect
descendants even when some of them are foreign tables.  I think the rule
should be simple enough for users to understand; of course it should also
be consistent and backward compatible.


Yeah, the understandability is important.  But I think the
flexibility is also important.  In other words, I think it is a
bit too inflexible that we disallow e.g., SET STORAGE to be set
on an inheritance tree that contains foreign table(s) because
we disallow SET STORAGE to be set on foreign tables directly.


What use case needs such flexibility badly enough to outweigh the loss of
predictable ALTER TABLE behavior and/or code maintainability?  I don't
agree with the idea that ALTER TABLE implicitly affects foreign children,
for the reasons given upthread.  Also, a feature that has to be turned
on and off through special syntax seems to have little hope.


It is just my personal opinion, but I think it would be convenient for 
users to alter inheritance trees that contain foreign tables the same 
way as inheritance trees that don't contain any foreign tables, without 
making the user conscious of whether the inheritance tree contains 
foreign tables or not.  Imagine we have an inheritance tree that contains 
only plain tables and then add a foreign table as a child of the 
inheritance tree.  Without the flexibility, we would need to change the 
way of altering the structure of the inheritance tree (e.g., ADD 
CONSTRAINT) to a totally different one, immediately when adding the 
foreign table.  I don't think that would be easy to use.



What I think should be newly allowed to be set on foreign tables is

* ADD table_constraint
* DROP CONSTRAINT


As of 9.3

http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html


Consistency with the foreign server is not checked when a
column is added or removed with ADD COLUMN or DROP COLUMN, a
NOT NULL constraint is added, or a column type is changed with
SET DATA TYPE. It is the user's responsibility to ensure that
the table definition matches the remote side.


So I believe implicit and automatic application of any constraint
to foreign children is considerably dangerous.


We spent a lot of time discussing this issue, and the consensus is that 
it's users' fault if there are some tuples on the remote side violating 
a given constraint, as mentioned in the documentation.



* [NO] INHERIT parent_table


Is this usable for foreign children in an inheritance tree?  NO INHERIT
removes all foreign children, but INHERIT is a no-op.


I didn't express that clearly.  Sorry for that.  Let me explain.

* ALTER FOREIGN TABLE target_table *INHERIT* parent_table: Add the 
target table as a new child of the parent table.
* ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: Remove the 
target table from the list of children of the parent table.


Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Andres Freund
On 2014-02-19 00:55:03 -0800, Peter Geoghegan wrote:
> On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund  
> wrote:
> > Was there an index only scan or just a index scan? Any chance of a
> > corrupted index?
> 
> Just an index scan. I think it's unlikely to be a corrupt index,
> because the customer said that he dropped and restored the index, and
> it's difficult to imagine how VACUUM FREEZE could then have impacted a
> plan with only a sequential scan.

I am wondering whether it's possibly either the vm or the all-visible
flag on the page...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund  wrote:
> Was there an index only scan or just a index scan? Any chance of a
> corrupted index?

Just an index scan. I think it's unlikely to be a corrupt index,
because the customer said that he dropped and restored the index, and
it's difficult to imagine how VACUUM FREEZE could then have impacted a
plan with only a sequential scan.

> Do you still have the page from before you did the VACUUM?

It seems likely that I could get it, given that the problem persisted
over several days (apparently there was an earlier, manual attempt to
repair the damage by hand). I'm reasonably confident that I could get
back an affected database by performing a point-in-time recovery. That
would also give me the benefit of an environment that I could break as
needed.


-- 
Peter Geoghegan




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Andres Freund
On 2014-02-18 18:10:02 -0800, Peter Geoghegan wrote:
> On Tue, Feb 18, 2014 at 5:50 PM, Alvaro Herrera
>  wrote:
> > Peter Geoghegan wrote:
> >> I've had multiple complaints of apparent data loss on 9.3.2 customer
> >> databases. There are 2 total, both complaints from the past week, one
> >> of which I was able to confirm. The customer's complaint is that
> >> certain rows are either visible or invisible, depending on whether an
> >> index scan is used or a sequential scan (I confirmed this with an
> >> EXPLAIN ANALYZE).
> >
> > The multixact bugs would cause tuples to be hidden at the heap level.
> > If the tuples are visible in a seqscan, then these are more likely to be
> > related to index problems, not multixact problem.
> 
> I guess I wasn't clear enough here: The row in question was visible
> with a sequential scan but *not* with an index scan. So you have it
> backwards here (understandably).

Was there an index only scan or just a index scan? Any chance of a
corrupted index?

Do you still have the page from before you did the VACUUM?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

