Re: [HACKERS] Why format() adds double quote?

2016-01-24 Thread Tatsuo Ishii
> On Wed, Jan 20, 2016 at 4:20 AM, Pavel Stehule wrote:
>>> If we would go this way, question is if we should back patch this or
>>> not since the patch apparently changes the existing
>>> behaviors. Comments?  I would think we should not.
>>
>> I am sure we should not backport this change. It could break customers'
>> regression tests - the current behavior isn't 100% correct, but it is safe.
> 
> Quite.  This is not a bug fix.  It's a behavior change, perhaps for the 
> better.

Added to the commitfest 2016-03.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-24 Thread Dean Rasheed
On 23 January 2016 at 23:04, Tom Lane  wrote:
> Noah Misch  writes:
>> On Sat, Jan 23, 2016 at 05:04:56PM -0500, Tom Lane wrote:
>>> Either I missed something or there's another issue, because tern/sungazer
>>> are *still* failing.  This is getting annoying :-(
>
>> sungazer's "make check" passes if I change init_degree_constants() to be
>> non-static.  Duping gcc isn't so easy these days.
>
> Ugh.  Well, at least we don't have to move it to another file, which was
> going to be my next larger size of hammer.
>
> Thanks for doing the legwork on this!
>

Hi, I'm just now catching up on my email after being out of town and
not reading it. Thanks for looking at this and sorting out those
issues, and thank you also Noah and Peter for your investigative work.

If I understand correctly there were 2 separate issues at play here:

1). On some platforms the compiler evaluates expressions like
sin(constant) and comes up with a slightly different result than a
runtime evaluation of the expression. The compiler-evaluated result is
presumably a 64-bit IEEE float, but at runtime it may be using
extended precision for intermediate results. That may well have been
the sole contributing factor to the fact that sind(30) wasn't exactly
0.5.

2). The compiler also sometimes rearranges expressions in ways that
don't give the same result as evaluating in the order suggested by the
parentheses. I think this actually explains the failure to get exactly
1 for tand(45). For x=45, this was being computed as

cosd_0_to_60(90 - x) / cosd_0_to_60(x)

so my guess is that it was inlining cosd_0_to_60(90 - x) and
rearranging it to produce something different from cosd_0_to_60(x) for
x=45.
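
To illustrate with a sketch (the wrapper name is hypothetical; the helper
is the thread's own):

/*
 * Sketch of the tand(45) computation described above.  Evaluated as
 * written, the numerator and denominator are the identical call at
 * x = 45, so the ratio is exactly 1.0; if the compiler inlines the
 * first call and rearranges it, the two sides can round differently.
 */
static double
tand_via_cosd(double x)
{
	return cosd_0_to_60(90 - x) / cosd_0_to_60(x);
}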


Looking at the new code, it's annoying how much effort was needed to
prevent the compiler messing it up. I ought to have realised that the
optimiser would be awkward for this kind of thing.

I wonder if the same could have been achieved by disabling
optimisation and inlining in those low-level functions, and also
wrapping sin(x * RADIANS_PER_DEGREE) in a similar non-inlinable,
non-optimised function to force it to be executed at runtime when
passed a constant. That would probably have made them significantly
slower though, whereas the new code benefits from various pre-computed
expressions.

Thanks again for fixing this.

Regards,
Dean




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Victor Wagner
On Fri, 22 Jan 2016 16:36:15 -0300
Alvaro Herrera  wrote:

> You're editing the expected file for the libpq-regress thingy, but you
> haven't added any new lines to test the new capability.  I think it'd
> be good to add some there.  (I already said this earlier in the
> thread; is there any reason you ignored it the first time?)

I seriously doubt that this program can be used to test the new
capabilities.

All it does is call PQconninfoParse and then examine some fields of
the PGconn structure.

The only reason I had to modify the expected output is that I changed
the usage of one of these fields: it now keeps a comma-separated list
of host:port pairs instead of just a hostname.

Thus the contents of this field after parsing some existing URIs have
changed, while the semantics of the URIs are the same.

If I add some new URIs, the only thing I can test is that the comma is
properly copied from the URI into this field, and maybe that some syntax
errors are properly detected.

 
> If the test program requires improvement to handle the new stuff,
> let's do that.

The only improvement I can think of is to examine the list of addrinfo
structures into which the host list is eventually parsed. But that is
quite problematic, because it depends on many factors which are outside
of our control.

It stores addresses resolved via the OS's name resolver.

For example, if we specify 'localhost' there, it can be parsed into one
or two records, depending on the presence of IPv6 support.

If we use some other hostname here, we'd rely on internet connectivity
and the DNS system, and we cannot ensure that any name-to-IP mapping
will persist for long enough.

So I think the new functionality needs a different approach to testing:
a test of a real connection to a real temporary cluster, probably based
on the Perl TAP framework that has recently seen active use in Postgres.
A minimal sketch of such a test follows.
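
# A minimal sketch of the kind of TAP test meant above; the PostgresNode
# API details and names here are assumptions, not an actual test file.
use strict;
use warnings;
use PostgresNode;
use TestLib;
use Test::More tests => 1;

# Start a real temporary cluster and make a real connection to it.
my $node = get_new_node('failover_test');
$node->init;
$node->start;

# With the failover patch, the connection string would list several
# host:port pairs; this only shows the shape of such a test.
my $result = $node->psql('postgres', 'SELECT 1');
is($result, '1', 'real connection to a temporary cluster succeeds');

$node->stop;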



-- 
   Victor Wagner 




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-24 Thread Jeff Janes
On Fri, Jan 22, 2016 at 7:19 AM, Anastasia Lubennikova wrote:
>
> Done. I hope that my patch is close to the commit too.
>

Thanks for the update.

I've run into this problem:

create table foobar (x text, w text);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;

ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
   ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.
Time: 1.577 ms


If I instead define the table as
create table foobar (x int, w xml);

Then I can create the index and then the primary key the first time I
do this in a session.  But if I then drop the table and repeat the
process, I get the "does not have default sorting behavior" error even
for this index that previously succeeded, so I think there is some kind
of problem with the backend syscache or catcache.

create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
drop table foobar ;
create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
   ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.

Cheers,

Jeff




Re: [HACKERS] Parallel Aggregate

2016-01-24 Thread Haribabu Kommi
On Sat, Jan 23, 2016 at 12:59 PM, Haribabu Kommi wrote:
>
> Here I attached updated patch with additional combine function for
> two stage aggregates also.

A wrong combine function was added to pg_aggregate.h in the earlier
patch, leading to an initdb problem. A corrected one is attached.

Regards,
Hari Babu
Fujitsu Australia


additional_combine_fns_v2.patch
Description: Binary data



Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-24 Thread Michael Paquier
On Mon, Jan 25, 2016 at 2:34 AM, Tom Lane  wrote:
> Perhaps we can fix this by rewriting as
>
> float8 numerator = 1.0 - cos(x * RADIANS_PER_DEGREE);
> return 1.0 - (numerator / one_minus_cos_60) / 2.0;
>
> cockatiel's compiler does recognize -fexcess-precision=standard, and
> my understanding of that is that the result put into "numerator" will
> be rounded to double width, so that it should then match
> "one_minus_cos_60".
>
> Another idea would be to change the cache variable to just "cos_60" and
> write "(1.0 - cos_60)" in the denominator --- but then we'd just be hoping
> that the compiler does both subtractions the same way, which doesn't seem
> necessarily guaranteed.  Worse, I believe the 8087 has an FCOS instruction
> which might deliver a wider-than-double result, so that maybe the problem
> is not so much with the subtraction as with when rounding of the cos()
> result happens.  The code I show above seems more likely to match the
> way one_minus_cos_60 is computed.

(Sorry for showing up after the storm.)
The fix is working correctly: using gcc's i686-pc-cygwin on cygwin32,
the regression does not show up anymore after 0034757.
-- 
Michael




Re: [HACKERS] PoC: Partial sort

2016-01-24 Thread Alexander Korotkov
Hi, Tomas!

On Sat, Jan 23, 2016 at 3:07 PM, Tomas Vondra wrote:

> On 10/20/2015 01:17 PM, Alexander Korotkov wrote:
>
>> On Fri, Oct 16, 2015 at 7:11 PM, Alexander Korotkov wrote:
>>
>> On Sun, Jun 7, 2015 at 11:01 PM, Peter Geoghegan wrote:
>>
>> On Sun, Jun 7, 2015 at 8:10 AM, Andreas Karlsson wrote:
>> > Are you planning to work on this patch for 9.6?
>>
>> FWIW I hope so. It's a nice patch.
>>
>>
>> I'm trying to whisk off the dust. A rebased version of the patch is
>> attached. This patch isn't passing regression tests because of plan
>> changes. I'm not yet sure about those changes: why do they happen, and
>> are they really regressions?
>> Since I'm not very familiar with the planning of INSERT ON CONFLICT and
>> RLS, any help is appreciated.
>>
>>
>> The planner regression is fixed in the attached version of the patch.
>> It appears that get_cheapest_fractional_path_for_pathkeys() behaved
>> wrongly when no ordering is required.
>>
>>
> Alexander, are you working on this patch? I'd like to look at the patch,
> but the last available version (v4) no longer applies - there's plenty of
> bitrot. Do you plan to send an updated / rebased version?
>

I'm sorry that I haven't found time for this yet. I'm certainly planning
to get back to it in the near future. The attached version is just
rebased, without any optimization.

> The main thing I'm particularly interested in is how much this is coupled
> with the Sort node, and whether it's possible to feed partially sorted
> tuples into other nodes.
>
> I'm particularly thinking about Hash Aggregate, because the partial sort
> allows to keep only the "current group" in a hash table, making it much
> more memory efficient / faster. What do you think?
>

This seems to me a very reasonable optimization. And it would be nice to
implement some generalized way of processing presorted groups. For
instance, we could have a special node, say "Group Scan", which has two
children: the source, and a node which processes every group. For
"partial sort" the second node would be a Sort node, but it could be a
Hash Aggregate node as well.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


partial-sort-basic-5.patch
Description: Binary data



Re: [HACKERS] PoC: Partial sort

2016-01-24 Thread Alexander Korotkov
Hi!

On Sat, Jan 23, 2016 at 10:07 PM, Peter Geoghegan  wrote:

> On Sat, Jan 23, 2016 at 4:07 AM, Tomas Vondra wrote:
> > The main thing I'm particularly interested in is how much this is coupled
> > with the Sort node, and whether it's possible to feed partially sorted
> > tuples into other nodes.
>
> That's cool, but I'm particularly interested in seeing Alexander get
> back to this because it's an important project on its own. We should
> really have this.
>

Thank you for your review and interest in this patch! I'm sorry for the
huge delay. I'm going to get back to this soon.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Tomasz Rybak
I'm merging all your emails for the sake of easier discussion.
I've also cut all fragments that do not require a response.

On Friday, 22.01.2016, at 11:06 +0800, Craig Ringer wrote:
> > We might also think about changing name of plugin to something
> > resembling "logical_streaming_decoder" or even "logical_streamer"
> > 
> I'm open to ideas there but I'd want some degree of consensus before
> undertaking the changes required. 

I know that it'd require many changes (both in this and the pglogical
plugin), and thus I don't want to press for a name change.
On one hand, renaming might be good to avoid tight mental coupling
between pglogical_output and pglogical. At the same time, it's a lot of
work, and I cannot think of any short and nice name, so
pglogical_output might stay, IMO.

 
> >  + subset of that database may be selected for replication,
> > currently based on
> > + table and on replication origin. Filtering by a WHERE clause can
> > be supported
> > + easily in future.
> > 
> > Is this filtering by table and replication origin implemented? I
> > haven't
> > noticed it in source.
> That's what the hooks are for.
> 

The current documentation suggests that replicating only a selected
subset is already available:
+ A subset of that database may be selected for replication, currently
+ based on table and on replication origin.

"currently based on table and on replication origin" means to me that
current state of plugin allows for just chosing which tables
to replicate. I'd see something like:

"A subset of that database might be selected for replication, e.g.
only chosen tables or changes from particular origin, in custom hook"

to convey that user needs to provide hook for filtering.

>  
> > + [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or
> > `START_REPLICATION SLOT ... LOGICAL ...` commands](http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html) to
> > start streaming changes. (It can also be used via
> > + [SQL level functions](http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
> > + over a non-replication connection, but this is mainly for
> > debugging purposes)
> > 
> > A replication slot can also be configured (causing the output plugin to
> > be loaded) via [SQL level functions]...
> Covered in the next section. Or at least it is in the SGML docs
> conversion I'm still trying to finish off..

OK, then I'll wait for the final version to review that.

>  
> > + * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL
> > 'pglogical'` if it's setting up for the first time
> > 
> > * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL
> > 'pglogical'` to setup replication if it's connecting for the first
> > time
> I disagree. It's entirely possible to do your slot creation/setup
> manually or via something else, re-use a slot first created by
> another node, etc. Slot creation is part of client setup, not so much
> connection.

I'd propose then:
' * Client issues "CREATE_REPLICATION_SLOT ..." if the replication
was not configured earlier, e.g. during a previous connection, or
manually via [SQL functions | link to documentation]'


> > + If your application creates its own slots on first use and hasn't
> > previously
> > + connected to this database on this system you'll need to create a
> > replication
> > + slot. This keeps track of the client's replay state even while
> > it's disconnected.
> > 
> > If your application hasn't previously connected to this database on
> > this system
> > it'll need to create and configure replication slot which keeps
> > track of the
> > client's replay state even while it's disconnected.
> As above, I don't quite agree.
>  

"If your application hasn't previously connedted to this database on
this system, and the replication slot was not configured through other
means (e.g. manually using [SQL functions | URL ] then you'll need
to create and configure replication slot ..."



> > 
> >  DESIGN.md:
> > 
> > + attnos don't necessarily correspond. The column names might, and
> > their ordering
> > + might even be the same, but any column drop or column type change
> > will result
> > 
> > The column names and their ordering might even be the same...
> I disagree, that has a different meaning. It's also not really user-
> facing docs so I'm not too worried about being quite as readable.
> 

I'm not trying to change the meaning, just to fix the grammar. I'd like
either more or fewer commas here. So either:

+ The column names (and their ordering) might ...

or:

+ The column names, and their ordering, might ...



> > Is it true (no way to access change data)? You added passing change
> > to C hooks; from looking at code it looks like it's true, but I
> > want to be sure.
> While the change data is now passed to the C hook, there's no attempt
> to expose it via PL/PgSQL. So yeah, that's still true.
> 

Thanks for confirming.

Speaking about flags - in most cases they are 0; only for
> > attributes
> > we might have: 
> > 

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-24 Thread Dilip Kumar
On Fri, Jan 22, 2016 at 3:44 PM, Aleksander Alekseev <a.aleks...@postgrespro.ru> wrote:

> This patch affects header files. By any chance didn't you forget to run
> `make clean` after applying it? As we discussed above, when you
> change .h files autotools doesn't rebuild dependent .c files:
>

Yes, actually I always compile using "make clean; make -j20; make install".
If you want, I will run it again, maybe today or tomorrow, and post the
result.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-24 Thread Corey Huinker
>
>
> If it didn't respond to SIGINT, that would be an issue, but otherwise
> this doesn't seem much more exciting than any other way to create a
> query that will run longer than you want to wait.
>
> regards, tom lane
>

It responded to SIGINT, so yeah, meh.

I can see value in aligning the behavior of infinity queries between date
and timestamp, but I have no strong opinion about which behavior is better:
it's either set step = 0 or an ereport(); no biggie. If we don't want to
handle the condition, I'll rip out the DATE_NOT_FINITE() checks.

Incidentally, is there a reason behind the tendency of internal functions
to avoid parameter defaults in favor of multiple pg_proc entries? I copied
the existing behavior of the int4 generate_series, but having one entry
with the defaults seemed more self-documenting.


Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-24 Thread Michael Paquier
On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier wrote:
> On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra wrote:
>> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>>> Here are some comments about your patch after a look at the code.
>>>
>>> Regarding the additions in fsync_fname() in xlog.c:
>>> 1) In InstallXLogFileSegment, rename() will be called only if
>>> HAVE_WORKING_LINK is not used, which happens only on Windows and
>>> cygwin. We could add it for consistency, but it should be within the
>>> #else/#endif block. It is not critical as of now.
>>> 2) The call in RemoveXlogFile is not necessary, the rename happening
>>> only on Windows.
>>
>> Hmmm, OK. Are we sure HAVE_WORKING_LINK is false only on Windows, or could
>> there be some other platforms? And are we sure the file systems on those
>> platforms are safe without the fsync call?
>> That is, while the report references ext4, there may be other file systems
>> with the same problem - ext4 was used mostly as it's the most widely used
>> Linux file system.
>
> From pg_config_manual.h:
> #if !defined(WIN32) && !defined(__CYGWIN__)
> #define HAVE_WORKING_LINK 1
> #endif
> If we want to be consistent with what Posix proposes, I am not against
> adding it.

I did some tests with NTFS using cygwin, and the effects of the rename()
calls persist even after powering off the VM. But I agree that adding an
fsync() in both cases would be fine.

>>> Thoughts?
>>
>> Thanks for the review and comments. I think the question is whether we only
>> want to do the additional fsync() only when it ultimately may lead to data
>> loss, or even in cases where it may cause operational issues (e.g. switching
>> back to recovery needlessly).
>> I'd vote for the latter, as I think it makes the database easier to operate
>> (less manual interventions) and the performance impact is 0 (as those fsyncs
>> are really rare).
>
> My first line of thoughts after looking at the patch is that I am not
> against adding those fsync calls on HEAD as there is roughly an
> advantage to not go back to recovery in most cases and ensure
> consistent names, but as they do not imply any data loss I would not
> encourage a back-patch. Adding them seems harmless at first sight I
> agree, but those are not actual bugs.

OK. It is true that with your patch PGDATA would be fsync'd in four code
paths which are not taken that often:
- renaming the tablespace map file and backup label file (three times)
- renaming to recovery.done
So, what do you think about the attached patch? Moving the calls into
critical sections is not really necessary except when installing a
new segment.
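
For reference, the rename-then-fsync pattern the patch applies looks like
this (a minimal sketch assembled from the hunks below, not new code):

	/* Rename the file into place ... */
	if (rename(tmppath, path) < 0)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not rename file \"%s\" to \"%s\": %m",
						tmppath, path)));

	/* ... then make the rename permanent by fsyncing the parent directory. */
	fsync_fname(XLOGDIR, true);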
-- 
Michael
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index f6da673..4173a50 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -435,6 +435,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 (errcode_for_file_access(),
  errmsg("could not rename file \"%s\" to \"%s\": %m",
 		tmppath, path)));
+
+	/*
+	 * Make sure the rename is permanent by fsyncing the parent
+	 * directory.
+	 */
+	fsync_fname(XLOGDIR, true);
 #endif
 
 	/* The history file can be archived immediately. */
@@ -525,6 +531,9 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
 (errcode_for_file_access(),
  errmsg("could not rename file \"%s\" to \"%s\": %m",
 		tmppath, path)));
+
+	/* Make sure the rename is permanent by fsyncing the directory. */
+	fsync_fname(XLOGDIR, true);
 #endif
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a2846c4..b124f90 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3278,6 +3278,14 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 		tmppath, path)));
 		return false;
 	}
+
+	/*
+	 * Make sure the rename is permanent by fsyncing the parent
+	 * directory.
+	 */
+	START_CRIT_SECTION();
+	fsync_fname(XLOGDIR, true);
+	END_CRIT_SECTION();
 #endif
 
 	if (use_lock)
@@ -3800,10 +3808,18 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
 	  path)));
 			return;
 		}
+
+		/*
+		 * Make sure the rename is permanent by fsyncing the parent
+		 * directory.
+		 */
+		fsync_fname(XLOGDIR, true);
+
 		rc = unlink(newpath);
 #else
 		rc = unlink(path);
 #endif
+
 		if (rc != 0)
 		{
 			ereport(LOG,
@@ -5297,6 +5313,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
  errmsg("could not rename file \"%s\" to \"%s\": %m",
 		RECOVERY_COMMAND_FILE, RECOVERY_COMMAND_DONE)));
 
+	/* Make sure the rename is permanent by fsyncing the data directory. */
+	fsync_fname(".", true);
+
 	ereport(LOG,
 			(errmsg("archive recovery complete")));
 }
@@ -6150,6 +6169,12 @@ StartupXLOG(void)
 TABLESPACE_MAP, BACKUP_LABEL_FILE),
 		 errdetail("Could not rename file \"%s\" to \"%s\": %m.",
    

Re: [HACKERS] Relation extension scalability

2016-01-24 Thread Dilip Kumar
On Sat, Jan 23, 2016 at 4:28 PM, Amit Kapila wrote:

> I found one more problem with patch.
>
> ! UnlockReleaseBuffer(buffer);
> ! RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer),
> freespace);
>
> You can't call BufferGetBlockNumber(buffer) after releasing
> the pin on buffer which will be released by
> UnlockReleaseBuffer().  Get the block number before unlocking
> the buffer.
>

Good catch; I will fix this too in the next version. The corrected
ordering is sketched below.
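
	/* Capture the block number while the buffer pin is still held
	 * (names taken from the quoted hunk). */
	BlockNumber	blkno = BufferGetBlockNumber(buffer);

	UnlockReleaseBuffer(buffer);
	RecordPageWithFreeSpace(relation, blkno, freespace);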


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-24 Thread Tom Lane
Corey Huinker  writes:
> Incidentally, is there a reason behind the tendency of internal functions
> to avoid parameter defaults in favor of multiple pg_proc entries?

Yes: you can't specify defaults in pg_proc.h.

We work around that where absolutely necessary, see the last part of
system_views.sql.  But it's messy enough, and bug-prone enough, to be
better avoided --- for example, it's very easy for the redeclaration
in system_views.sql to forget a STRICT or other similar marking.

Personally I'd say that every one of the existing cases that simply has
a default for the last argument was a bad idea, and would better have
been done in the traditional way with two pg_proc.h entries.
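
For illustration, the kind of redeclaration meant here, sketched with the
date generate_series from the other thread (hypothetical, not actual
system_views.sql content):

-- Redeclare the catalog entry to attach a default to the last argument.
-- The IMMUTABLE and STRICT markings must be repeated by hand here, which
-- is exactly the kind of thing that is easy to forget.
CREATE OR REPLACE FUNCTION
  pg_catalog.generate_series(start date, stop date, step integer DEFAULT 1)
  RETURNS SETOF date
  LANGUAGE internal IMMUTABLE STRICT
  AS 'generate_series_step_date';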

regards, tom lane




[HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-24 Thread Corey Huinker
This patch addresses a personal need: nearly every time I use
generate_series for timestamps, I end up casting the result into date or
the ISO string thereof. Like such:

SELECT d.dt::date as dt
FROM generate_series('2015-01-01'::date,
 '2016-01-04'::date,
 interval '1 day') AS d(dt);


That's less than elegant.

With this patch, we can do this:


SELECT d.date_val FROM
generate_series('1991-09-24'::date,'1991-10-01'::date) as d(date_val);
  date_val

 1991-09-24
 1991-09-25
 1991-09-26
 1991-09-27
 1991-09-28
 1991-09-29
 1991-09-30
 1991-10-01
(8 rows)

SELECT d.date_val FROM
generate_series('1991-09-24'::date,'1991-10-01'::date,7) as d(date_val);
  date_val

 1991-09-24
 1991-10-01
(2 rows)

SELECT d.date_val FROM
generate_series('1999-12-31'::date,'1999-12-29'::date,-1) as d(date_val);
  date_val

 1999-12-31
 1999-12-30
 1999-12-29
(3 rows)


One thing I discovered in doing this patch is that if you do a timestamp
generate_series involving infinity, it tries to do it. I didn't wait to
see if it finished.

For the date series, I put in checks to return an empty set:

SELECT d.date_val FROM
generate_series('-infinity'::date,'1999-12-29'::date) as d(date_val);
 date_val
--
(0 rows)

SELECT d.date_val FROM generate_series('1991-09-24'::date,'infinity'::date)
as d(date_val);
 date_val
--
(0 rows)



Notes:
- I borrowed the int4 implementation's check for a step size of 0, for POLA
reasons. However, it occurred to me that the function might be leakproof if
the behavior were changed to instead return an empty set. I'm not sure
that leakproof is a goal in and of itself.

A first attempt at this patch is attached. The examples above are copied
from the new test cases.
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9c143b2..15ebe47 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14657,6 +14657,26 @@ AND
   
  
 
+ 
+  generate_series(start, stop)
+  date
+  setof date
+  
+   Generate a series of values, from start to stop
+   with a step size of one day
+  
+ 
+
+ 
+  generate_series(start, stop, step integer)
+  date
+  setof date
+  
+   Generate a series of values, from start to stop
+   with a step size of step
+  
+ 
+
 

   
@@ -14721,6 +14741,26 @@ SELECT * FROM generate_series('2008-03-01 00:00'::timestamp,
  2008-03-03 22:00:00
  2008-03-04 08:00:00
 (9 rows)
+
+SELECT d.date_val FROM generate_series('1991-09-24'::date,'1991-10-01'::date) as d(date_val);
+  date_val  
+
+ 1991-09-24
+ 1991-09-25
+ 1991-09-26
+ 1991-09-27
+ 1991-09-28
+ 1991-09-29
+ 1991-09-30
+ 1991-10-01
+(8 rows)
+
+SELECT d.date_val FROM generate_series('1991-09-24'::date,'1991-10-01'::date,7) as d(date_val);
+  date_val  
+
+ 1991-09-24
+ 1991-10-01
+(2 rows)
 
   
 
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 332db7e..7404a2f 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -30,6 +30,7 @@
 #include "utils/datetime.h"
 #include "utils/nabstime.h"
 #include "utils/sortsupport.h"
+#include "funcapi.h"
 
 /*
  * gcc's -ffast-math switch breaks routines that expect exact results from
@@ -2811,3 +2812,97 @@ timetz_izone(PG_FUNCTION_ARGS)
 
PG_RETURN_TIMETZADT_P(result);
 }
+
+/* Corey BEGIN */
+typedef struct
+{
+   DateADT current;
+   DateADT finish;
+   int32   step;
+} generate_series_date_fctx;
+
+
+/* generate_series_date()
+ * Generate the set of dates from start to finish by step
+ */
+Datum
+generate_series_date(PG_FUNCTION_ARGS)
+{
+   return generate_series_step_date(fcinfo);
+}
+
+Datum
+generate_series_step_date(PG_FUNCTION_ARGS)
+{
+   FuncCallContext *funcctx;
+   generate_series_date_fctx *fctx;
+   DateADT result;
+
+   /* stuff done only on the first call of the function */
+   if (SRF_IS_FIRSTCALL())
+   {
+   DateADT start = PG_GETARG_DATEADT(0);
+   DateADT finish = PG_GETARG_DATEADT(1);
+   int32   step = 1;
+
+   /* see if we were given an explicit step size */
+   if (PG_NARGS() == 3)
+   step = PG_GETARG_INT32(2);
+   if (step == 0)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("step size cannot equal zero")));
+
+   MemoryContext oldcontext;
+
+   /* create a function context for cross-call persistence */
+   funcctx = SRF_FIRSTCALL_INIT();
+
+   /*
+* switch to memory context appropriate for multiple function calls
+*/
+   oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+

Re: [HACKERS] Relation extension scalability

2016-01-24 Thread Dilip Kumar
On Sat, Jan 23, 2016 at 12:19 PM, Amit Kapila wrote:
>
>
> Few comments about patch:
>
Thanks for reviewing..


> 1.
> Patch is not getting compiled.
>
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared
> identifier
>
Oh, my mistake; my preprocessor is ignoring this error and replacing it
with BLKSIZE.

I will fix it in the next version of the patch.

> 2.
> ! page = BufferGetPage(buffer);
> ! PageInit(page, BufferGetPageSize
> (buf), 0);
> !
> ! freespace = PageGetHeapFreeSpace(page);
> !
> MarkBufferDirty(buffer);
> ! UnlockReleaseBuffer(buffer);
> !
> RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer), freespace);
>
> What is the need to mark page dirty here, won't it automatically
> be markerd dirty once the page is used?  I think it is required
> if you wish to WAL-log this action.
>

These pages are not going to be used immediately, and we have done
PageInit, so I think the buffer should be marked dirty before adding the
page to the FSM, so that if the buffer gets evicted it flushes the
initialized data; see the sketch below.
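
For reference, the intended ordering sketched out (names from the quoted
hunk; capturing the block number before the release folds in the earlier
fix):

	page = BufferGetPage(buffer);
	PageInit(page, BufferGetPageSize(buffer), 0);
	freespace = PageGetHeapFreeSpace(page);

	/*
	 * Mark the buffer dirty while it is still locked, so the initialized
	 * page reaches disk even if the buffer is evicted before first use.
	 */
	MarkBufferDirty(buffer);

	/* Capture the block number before the pin is released. */
	blkno = BufferGetBlockNumber(buffer);
	UnlockReleaseBuffer(buffer);
	RecordPageWithFreeSpace(relation, blkno, freespace);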


> 3. I think you don't need to multi-extend a relation if
> HEAP_INSERT_SKIP_FSM is used, as for that case it anyways try to
> get a new page by extending a relation.
>

Yes, if HEAP_INSERT_SKIP_FSM is used and we multi-extend, then at least
in the current transaction it will not take pages from the FSM and will
do a multi-extend every time. The pages will still be used if there are
parallel backends, but it's still not a good idea for the current backend
to extend by multiple chunks every time.

So I will change this.

> 4. Again, why do you need this multi-extend optimization for local
> relations (those only accessible to current backend)?
>

I think we can handle this while adding the table-level "extend_by_blocks"
property: for local tables we will not allow this property, so there is no
need to change anything at this place.

What do you think?

> 5. Do we need this for nbtree as well?  One way to check that
> is by copying large data into a table having an index.

OK, I will try this test and update.



>> Note: Test with both data and WAL on magnetic disk: no significant
>> improvement visible
>> -- I think the WAL write is becoming the bottleneck in this case.
>>
>>
> In that case, can you try the same test with un-logged tables?
>

OK

>
> Also, it is good to check the performance of patch with read-write work
> load to ensure that extending relation in multiple-chunks should not
> regress such cases.
>

Ok


>
>> Currently I have kept extend_num_page as a session-level parameter, but I
>> think later we can make this a table property.
>> Any suggestion on this?
>>
>>
> I think we should have a new storage_parameter at table level
> extend_by_blocks or something like that instead of GUC. The
> default value of this parameter should be 1 which means retain
> current behaviour by default.
>

+1


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] brin_summarize_new_values error checking

2016-01-24 Thread Jeff Janes
In reviewing one of my patches[1], Fujii-san has pointed out that I
didn't include checks for being in recovery, or for working on another
backend's temporary index.

I think that brin_summarize_new_values in 9.5.0 commits those same
sins. In its case, I don't think those are critical, as they just
result in less specific error messages than one might hope
for, rather than something worse like a panic.

But still, we might want to address them.

Cheers,

Jeff

[1] 
http://www.postgresql.org/message-id/CAHGQGwH=m1baejpqdpjjcneqwg8xa+p8sb+zsvhvwh6gl2j...@mail.gmail.com




Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-24 Thread Michael Paquier
On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker  wrote:
> This patch addresses a personal need: nearly every time I use
> generate_series for timestamps, I end up casting the result into date or the
> ISO string thereof. Like such:
>
> [...]
>
> One thing I discovered in doing this patch is that if you do a timestamp
> generate_series involving infinity, it tries to do it. I didn't wait to
> see if it finished.

Well, I would think that this is a bug that we had better address and
backpatch. It does not make much sense to use infinity for timestamps,
but letting it run infinitely is not good either.

> For the date series, I put in checks to return an empty set:
>
> SELECT d.date_val FROM generate_series('-infinity'::date,'1999-12-29'::date)
> as d(date_val);
>  date_val
> --
> (0 rows)
>
> SELECT d.date_val FROM generate_series('1991-09-24'::date,'infinity'::date)
> as d(date_val);
>  date_val
> --
> (0 rows)

Wouldn't a proper error be more appropriate?
-- 
Michael




Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-24 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Jan 25, 2016 at 3:00 PM, Corey Huinker wrote:
>> One thing I discovered in doing this patch is that if you do a timestamp
>> generate_series involving infinity, it tries to do it. I didn't wait to
>> see if it finished.

> Well, I would think that this is a bug that we had better address and
> backpatch. It does not make much sense to use infinity for timestamps,
> but letting it run infinitely is not good either.

Meh.  Where would you cut it off?  AD 100?  A few zeroes either
way doesn't really make much difference.

If it didn't respond to SIGINT, that would be an issue, but otherwise
this doesn't seem much more exciting than any other way to create a
query that will run longer than you want to wait.

regards, tom lane




Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-24 Thread Jeff Janes
On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao  wrote:
> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud wrote:
>> On 15/01/2016 22:59, Jeff Janes wrote:
>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud wrote:
>>
>> All looks fine to me, I flag it as ready for committer.
>
> When I compiled the PostgreSQL with the patch, I got the following error.
> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.

Thanks.  Fixed.

> gin_clean_pending_list() must check whether the server is in recovery or not.
> If it's in recovery, the function must exit with an error. That is, IMO,
> something like the following check must be added.
>
>  if (RecoveryInProgress())
>  ereport(ERROR,
>  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>   errmsg("recovery is in progress"),
>   errhint("GIN pending list cannot be cleaned up during recovery.")));
>
> It's better to make gin_clean_pending_list() check whether the target index
> is a temporary index of another session or not, like pgstatginindex() does.

I've added both of these checks.  Sorry I overlooked your earlier email
in this thread about those. A sketch of the temporary-index check is below.
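
	/* Reject temporary indexes of other sessions, as pgstatginindex does
	 * (the exact macro usage here is an assumption). */
	if (RELATION_IS_OTHER_TEMP(indexRel))
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("cannot access temporary indexes of other sessions")));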

>
> +    Relation    indexRel = index_open(indexoid, AccessShareLock);
>
> ISTM that AccessShareLock is not safe when updating the pending list and
> GIN index main structure. Probably we should use RowExclusiveLock?

Other callers of the ginInsertCleanup function also do so while
holding only the AccessShareLock on the index.  It turns out that
there is a bug around this, as discussed in "Potential GIN vacuum bug"
(http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)

But, that bug has to be solved at a deeper level than this patch.

I've also cleaned up some other conflicts, and chose a more suitable
OID for the new catalog function.

The number of new header includes needed to implement this makes me
wonder if I put this code in the correct file, but I don't see a
better location for it.

New version attached.

Thanks,

Jeff
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 9c143b2..ada228f
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18036,18041 
--- 18036,18045 
  brin_summarize_new_values
 
  
+
+ gin_clean_pending_list
+
+ 
 
   shows the functions
  available for index maintenance tasks.
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18056,18061 
--- 18060,18072 
 integer
 summarize page ranges not already summarized

+   
+
+ gin_clean_pending_list(index_oid regclass)
+
+bigint
+move fast-update pending list entries into main index structure
+   
   
  
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18069,18074 
--- 18080,18092 
  into the index.
 
  
+
+gin_clean_pending_list accepts a GIN index OID argument
+ and moves the fast-update entries from the pending list into the
+ main index structure for that index.  It returns the number of pages
+ removed from the pending list.
+
+ 

  

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
new file mode 100644
index 9eb0b5a..17a5087
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 734,740 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed, or if the pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
--- 734,742 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed or autoanalyzed, or when 
!gin_clean_pending_list(regclass) is called, or if the
!pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
new file mode 100644
index 681ce09..dfdc45f
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***
*** 20,29 
--- 20,32 
  
  #include "access/gin_private.h"
  #include "access/xloginsert.h"
+ #include "access/xlog.h"
  #include "commands/vacuum.h"
+ #include "catalog/pg_am.h"
  #include "miscadmin.h"
  #include 

Re: [HACKERS] Releasing in September

2016-01-24 Thread Greg Stark
On Fri, Jan 22, 2016 at 6:21 PM, Andres Freund  wrote:
> On 2016-01-22 08:50:15 -0600, Jim Nasby wrote:
>> I think that's a great way to ensure we shrink the pool of reviewers when
>> someone works on a patch and then it goes nowhere.
>
> True, it really sucks. But what's your counter proposal? Commitfests
> dragging on forever, and people burning out on continually feeling they
> need to work on them? Hasn't worked out well.

Well my point of view is that the entire reason for commitfests is to
ensure every patch gets at least some feedback. If we're going to
bounce patches without any feedback and only work on the patches that
catch people's interest then that's back where we were before
commitfests. We used to see a lot of patches sit basically ignored
until release time when Tom or someone would pick them up and rewrite
them from scratch because that would be faster than trying to explain
what's wrong and waiting for the original author to rewrite it. It
really sucks for the original author to be ignored for a year and I
think it's how we lost a lot of potential contributors.

I think it's true that there are basically two successful lanes
patches can be in. a) They're of wide interest and need concerted
ongoing effort by multiple people to proceed and b) they're of narrow
interest but really just need a thumbs up or pointer to what direction
to head and the original author can proceed. The first clogs up the
commitfest and dominates people's time when it probably belongs more
in the normal development process. The latter is really what they were
designed for.

Perhaps it would make sense to alternate between "commitfests" that
are intended to return feedback to authors so they can work on their
patch and "developfests" where larger patches that need more
discussion get covered. The latter are expected to drag on but
shouldn't block other work whereas the commitfests are expected to get
relatively short reviews and be bounced as "returned with feedback"
quickly.

Or perhaps we should just have two different "returned with feedback",
one of which is "under discussion" which means the patch has seen some
discussion and therefore doesn't need to be in the commitfest any more
regardless of whether it actually resolved the issues authoritatively.
Making decisions in a consensus-driven community is just hard and we
could use some lessons in how to say no or how to resolve
irreconcilable conflicts but barring solving those issues it would at
least be nice to remove them from the critical path blocking other
patches and making the process feel interminable for bystanders.

-- 
greg




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-24 Thread Tom Lane
Dean Rasheed  writes:
> If I understand correctly there were 2 separate issues at play here:

> 1). On some platforms the compiler evaluates expressions like
> sin(constant) and comes up with a slightly different result than a
> runtime evaluation of the expression. The compiler-evaluated result
> is presumably a 64-bit IEEE float, but at runtime it may be using
> extended precision for intermediate results.

If I've not lost track of the bidding, both of the cases where we saw
that involved gcc on a platform where it's not the native (vendor
supplied) compiler.  So my guess is that gcc was using a non-native
libm to do the pre-evaluation of sin().  It does not seem likely that
the gcc boys would intentionally do pre-evaluation differently from
run-time evaluation, but they could get burnt by what would arguably
be a build error in that particular copy of gcc.  Cross-compilation
would be another way to hit the same type of hazard.

> That may well have been the sole contributing factor to the fact that
> sind(30) wasn't exactly 0.5.

Yeah.  I am not sure whether the RADIANS_PER_DEGREE change fixed anything
or not, though I am not tempted to undo it; it may save us on some other
platform not currently represented in the buildfarm.

> 2). The compiler also sometimes rearranges expressions in ways that
> don't give the same result as evaluating in the order suggested by the
> parentheses. I think this actually explains the failure to get exactly
> 1 for tand(45). For x=45, this was being computed as
>   cosd_0_to_60(90 - x) / cosd_0_to_60(x)
> so my guess is that it was inlining cosd_0_to_60(90 - x) and
> rearranging it to produce something different from cosd_0_to_60(x) for
> x=45.

Oh, interesting point.  The inlining would have produced a subexpression
like

cos((90 - x) * RADIANS_PER_DEGREE)

For x=45, the result of 90-x would have been exact, so it's not obvious
where any change in results would have crept in --- but if the compiler
then tried to simplify to

cos((90 * RADIANS_PER_DEGREE) - (x * RADIANS_PER_DEGREE))

that could definitely change the roundoff behavior.  OTOH, it's not
very clear why gcc would have done that; it saves no operations.
It'd be interesting to look at the produced assembly code on narwhal.

> I wonder if the same could have been achieved by disabling
> optimisation and inlining in those low-level functions, and also
> wrapping sin(x * RADIANS_PER_DEGREE) in a similar non-inlinable,
> non-optimised function to force it to be executed at runtime when
> passed a constant.

I considered that; in particular -ffloat-store would have helped with the
wider-intermediate-results problem, and indeed we might still be forced
into using that.  I would rather not fall back to adding more
compiler-specific flags though.  If we go that way we'll likely need a
custom fix for every new compiler we try to use.


Meanwhile, just when you thought it was safe to go back in the water,
cockatiel is still failing.  It has the cos(60) != 0.5 problem, which
IIRC was exhibited by no other critter.  Looking at the code,

static double
cosd_0_to_60(double x)
{
	return 1.0 - ((1.0 - cos(x * RADIANS_PER_DEGREE)) / one_minus_cos_60) / 2.0;
}

what seems likely is that the "1 - cos()" subtraction is being done in
a wider-than-double float register, which i686 does have, and producing
a different result than what was stored in one_minus_cos_60.

Perhaps we can fix this by rewriting as

float8 numerator = 1.0 - cos(x * RADIANS_PER_DEGREE);
return 1.0 - (numerator / one_minus_cos_60) / 2.0;

cockatiel's compiler does recognize -fexcess-precision=standard, and
my understanding of that is that the result put into "numerator" will
be rounded to double width, so that it should then match
"one_minus_cos_60".

Another idea would be to change the cache variable to just "cos_60" and
write "(1.0 - cos_60)" in the denominator --- but then we'd just be hoping
that the compiler does both subtractions the same way, which doesn't seem
necessarily guaranteed.  Worse, I believe the 8087 has an FCOS instruction
which might deliver a wider-than-double result, so that maybe the problem
is not so much with the subtraction as with when rounding of the cos()
result happens.  The code I show above seems more likely to match the
way one_minus_cos_60 is computed.
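
For reference, the fragments above assemble back into the helper like so
(a sketch):

static double
cosd_0_to_60(double x)
{
	float8		numerator = 1.0 - cos(x * RADIANS_PER_DEGREE);

	/*
	 * The intermediate is stored in a double-width variable, so with
	 * -fexcess-precision=standard it is rounded before the division and
	 * should match the stored one_minus_cos_60.
	 */
	return 1.0 - (numerator / one_minus_cos_60) / 2.0;
}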

I'll go try it, though I guess we won't see results till tomorrow.

regards, tom lane




Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Andres Freund
On 2016-01-18 21:47:27 +, Tomasz Rybak wrote:
> We might also think about changing name of plugin to something resembling 
> "logical_streaming_decoder" or even "logical_streamer"

FWIW, I find those proposals unconvincing. Not that pglogical_output is
grand, but "streaming decoder" or "logical_streamer" aren't even
correct. And an output plugin isn't a "decoder" or a "streamer".




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Thom Brown
On 23 January 2016 at 03:32, Thom Brown  wrote:
> On 22 January 2016 at 19:30, Victor Wagner  wrote:
>> On Tue, 19 Jan 2016 14:34:54 +
>> Thom Brown  wrote:
>>
>>>
>>> The segfault issue I originally reported now appears to be resolved.
>>>
>>> But now I have another one:
>>>
>>> psql
>>> 'postgresql://thom@127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533/postgres?hostorder=random&readonly=1&failover_timeout=5'
>>> -c 'show port'
>>
>> Here is a new version of the patch. Now I've reworked hostorder=random,
>> and it seems to work as well as sequential. failover_timeout works too.
>> I've also found a case where an attempt to connect fails when receiving a
>> FATAL message from a server which is not properly up yet. So, that is
>> fixed too.
>>
>> Additionally, error messages from all failed connect attempts are no
>> longer accumulated. Only the last one is returned.
>
> I can't connect to a standby with the patch applied:
>
> thom@swift:~/Development/test$ psql -p 5531 postgres
> psql: thom@swift:~/Development/test$
>
> No error message, nothing in the logs.  I find this is the case with
> any standby, but doesn't affect primaries.
>
> So this has broken existing functionality somewhere.

Okay, I've tested this again with additional logging.  Again, I'm just
running "psql -p 5531 postgres", which connects to a standby.  This
immediately exits psql, and the logs show:

2016-01-24 15:04:59.879 GMT - [unknown] - [unknown] LOG:  0:
connection received: host=[local]
2016-01-24 15:04:59.879 GMT - [unknown] - [unknown] LOCATION:
BackendInitialize, postmaster.c:4081
2016-01-24 15:04:59.880 GMT - thom - postgres LOG:  0: connection
authorized: user=thom database=postgres
2016-01-24 15:04:59.880 GMT - thom - postgres LOCATION:
PerformAuthentication, postinit.c:272
2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0: statement:
SELECT pg_catalog.pg_is_in_recovery()
2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
exec_simple_query, postgres.c:935
2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0: duration: 0.583 ms
2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
exec_simple_query, postgres.c:1164
2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0:
disconnection: session time: 0:00:00.007 user=thom database=postgres
host=[local]
2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
log_disconnections, postgres.c:4458

This shouldn't be checking whether it's a standby.  I also noticed that with:

psql 
'postgresql://thom@127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535/postgres?hostorder=random&readonly=1'
-c 'show port'

The standby at port 5533 shows in the logs that it's checking whether
it's a standby when it happens to hit it.  Shouldn't this be
unnecessary if we've set "readonly" to 1?  The result of the query
doesn't appear to be useful for anything.

Another thing I've noticed is that the PORT variable (shown by \set)
always shows PGPORT, but I expect it to equal the port of whichever
host we successfully connected to.

Thom




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Thom Brown
On 24 January 2016 at 15:30, Thom Brown  wrote:
> On 23 January 2016 at 03:32, Thom Brown  wrote:
>> On 22 January 2016 at 19:30, Victor Wagner  wrote:
>>> On Tue, 19 Jan 2016 14:34:54 +
>>> Thom Brown  wrote:
>>>

 The segfault issue I originally reported now appears to be resolved.

 But now I have another one:

 psql
 'postgresql://thom@127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533/postgres?hostorder=random&readonly=1&failover_timeout=5'
 -c 'show port'
>>>
>>> Here is a new version of the patch. Now I've reworked hostorder=random,
>>> and it seems to work as well as sequential. failover_timeout works too.
>>> I've also found a case where an attempt to connect fails when receiving a
>>> FATAL message from a server which is not properly up yet. So, that is
>>> fixed too.
>>>
>>> Additionally, error messages from all failed connect attempts are no
>>> longer accumulated. Only the last one is returned.
>>
>> I can't connect to a standby with the patch applied:
>>
>> thom@swift:~/Development/test$ psql -p 5531 postgres
>> psql: thom@swift:~/Development/test$
>>
>> No error message, nothing in the logs.  I find this is the case with
>> any standby, but doesn't affect primaries.
>>
>> So this has broken existing functionality somewhere.
>
> Okay, I've tested this again with additional logging.  Again, I'm just
> running "psql -p 5531 postgres", which connects to a standby.  This
> immediately exits psql, and the logs show:
>
> 2016-01-24 15:04:59.879 GMT - [unknown] - [unknown] LOG:  0:
> connection received: host=[local]
> 2016-01-24 15:04:59.879 GMT - [unknown] - [unknown] LOCATION:
> BackendInitialize, postmaster.c:4081
> 2016-01-24 15:04:59.880 GMT - thom - postgres LOG:  0: connection
> authorized: user=thom database=postgres
> 2016-01-24 15:04:59.880 GMT - thom - postgres LOCATION:
> PerformAuthentication, postinit.c:272
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0: statement:
> SELECT pg_catalog.pg_is_in_recovery()
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
> exec_simple_query, postgres.c:935
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0: duration: 0.583 ms
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
> exec_simple_query, postgres.c:1164
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0:
> disconnection: session time: 0:00:00.007 user=thom database=postgres
> host=[local]
> 2016-01-24 15:04:59.886 GMT - thom - postgres LOCATION:
> log_disconnections, postgres.c:4458
>
> This shouldn't be checking whether it's a standby.  I also noticed that with:
>
> psql 
> 'postgresql://thom@127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535/postgres?hostorder=random&readonly=1'
> -c 'show port'
>
> The standby at port 5533 shows in the logs that it's checking whether
> it's a standby when it happens to hit it.  Shouldn't this be
> unnecessary if we've set "readonly" to 1?  The result of the query
> doesn't appear to be useful for anything.
>
> Another thing I've noticed is that the PORT variable (shown by \set)
> always shows PGPORT, but I expect it to equal the port of whichever
> host we successfully connected to.

Actually, the same goes for the HOST variable, which is currently
showing the list of hosts:ports.

Output of \set variables without patch:

HOST = '127.0.0.1'
PORT = 
'5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'

And with patch:

HOST = 
'127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
PORT = '5488'

They're both wrong, but I'm hoping we can just show the right information here.

Thom




[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Final part of review:
protocol.txt

+|origin_identifier|signed char[origin_identifier_length]|An origin identifier 
of arbitrary, upstream-application-defined structure. _Should_ be text in the 
same encoding as the upstream database. NULL-terminated. _Should_ be 7-bit 
ASCII.

Does it need NULL-termination when the previous field contains the length
of origin_identifier?
The same applies to the relation metadata message.



+ metadata message. All consecutive row messages must currently have the same
+ relidentifier. (_Later extensions to add metadata caching will relax these
+ requirements for clients that advertise caching support; see the documentation
+ on metadata messages for more detail_).

Shouldn't this be changed once the metadata cache is implemented?




+ |relidentifier|uint32|relidentifier that matches the table metadata message 
sent for this row.
+ (_Not present in BDR, which sends nspname and relname instead_)

and

+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_

Is the BDR mention relevant here? It was not mentioned anywhere else, and
now appears ex machina.



Long quote - but required.


+  Tuple fields
+ 
+ |===
+ |Tuple type|signed char|Identifies the kind of tuple being sent.
+ 
+ |tupleformat|signed char|‘**T**’ (0x54)
+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_
+ |[tuple field values]|[composite]|
+ |===
+ 
+ = Tuple tupleformat compatibility
+ 
+ Unrecognised _tupleformat_ kinds are a protocol error for the downstream.
+ 
+  Tuple field value fields
+ 
+ These message parts describe individual fields within a tuple.
+ 
+ There are two kinds of tuple value fields, abbreviated and full. Which is 
being
+ read is determined based on the first field, _kind_.
+ 
+ Abbreviated tuple value fields are nothing but the message kind:
+ 
+ |===
+ |*Message*|*Type/Size*|*Notes*
+ 
+ |kind|signed char| * ‘**n**’ull (0x6e) field
+ |===
+ 
+ Full tuple value fields have a length and datum:
+ 
+ |===
+ |*Message*|*Type/Size*|*Notes*
+ 
+ |kind|signed char| * ‘**i**’nternal binary (0x62) field
+ |length|int4|Only defined for kind = i\|b\|t
+ |data|[length]|Data in a format defined by the table metadata and column 
_kind_.
+ |===
+ 
+ = Tuple field values kind compatibility
+ 
+ Unrecognised field _kind_ values are a protocol error for the downstream. The
+ downstream may not continue processing the protocol stream after this
+ point**.**
+ 
+ The upstream may not send ‘**i**’nternal or ‘**b**’inary format values to the
+ downstream without the downstream negotiating acceptance of such values. The
+ downstream will also generally negotiate to receive type information to use to
+ decode the values. See the section on startup parameters and the startup
+ message for details.

I do not fully get it.
For each tuple we are supposed to have "Tuple type" (which is kind?). Does it
mean that T1 might be sent using the "i" kind and T2 using the "b" kind?
At the same time we have kind "n" (null) - but that belongs to the field
level (one field might be null, not the entire tuple).

In other words - do we have "i", then "T", then the number of attributes,
or "T", then the number of attributes, then "i" or "b" or "n" for each of
the attributes?

Also, the description of "b" seems to be missing.
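
For reference, my reading of the quoted spec, as a minimal sketch of how a
downstream might consume one tuple message (all names here are mine, not
pglogical_output's; byte-order conversion is assumed to happen elsewhere):

#include <stdint.h>
#include <string.h>

typedef struct
{
    const char *buf;            /* raw message bytes */
    size_t      pos;
} Reader;

static char
read_char(Reader *r)
{
    return r->buf[r->pos++];
}

static uint16_t
read_uint16(Reader *r)
{
    uint16_t    v;

    memcpy(&v, r->buf + r->pos, sizeof(v));
    r->pos += sizeof(v);
    return v;
}

static int32_t
read_int32(Reader *r)
{
    int32_t     v;

    memcpy(&v, r->buf + r->pos, sizeof(v));
    r->pos += sizeof(v);
    return v;
}

/* Returns 0 on success, -1 on a protocol error for the downstream. */
static int
read_tuple(Reader *r)
{
    uint16_t    natts;
    uint16_t    i;

    if (read_char(r) != 'T')    /* tupleformat */
        return -1;
    natts = read_uint16(r);     /* number of fields in this tuple part */

    for (i = 0; i < natts; i++)
    {
        char        kind = read_char(r);

        if (kind == 'n')
            continue;           /* abbreviated field: null, nothing follows */
        else if (kind == 'i' || kind == 'b' || kind == 't')
            r->pos += read_int32(r);    /* length, then that many data bytes */
        else
            return -1;          /* unrecognised kind: stop processing */
    }
    return 0;
}

So under this reading a row message carries "T", then natts, and then one
kind byte (plus an optional length and datum) per attribute.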






+ Before sending changed rows for a relation, a metadata message for the 
relation
+ must be sent so the downstream knows the namespace, table name, column names,
+ optional column types, etc. A relidentifier field, an arbitrary numeric value
+ unique for that relation on that upstream connection, maps the metadata to
+ following rows.
+ 
+ A client should not assume that relation metadata will be followed immediately
+ (or at all) by rows, since future changes may lead to metadata messages being
+ delivered at other times. Metadata messages may arrive during or between
+ transactions.
+ 
+ The upstream may not assume that the downstream retains more metadata than the
+ one most recent table metadata message. This applies across all tables, so a
+ client is permitted to discard metadata for table x when getting metadata for
+ table y. The upstream must send a new metadata message before sending rows for
+ a different table, even if that metadata was already sent in the same session
+ or even same transaction. _This requirement will later be weakened by the
+ addition of client metadata caching, which will be advertised to the upstream
+ with an output plugin parameter._

This will need reworking once metadata caching is supported.




+ |Message type|signed char|‘**S**’ (0x53) - startup
+ |Startup message version|uint8|Value is always “1”.

Value is "1" for the current plugin version. It 

Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Victor Wagner
On Sun, 24 Jan 2016 15:30:22 +
Thom Brown  wrote:

> Okay, I've tested this again with additional logging.  Again, I'm just
> running "psql -p 5531 postgres", which connects to a standby.  This
> immediately exits psql, and the logs show:

> 2016-01-24 15:04:59.886 GMT - thom - postgres LOG:  0: statement:
> SELECT pg_catalog.pg_is_in_recovery()
> This shouldn't be checking whether it's a standby.  I also noticed
> that with:

This is, of course, an incompatibility with the previous behavior. I should
probably modify this patch so that it implies the readonly flag when only
one host/port pair is specified on the command line.

Right now it checks for a standby regardless of the number of hosts
specified.
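
To make the behavior under discussion concrete, the check amounts to
something like the following (an illustration only, not the patch's actual
code; error handling is simplified):

#include <string.h>
#include "libpq-fe.h"

/*
 * Returns 1 if the freshly connected host may be used: a master is always
 * acceptable, a standby only when the caller allows read-only connections.
 */
static int
host_is_acceptable(PGconn *conn, int readonly_ok)
{
    PGresult   *res;
    int         in_recovery;

    res = PQexec(conn, "SELECT pg_catalog.pg_is_in_recovery()");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        PQclear(res);
        return 0;               /* treat errors as "try the next host" */
    }
    in_recovery = (strcmp(PQgetvalue(res, 0, 0), "t") == 0);
    PQclear(res);

    return readonly_ok || !in_recovery;
}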



-- 
   Victor Wagner 




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Victor Wagner
On Sun, 24 Jan 2016 15:58:10 +
Thom Brown  wrote:


> 
> Output of \set variables without patch:
> 
> HOST = '127.0.0.1'
> PORT =
> '5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
> 
> And with patch:
> 
> HOST =
> '127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
> PORT = '5488'
> 
> They're both wrong, but I'm hoping we can just show the right
> information here.

I think we should show the right information here, but it is not so simple.

The problem is that I never keep a symbolic representation of an individual
host/port pair. And when we connect successfully, we have only the struct
sockaddr representation of it, which contains the right IP address but does
not contain the symbolic host name.

Moreover, one hostname from the connect string can produce more than one
addrinfo structure. For example, on machines with IPv6 support, the
'localhost' hostname resolves to both the 127.0.0.1 IPv4 address and the
[::1] IPv6 address, and so produces two records.

So would any name which has both A and AAAA records in DNS. And nothing
prevents a domain administrator from putting more than one A record for the
same hostname into a DNS zone.
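
A self-contained illustration of that point (a sketch, not patch code): one
host name in the connect string can stand for several candidate addresses.

#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int
main(void)
{
    struct addrinfo hints;
    struct addrinfo *res,
               *ai;
    char        buf[INET6_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* ask for both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo("localhost", "5432", &hints, &res) != 0)
        return 1;

    /* 'localhost' typically yields at least 127.0.0.1 and ::1 here */
    for (ai = res; ai != NULL; ai = ai->ai_next)
    {
        const void *addr;

        if (ai->ai_family == AF_INET)
            addr = &((struct sockaddr_in *) ai->ai_addr)->sin_addr;
        else
            addr = &((struct sockaddr_in6 *) ai->ai_addr)->sin6_addr;
        printf("%s\n", inet_ntop(ai->ai_family, addr, buf, sizeof(buf)));
    }
    freeaddrinfo(res);
    return 0;
}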


So it is just the same information that can be retrieved from the backend
via

select inet_client_addr();
select inet_client_port();

What is really interesting for the HOST and PORT variables is the host name
and port number used to make the actual connection, as they were specified
in the connect string or service file.


> 
> Thom



-- 
   Victor Wagner 




Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Thom Brown
On 24 January 2016 at 19:53, Victor Wagner  wrote:
> On Sun, 24 Jan 2016 15:58:10 +
> Thom Brown  wrote:
>
>
>>
>> Output of \set variables without patch:
>>
>> HOST = '127.0.0.1'
>> PORT =
>> '5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
>>
>> And with patch:
>>
>> HOST =
>> '127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
>> PORT = '5488'
>>
>> They're both wrong, but I'm hoping we can just show the right
>> information here.
>
> I think we should show the right information here, but it is not so simple.
>
> The problem is that I never keep a symbolic representation of an individual
> host/port pair. And when we connect successfully, we have only the struct
> sockaddr representation of it, which contains the right IP address but does
> not contain the symbolic host name.
>
> Moreover, one hostname from the connect string can produce more than one
> addrinfo structure. For example, on machines with IPv6 support, the
> 'localhost' hostname resolves to both the 127.0.0.1 IPv4 address and the
> [::1] IPv6 address, and so produces two records.
>
> So would any name which has both A and AAAA records in DNS. And nothing
> prevents a domain administrator from putting more than one A record for the
> same hostname into a DNS zone.
>
>
> So it is just the same information that can be retrieved from the backend
> via
>
> select inet_client_addr();
> select inet_client_port();

I think you mean:

select inet_server_addr();
select inet_server_port();

> What is really interesting for the HOST and PORT variables is the host name
> and port number used to make the actual connection, as they were specified
> in the connect string or service file.

And this is probably not the correct thing for it to report.  The
documentation says "The database server host you are currently
connected to." and "The database server port to which you are
currently connected.", so yeah, I'd expect to see those set to
whatever those 2 functions resolve to.  That being said, if one
connects via a domain socket, those appear to come back blank with
those functions, yet HOST and PORT report the correct information in
those cases (without passing in multiple hosts).  Is that a
pre-existing bug?
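
For what it's worth, psql takes HOST and PORT from libpq's PQhost() and
PQport(), so whatever the multi-host scheme ends up looking like, those two
accessors are what has to return per-connection values. A quick way to see
what libpq currently reports (a sketch; the connection string is only an
example):

#include <stdio.h>
#include "libpq-fe.h"

int
main(void)
{
    PGconn     *conn = PQconnectdb("host=127.0.0.1 port=5531 dbname=postgres");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }
    /* these are the values psql copies into its HOST and PORT variables */
    printf("host=%s port=%s\n", PQhost(conn), PQport(conn));
    PQfinish(conn);
    return 0;
}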

Thom




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-24 Thread Peter Eisentraut
On 1/23/16 4:18 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 1/23/16 3:05 PM, Tom Lane wrote:
>>> Peter Eisentraut  writes:
 I'm still getting a failure in float8 on OS X after commit 73193d8:
> 
>>> Weird, because my OS X critters are all happy.  Which OS X and compiler
>>> version, exactly?  Any special compile flags?
> 
>> I'm using gcc 4.8.  It passes with the system clang.  So the explanation
>> is probably along the lines of what Noah has described.
> 
> Ah.  Please see if what I just pushed fixes it.

Works now.  Thanks.





Re: [HACKERS] proposal: function parse_ident

2016-01-24 Thread Pavel Stehule
Hi

2016-01-23 16:36 GMT+01:00 Tom Lane :

> Michael Paquier  writes:
> > On Sat, Jan 23, 2016 at 1:25 AM, Marko Tiikkaja  wrote:
> > +  errmsg("identifier contains disallowed chars"),
> > +  errdetail("string \"%s\" is not valid identifier",
> > + text_to_cstring(qualname;
> > Perhaps, "identifier contains not allowed character" is better?
>
> "disallowed" reads better to me.  I agree with expanding "chars" to
> "characters" though.  Also, the errdetail is conveying no actual extra
> detail AFAICS.  I'd go with something like
>
> errmsg("identifier contains disallowed characters: \"%s\"",
>text_to_cstring(qualname)));
>
> regards, tom lane
>
>
>
>
Rebased; messages changed per Tom's proposal.

Regards

Pavel


diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 9c143b2..f85bfd0
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 1778,1783 
--- 1778,1800 

 
  
+  parse_ident
+ 
+ parse_ident(str text,
+[ strictmode boolean DEFAULT true ] )
+
+text[]
+Split a qualified identifier into an array of parts.
+When the second parameter is true, no characters are allowed after the last identifier. When
+the second parameter is false, characters after the last identifier are ignored.
+
+parse_ident('"SomeSchema".someTable')
+"SomeSchema,sometable"
+   
+ 
+   
+
+ 
   pg_client_encoding
  
  pg_client_encoding()
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
new file mode 100644
index 923fe58..61d5b80
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
*** RETURNS jsonb
*** 965,967 
--- 965,974 
  LANGUAGE INTERNAL
  STRICT IMMUTABLE
  AS 'jsonb_set';
+ 
+ CREATE OR REPLACE FUNCTION
+   parse_ident(str text, strictmode boolean DEFAULT true)
+ RETURNS text[]
+ LANGUAGE INTERNAL
+ STRICT IMMUTABLE
+ AS 'parse_ident';
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
new file mode 100644
index 6a306f3..a6c3452
*** a/src/backend/utils/adt/misc.c
--- b/src/backend/utils/adt/misc.c
***
*** 21,32 
--- 21,35 
  #include 
  
  #include "access/sysattr.h"
+ #include "access/htup_details.h"
  #include "catalog/catalog.h"
+ #include "catalog/namespace.h"
  #include "catalog/pg_tablespace.h"
  #include "catalog/pg_type.h"
  #include "commands/dbcommands.h"
  #include "funcapi.h"
  #include "miscadmin.h"
+ #include "parser/scansup.h"
  #include "parser/keywords.h"
  #include "postmaster/syslogger.h"
  #include "rewrite/rewriteHandler.h"
***
*** 38,43 
--- 41,47 
  #include "utils/ruleutils.h"
  #include "tcop/tcopprot.h"
  #include "utils/acl.h"
+ #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/timestamp.h"
  
*** pg_column_is_updatable(PG_FUNCTION_ARGS)
*** 598,600 
--- 602,752 
  
  	PG_RETURN_BOOL((events & REQ_EVENTS) == REQ_EVENTS);
  }
+ 
+ 
+ /*
+  * This simple parser utility is compatible with the lexer implementation;
+  * it is used only in the parse_ident function.
+  */
+ static bool
+ is_ident_start(unsigned char c)
+ {
+ 	if (c == '_')
+ 		return true;
+ 	if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
+ 		return true;
+ 
+ 	if (c >= 0200 && c <= 0377)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ static bool
+ is_ident_cont(unsigned char c)
+ {
+ 	if (c >= '0' && c <= '9')
+ 		return true;
+ 
+ 	return is_ident_start(c);
+ }
+ 
+ /*
+  * parse_ident - parse an SQL composite identifier into separate identifiers.
+  * When strict mode is active (second parameter), any characters after the
+  * last identifier are disallowed.
+  */
+ Datum
+ parse_ident(PG_FUNCTION_ARGS)
+ {
+ 	text		*qualname;
+ 	char		*qualname_str;
+ 	bool		strict_mode;
+ 	ArrayBuildState *astate = NULL;
+ 	char	*nextp;
+ 
+ 	qualname = PG_GETARG_TEXT_PP(0);
+ 	qualname_str = text_to_cstring(qualname);
+ 	strict_mode = PG_GETARG_BOOL(1);
+ 
+ 	nextp = qualname_str;
+ 
+ 	/* skip leading whitespace */
+ 	while (isspace((unsigned char) *nextp))
+ 		nextp++;
+ 
+ 	for (;;)
+ 	{
+ 		char		*curname;
+ 		char		*endp;
+ 		bool		missing_ident;
+ 
+ 		missing_ident = true;
+ 
+ 		if (*nextp == '\"')
+ 		{
+ 			curname = nextp + 1;
+ 			for (;;)
+ 			{
+ endp = strchr(nextp + 1, '\"');
+ if (endp == NULL)
+ 	ereport(ERROR,
+ 		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 		 

Re: [HACKERS] easy way of copying regex_t

2016-01-24 Thread Artur Zakirov

Hi all,

I've been working on an extension that allows moving the ispell
dictionaries to a shared segment. It's almost complete; the last FIXME is
about copying the regex_t structure (stored in AFFIX).

According to regex.h, the structure is fairly complex and not exactly easy
to understand, so I'd like to know if anyone here has already implemented
that or something that may serve the same purpose. Any ideas?

kind regards
Tomas


This message is a reply to the message
http://www.postgresql.org/message-id/dd02a31fdeffbf5cb24771e34213b40f.squir...@sq.gransy.com

Sorry, I can't reply to it directly; I can't get it from the archive.

Thank you for your shared_ispell extension; it is very useful. I got it
from https://github.com/tvondra/shared_ispell
With this message I am sending a patch for your repository with a draft of
code that allows shared_ispell to copy regex_t.


The main idea of the patch is:
- we don't need to copy the whole regex_t structure
- most fields and structures are used only at compile time
- we need to copy these structures: guts, colormap, subre, cnfa
- from the subre structure we need only the cnfa

colormap represents a directed acyclic graph. cnfa represents a 
nondeterministic finite automaton.
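
As a sketch of the shape this takes (copy_guts here is a hypothetical
helper standing in for the real copying code, and the field names follow my
reading of src/include/regex/regex.h):

#include <string.h>
#include "regex/regex.h"

/* hypothetical helper: deep-copies the needed parts of the guts */
extern char *copy_guts(const char *guts);

static void
copy_regex(regex_t *dst, const regex_t *src)
{
    /* the flat fields (re_magic, re_nsub, re_info, ...) copy trivially */
    memcpy(dst, src, sizeof(regex_t));

    /*
     * re_guts points to the compiled form.  Per the analysis above, only
     * the colormap and the cnfa of each subre are needed at execution
     * time, so only those are deep-copied; compile-time-only parts are
     * dropped.
     */
    dst->re_guts = copy_guts(src->re_guts);

    /*
     * re_fns holds function pointers and is left as-is here, which is
     * exactly what makes sharing a copied regex_t across backends
     * questionable.
     */
}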


The patch also does the following:
- adds regression tests
- deletes spell.h and spell.c, since they contained duplicate code
- adds shared_ispell.h, which declares some structures
- fixes an issue where stopFile can be NULL
- fixes countCMPDAffixes, since theoretically there could be zero affixes
- adds copyCMPDAffix

A question to hackers: could such a patch be useful as a PostgreSQL patch
to full-text search? Is it needed?


shared_ispell loads dictionaries into shared memory. The main benefits are:
- memory savings: every dictionary is loaded only once, rather than once
per backend. In the current version of PostgreSQL, dictionaries are loaded
in each backend where they are requested.
- time savings: the first load of a dictionary takes a long time. With
this patch, dictionaries are loaded only once.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
*** a/Makefile
--- b/Makefile
***
*** 1,18 
  MODULE_big = shared_ispell
! OBJS = src/shared_ispell.o src/spell.o
  
  EXTENSION = shared_ispell
! DATA = sql/shared_ispell--1.0.0.sql
! MODULES = shared_ispell
  
! CFLAGS=`pg_config --includedir-server`
  
  PG_CONFIG = pg_config
  PGXS := $(shell $(PG_CONFIG) --pgxs)
  include $(PGXS)
! 
! all: shared_ispell.so
! 
! shared_ispell.so: $(OBJS)
! 
! %.o : src/%.c
--- 1,20 
+ # contrib/shared_ispell/Makefile
+ 
  MODULE_big = shared_ispell
! OBJS = src/shared_ispell.o
  
  EXTENSION = shared_ispell
! DATA = sql/shared_ispell--1.1.0.sql
  
! REGRESS = shared_ispell
  
+ ifdef USE_PGXS
  PG_CONFIG = pg_config
  PGXS := $(shell $(PG_CONFIG) --pgxs)
  include $(PGXS)
! else
! subdir = contrib/shared_ispell
! top_builddir = ../..
! include $(top_builddir)/src/Makefile.global
! include $(top_srcdir)/contrib/contrib-global.mk
! endif
*** a/README.md
--- b/README.md
***
*** 13,28  If you need just snowball-type dictionaries, this extension is not
  really interesting for you. But if you really need an ispell
  dictionary, this may save you a lot of resources.
  
- Warning
- ---
- The extension does not yet handle affixes that require full regular
- expressions (regex_t, implemented in regex.h). This is indicated by
- an error when initializing the dictionary.
- 
- Simple affixes and affixes that can be handled by fast regex subset
- (as implemented in regis.h) are handled just fine.
- 
- 
  Install
  ---
  Installing the extension is quite simple, especially if you're on 9.1.
--- 13,18 
*** /dev/null
--- b/expected/shared_ispell.out
***
*** 0 
--- 1,193 
+ CREATE EXTENSION shared_ispell;
+ -- Test ISpell dictionary with ispell affix file
+ CREATE TEXT SEARCH DICTIONARY shared_ispell (
+ Template=shared_ispell,
+ DictFile=ispell_sample,
+ AffFile=ispell_sample
+ );
+ SELECT ts_lexize('shared_ispell', 'skies');
+  ts_lexize 
+ ---
+  {sky}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'bookings');
+ts_lexize
+ 
+  {booking,book}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'booking');
+ts_lexize
+ 
+  {booking,book}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'foot');
+  ts_lexize 
+ ---
+  {foot}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'foots');
+  ts_lexize 
+ ---
+  {foot}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'rebookings');
+ts_lexize
+ 
+  {booking,book}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'rebooking');
+ts_lexize
+ 
+  {booking,book}
+ (1 row)
+ 
+ SELECT ts_lexize('shared_ispell', 'rebook');
+  ts_lexize 
+ ---
+  
+ (1 row)
+ 
+ SELECT 

Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-01-24 Thread Thom Brown
On 24 January 2016 at 20:11, Thom Brown  wrote:
> On 24 January 2016 at 19:53, Victor Wagner  wrote:
>> On Sun, 24 Jan 2016 15:58:10 +
>> Thom Brown  wrote:
>>
>>
>>>
>>> Output of \set variables without patch:
>>>
>>> HOST = '127.0.0.1'
>>> PORT =
>>> '5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
>>>
>>> And with patch:
>>>
>>> HOST =
>>> '127.0.0.1:5530,127.0.0.1:5531,127.0.0.1:5532,127.0.0.1:5533,127.0.0.1:5534,127.0.0.1:5535'
>>> PORT = '5488'
>>>
>>> They're both wrong, but I'm hoping we can just show the right
>>> information here.
>>
>> I think we should show the right information here, but it is not so simple.
>>
>> The problem is that I never keep a symbolic representation of an individual
>> host/port pair. And when we connect successfully, we have only the struct
>> sockaddr representation of it, which contains the right IP address but does
>> not contain the symbolic host name.
>>
>> Moreover, one hostname from the connect string can produce more than one
>> addrinfo structure. For example, on machines with IPv6 support, the
>> 'localhost' hostname resolves to both the 127.0.0.1 IPv4 address and the
>> [::1] IPv6 address, and so produces two records.
>>
>> So would any name which has both A and AAAA records in DNS. And nothing
>> prevents a domain administrator from putting more than one A record for the
>> same hostname into a DNS zone.
>>
>>
>> So it is just the same information that can be retrieved from the backend
>> via
>>
>> select inet_client_addr();
>> select inet_client_port();
>
> I think you mean:
>
> select inet_server_addr();
> select inet_server_port();
>
>> What is really interesting for the HOST and PORT variables is the host name
>> and port number used to make the actual connection, as they were specified
>> in the connect string or service file.
>
> And this is probably not the correct thing for it to report.  The
> documentation says "The database server host you are currently
> connected to." and "The database server port to which you are
> currently connected.", so yeah, I'd expect to see those set to
> whatever those 2 functions resolve to.  That being said, if one
> connects via a domain socket, those appear to come back blank with
> those functions, yet HOST and PORT report the correct information in
> those cases (without passing in multiple hosts).  Is that a
> pre-existing bug?

I've just checked, and can see that this doesn't appear to be a bug.
As per network.c:

/*
 * IP address that the server accepted the connection on (NULL if Unix socket)
 */

and

/*
 * port that the server accepted the connection on (NULL if Unix socket)
 */

Thom




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-24 Thread David Rowley
On 24 January 2016 at 08:20, Tom Lane  wrote:
> David Rowley  writes:
>> On 23 January 2016 at 12:44, Tom Lane  wrote:
>>> * What you did to join.sql/join.out seems a bit bizarre.  The existing
>>> test case that you modified was meant to test something else, and
>>> conflating this behavior with the pre-existing one (and not documenting
>>> it) is confusing as can be.  A bit more work on regression test cases
>>> seems indicated.
>
>> The history behind that is that at one point while developing the patch,
>> that test started failing because the GROUP BY item was removed, thereby
>> allowing the join removal conditions to be met. On testing again with the
>> old test query, I see this no longer happens, so I've removed the change,
>> although the expected output still differs because the GROUP BY item is
>> removed.
>
> Hmm ... but ... it seems to me that the test as it stands *should* fail
> after this patch, because once the non-pkey grouping column is removed
> the join removal optimization should apply.  I think we should look a bit
> more closely at what's happening there.
>
> (IOW, I wasn't so much unhappy with the change to that test case as
> that it was being used as the only test case for this new behavior.
> I see you added some new, separate test cases, so that's good; but
> there's something fishy if the existing case doesn't change behavior.)

Thanks for looking at this again.

I've looked into why the join is not removed: the redundant GROUP BY
columns are removed during planning, and since the outer query is planned
before the subquery, when the join removal code checks whether the subquery
can be removed, the subquery has yet to be planned and so still contains
the two GROUP BY items.

Perhaps the useless columns could be removed a bit earlier, in parse
analysis. I will look into that now.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-24 Thread Tom Lane
David Rowley  writes:
> I've looked into why the join is not removed: the redundant GROUP BY
> columns are removed during planning, and since the outer query is planned
> before the subquery, when the join removal code checks whether the subquery
> can be removed, the subquery has yet to be planned and so still contains
> the two GROUP BY items.

Hmm ... but why did it get removed in the earlier patch version, then?

> Perhaps the useless columns could be removed a bit earlier, in parse
> analysis. I will look into that now.

No; doing this in parse analysis will be sufficient reason to reject the
patch.  That would mean adding a not-semantically-necessary dependency on
the pkey to a query when it is stored as a view.  It has to be done at
planning time and no sooner.

It's possible that you could integrate it into some earlier phase of
planning, like prepjointree, but I think that would be messy and likely
not worth it.  I don't see any existing query-tree traversal this could
piggyback on, and I doubt we want to add a new pass just for this.

regards, tom lane




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-24 Thread David Rowley
On 25 January 2016 at 10:17, Tom Lane  wrote:
> David Rowley  writes:
>> I've looked into why the join is not removed: the redundant GROUP BY
>> columns are removed during planning, and since the outer query is planned
>> before the subquery, when the join removal code checks whether the subquery
>> can be removed, the subquery has yet to be planned and so still contains
>> the two GROUP BY items.
>
> Hmm ... but why did it get removed in the earlier patch version, then?

I'm not sure now, it was months ago. Perhaps I misremembered and only
altered the test because I mistakenly anticipated it would break.

>> Perhaps the useless columns could be removed a bit earlier, in parse
>> analysis. I will look into that now.
>
> No; doing this in parse analysis will be sufficient reason to reject the
> patch.  That would mean adding a not-semantically-necessary dependency on
> the pkey to a query when it is stored as a view.  It has to be done at
> planning time and no sooner.
>
> It's possible that you could integrate it into some earlier phase of
> planning, like prepjointree, but I think that would be messy and likely
> not worth it.  I don't see any existing query-tree traversal this could
> piggyback on, and I doubt we want to add a new pass just for this.

It seems like a bit of a corner case anyway. Maybe it's fine as is.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] easy way of copying regex_t

2016-01-24 Thread Tom Lane
Artur Zakirov  writes:
> With this message I am sending a patch for your repository with a draft of
> code that allows shared_ispell to copy regex_t.

Allowing ispell.c to know that much about regex internals strikes me as
completely unacceptable from a modularity or maintainability standpoint.
If we want to do that, the appropriate thing would be to add a function
to backend/regex/ that copies a regex_t.

However, I'm rather suspicious of the safety of copying a regex_t into
shared memory in the first place.  It contains function pointers, which
we have not historically assumed would be portable between different
backend processes.  And the regex library is old enough to have never
heard of thread safety, so I'm not really sure that it considers the
regex_t structures to be read-only at execution time.

> shared_ispell loads dictionaries into shared memory. The main benefits are:
> - memory savings: every dictionary is loaded only once, rather than once
> per backend. In the current version of PostgreSQL, dictionaries are loaded
> in each backend where they are requested.
> - time savings: the first load of a dictionary takes a long time. With
> this patch, dictionaries are loaded only once.

Where does the shared memory space come from?  It would not take too
many dictionaries to use up whatever slop is available.

regards, tom lane




Re: [HACKERS] Why format() adds double quote?

2016-01-24 Thread Dickson S. Guedes
2016-01-24 8:04 GMT-02:00 Tatsuo Ishii :
>> On Wed, Jan 20, 2016 at 4:20 AM, Pavel Stehule  
>> wrote:
 If we would go this way, question is if we should back patch this or
 not since the patch apparently changes the existing
 behaviors. Comments?  I would think we should not.
>>>
>>> I am sure, so we should not backport this change. This can breaks customer
>>> regress tests - and the current behave isn't 100% correct, but it is safe.
>>
>> Quite.  This is not a bug fix.  It's a behavior change, perhaps for the 
>> better.
>
> Added to the commitfest 2016-03.

Hi,

I went ahead and tested this patch a little, and it works as proposed. I
agree that it's not a bug fix but a new behavior, so -1 for backporting.

While applying the patch against master
(1129c2b0ad2732f301f696ae2cf98fb063a4c1f8), two hunks apply with offsets.

Since format() has regression tests, I suggest that one be added to cover
this. It could be worth adding the new behavior to the docs, since there is
no explicit example for %I.

I performed the following tests, which work as expected, using some Portuguese words:

postgres=# create table test (nome varchar, endereço text, "UF"
varchar(2), título varchar);
CREATE TABLE
Time: 80,769 ms
postgres=# select format('%I', attname) from pg_attribute join
pg_class on (attrelid = oid) where relname = 'test';
  format
--
 "UF"
 cmax
 cmin
 ctid
 endereço
 nome
 tableoid
 título
 xmax
 xmin
(10 rows)

Time: 1,728 ms
postgres=# select format('%I', 'endereco');
  format
--
 endereco
(1 row)

Time: 0,098 ms
postgres=# select format('%I', 'endereço');
  format
--
 endereço
(1 row)

Time: 0,088 ms
postgres=# select format('%I', 'あああ');
 format

 あああ
(1 row)

Time: 0,072 ms
postgres=# select format('%I', 'título');
 format

 título
(1 row)

Time: 0,051 ms
postgres=# select format('%I', 'título e');
   format

 "título e"
(1 row)

Time: 0,051 ms
postgres=# select format('%I', 'título_e');
  format
--
 título_e
(1 row)

Time: 0,051 ms
postgres=# select format('%I', '_título');
 format
-
 _título
(1 row)

Time: 0,047 ms
postgres=# select format('%I', '1_título');
   format

 "1_título"
(1 row)

Time: 0,046 ms


Thank you for this!


Best regards,
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br

