Re: [GENERAL] regexp_matches where clause - PG 9.1

2017-04-05 Thread Arjen Nienhuis
On Apr 6, 2017 05:57, "Patrick B"  wrote:

Hi guys,

I've got this column:

path_name character varying(255)
>

I store the full S3 bucket path for my application's attachments in it; for
example:

/{s3bucket}/filesuser/client/27801123/attachment/4510/main
>
/{s3bucket}/filesuser/client/27801123/attachment/4510/file
>


I want to do a SELECT where path_name ends with 'main' and nothing else.


WHERE path_name LIKE '%/main'
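
(For illustration, a minimal sketch of both approaches -- the table name
"attachments" is made up here; the LIKE form is the one suggested above, and
the regex form is one possible way to answer the original regexp question:)

SELECT * FROM attachments         -- hypothetical table name
WHERE path_name LIKE '%/main';

-- or, with a regular expression anchored at the end of the string:
SELECT * FROM attachments
WHERE path_name ~ '/main$';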


Re: [GENERAL] A change in the Debian install

2017-04-05 Thread Adrian Klaver


On 04/05/2017 08:03 PM, rob stone wrote:

Hello,

Postgres is started via pg_ctl and NOT systemd.
Below are the log entries when version 9.6.2-1 is started.

2017-04-04 07:15:27 AESTLOG:  database system was shut down at 2017-04-
03 13:08:27 AEST
2017-04-04 07:15:28 AESTLOG:  MultiXact member wraparound protections
are now enabled
2017-04-04 07:15:28 AESTLOG:  database system is ready to accept
connections
2017-04-04 07:15:28 AESTLOG:  autovacuum launcher started


Upgraded to version 9.6.2-2 and these are the log entries on start-up:-


What repos are you using, the Debian or the PGDG one?

I guess the question I should really ask is, are you using a repo or 
some other method to upgrade?





2017-04-05 08:03:29 AESTLOG:  test message did not get through on
socket for statistics collector
2017-04-05 08:03:29 AESTLOG:  disabling statistics collector for lack
of working socket
2017-04-05 08:03:29 AESTWARNING:  autovacuum not started because of
misconfiguration
2017-04-05 08:03:29 AESTHINT:  Enable the "track_counts" option.
2017-04-05 08:03:29 AESTLOG:  database system was shut down at 2017-04-
04 13:05:46 AEST
2017-04-05 08:03:30 AESTLOG:  MultiXact member wraparound protections
are now enabled
2017-04-05 08:03:30 AESTLOG:  database system is ready to accept
connections

It is ignoring the PGDATA path and obtaining postgresql.conf from
/etc/postgresql/9.6/main.

Removed all the Postgres conf files from the /etc path and it is back
working as intended.


Is this just something the Debian package maintainers have done or do
we have to alter the start-up scripts to specify the config_file
setting?

The doco says "By default, all three configuration files are stored in
the database cluster's data directory." which IMHO means where PGDATA
is pointing.


That is packaging dependent. When using the Debian/Ubuntu 
postgresql-common system the postgresql.conf will be in 
/etc/postgresql/version/cluster_name/
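
A quick way to see which files a running server actually loaded is a couple of
SHOW commands (a minimal sketch; these are standard GUC inspection commands,
nothing Debian-specific):

SHOW config_file;      -- e.g. /etc/postgresql/9.6/main/postgresql.conf with Debian packaging
SHOW data_directory;   -- where the cluster's data (PGDATA) actually lives
SHOW hba_file;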




Has anybody else struck this issue?

Cheers,
Rob






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] A change in the Debian install

2017-04-05 Thread Tom Lane
rob stone  writes:
> Upgraded to version 9.6.2-2 and these are the log entries on start-up:-

> 2017-04-05 08:03:29 AESTLOG:  test message did not get through on
> socket for statistics collector

This is not something that would be affected by any Postgres setting;
the stats collector always depends on a loopback TCP connection to
"localhost", and has done so since day one.  If it was working before,
you need to look at what you changed related to network and firewall
configuration.  These symptoms look like some kernel firewall rule
deciding to block local-loopback TCP traffic, for what that's worth.

(But ... these statements are based on an assumption of out-of-the-
box Postgres behavior.  I would not exactly put it past the Debian
packagers to have decided to change this for reasons of their own,
and their track record of telling us about such decisions is many
miles south of abysmal.  So you might look at whatever patches
are in the Debian package to see if there's anything touching
pgstat.c's socket-setup logic.)

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] regexp_matches where clause - PG 9.1

2017-04-05 Thread Patrick B
Hi guys,

I've got this column:

path_name character varying(255)
>

I store the full S3 bucket path for my application's attachments in it; for
example:

/{s3bucket}/filesuser/client/27801123/attachment/4510/main
>
/{s3bucket}/filesuser/client/27801123/attachment/4510/file
>


I want to do a SELECT where path_name ends with 'main' and nothing else.

Maybe using regexp_matches, but then how do I put it into a WHERE clause?

Thanks!
Patrick.


Re: [GENERAL] Unexpected interval comparison

2017-04-05 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> By the way the adt directory is, as suggested by the name,
> storing files with names of SQL data types so "int128.c" among
> them seems incongruous. Is "int128_test.c" acceptable? int16.c
> will be placed there in case we ever support int16 or hugeint in SQL.

After further reflection I've decided to put int128.h in
src/include/common/, thinking that maybe someday it will be useful
on client side too.  Also I've changed the test harness file to
be src/tools/testint128.c, so that it won't be confused with code
meant to be part of the backend.

> Back to 9.5 seems reasonable to me.

I poked around and noticed that before 9.4, we did not attempt
to guard against overflows in interval calculations at all.
So backpatch to 9.4 seems pretty defensible.  The non-HAVE_INT128
code works fine in 9.4.

I've just about finished adjusting the patch for the back
branches, and will push in a little bit.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unexpected interval comparison

2017-04-05 Thread Kyotaro HORIGUCHI
At Wed, 05 Apr 2017 15:51:10 -0400, Tom Lane  wrote in 
<27982.1491421...@sss.pgh.pa.us>
> Hmm, this still isn't right --- testing shows that you had the comparison
> rule right the first time.

Perhaps Laplace's demon is continuously nudging my head toward the
wrong conclusion, sigh. Sorry for bothering you.

At Wed, 05 Apr 2017 17:06:53 -0400, Tom Lane  wrote in 
<385.1491426...@sss.pgh.pa.us>
> I wrote:
> > Looking at what we've got here, it's already a substantial fraction of
> > what would be needed to provide a compiler-independent implementation
> > of the int128-based aggregate logic in numeric.c.  With that in mind,
> > I propose that we extract the relevant stuff into a new header file
> > that is designed as general-purpose int128 support.

+1

> > Something like the
> > attached.  I also attach the test program I put together to verify it.

The new patch seems cleaner and looks fine to me, with (maybe) fresh eyes.

Since we have some known failure cases in the non-native
int128 arithmetic, I'd like to provide a regression test for them,
but that seems not so simple.

By the way the adt directory is, as suggested by the name,
storing files with names of SQL data types so "int128.c" among
them seems incongruous. Is "int128_test.c" acceptable? int16.c
will be placed there in case we ever support int16 or hugeint in SQL.

> Here's a fleshed-out patch for the original problem based on that.
> I found that direct int64-to-int128 coercions were also needed to
> handle some of the steps in timestamp.c, so I added those to int128.h.
> 
> I think it would be reasonable to back-patch this, although it would
> need some adjustments for the back branches since we only recently
> got rid of the float-timestamp option.  Also I'd not be inclined to
> depend on native int128 any further back than it was already in use.

Back to 9.5 seems reasonable to me.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] A change in the Debian install

2017-04-05 Thread rob stone
Hello,

Postgres is started via pg_ctl and NOT systemd.
Below are the log entries when version 9.6.2-1 is started.

2017-04-04 07:15:27 AESTLOG:  database system was shut down at 2017-04-
03 13:08:27 AEST
2017-04-04 07:15:28 AESTLOG:  MultiXact member wraparound protections
are now enabled
2017-04-04 07:15:28 AESTLOG:  database system is ready to accept
connections
2017-04-04 07:15:28 AESTLOG:  autovacuum launcher started


Upgraded to version 9.6.2-2 and these are the log entries on start-up:-

2017-04-05 08:03:29 AESTLOG:  test message did not get through on
socket for statistics collector
2017-04-05 08:03:29 AESTLOG:  disabling statistics collector for lack
of working socket
2017-04-05 08:03:29 AESTWARNING:  autovacuum not started because of
misconfiguration
2017-04-05 08:03:29 AESTHINT:  Enable the "track_counts" option.
2017-04-05 08:03:29 AESTLOG:  database system was shut down at 2017-04-
04 13:05:46 AEST
2017-04-05 08:03:30 AESTLOG:  MultiXact member wraparound protections
are now enabled
2017-04-05 08:03:30 AESTLOG:  database system is ready to accept
connections

It is ignoring the PGDATA path and obtaining postgresql.conf from
/etc/postgresql/9.6/main.

Removed all the Postgres conf files from the /etc path and it is back
working as intended.


Is this just something the Debian package maintainers have done or do
we have to alter the start-up scripts to specify the config_file
setting?

The doco says "By default, all three configuration files are stored in
the database cluster's data directory." which IMHO means where PGDATA
is pointing.

Has anybody else struck this issue?

Cheers,
Rob



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread George Neuner
On Wed, 5 Apr 2017 14:44:55 -0700, "David G. Johnston"
 wrote:

>A mailing list configuration that will automatically add on the OP to any
>email in a message thread lacking the OP would work-around those list
>respondents who would use "Reply" instead of "Reply All".  Keeping track of
>all respondents and adding them would be something to consider as well.
>
>The above would address the problem of our inability to provide a limited
>engagement channel for people seeking help without forcing them onto the
>-bugs list.

That seems like a good idea.  But having no experience with mailing
list administration or available software, I don't know how easy it
would be to implement.


>A second problem is how to easily allow people to join (both read-only and
>read-write) an ongoing conversation that isn't in their inbox.  Asking for
>a "forum" seems to be expressing a problem of this nature.  I'm deferring
>consideration of this problem-area for some other time.

That's [partly] what digest mailings are for ... to alert people to
interesting discussions they aren't following.  Unfortunately, most
lists don't provide digests by default.

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: Is this pgbouncer configuration suitable for a production environment with hundreds of databases?

2017-04-05 Thread Lisandro
Thank you very much David for your quick reply, I understand better now.

For now, I'll leave default_pool_size=2, and I guess I will have to monitor the
total number of databases and adjust configuration when needed, in order to
avoid reaching the postgres max_connection limit.

I think I can play a bit with some pgbouncer configuration directives, like
pool_size (per database), server_idle_timeout, server_connect_timeout and
idle_transaction_timeout.

Thanks for the help!
Regards,
Lisandro.



--
View this message in context: 
http://www.postgresql-archive.org/Is-this-pgbouncer-configuration-suitable-for-a-production-environment-with-hundreds-of-databases-tp5954479p5954499.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is this pgbouncer configuration suitable for a production environment with hundreds of databases?

2017-04-05 Thread David G. Johnston
On Wed, Apr 5, 2017 at 3:29 PM, Lisandro  wrote:

>
> The question is: can I set default_pool_size=0


ISTM the parameter would be better named (i.e., mentally remembered as)
"default_maximum_pool_size" ... and a zero for that seems like it would
render your system inoperable, since:

"Notice that I don't set a pool_size for every database, but instead I use
the general default_pool_size.

> *My goal is to find a pgbouncer configuration that allows me to add
> databases without worrying about reaching the postgresql max_connections=200
> limit. How can I achieve that?*

The presence of open issue # 103 on GitHub implies that what you want to do
is not possible.

https://github.com/pgbouncer/pgbouncer/issues/103

David J.


[GENERAL] Is this pgbouncer configuration suitable for a production environment with hundreds of databases?

2017-04-05 Thread Lisandro
Hi there!

I'm using pgbouncer in front of a PostgreSQL 9.3 instance.
I have hundreds of databases (almost 200 and counting).
The clients connect to pgbouncer always with the same user (there is one
only user).

Currently I have *postgresql max_connections=200*.
My pgbouncer configuration is this (I put only the directives regarding my
question):

[pgbouncer]
pool_mode = transaction
max_client_conn = 1
default_pool_size = 2
min_pool_size = 0
reserve_pool_size = 2
reserve_pool_timeout = 5
max_db_connections = 5
server_idle_timeout = 30

[databases]
db1 = host=localhost port=6543 dbname=db1
db2...
db3...

Notice that I don't set a pool_size for every database, but instead I use
the general default_pool_size.

Our project is growing, and we are constantly adding new databases, so it's
only a matter of time before we reach 300 or 400 databases. Most of
the databases have low activity (websites with very little traffic).

So, considering that we have set postgresql max_connections=200, then
(correct me if I'm wrong) we would be reaching the postgresql limit of
max_connections soon.

The question is: can I set default_pool_size=0 to avoid reaching the
postgres limit? Is that suitable?
*My goal is to find a pgbouncer configuration that allows me to add
databases without worrying about reaching the postgresql max_connections=200
limit. How can I achieve that?*

Thanks in advance!
Regards,
Lisandro.



--
View this message in context: 
http://www.postgresql-archive.org/Is-this-pgbouncer-configuration-suitable-for-a-production-environment-with-hundreds-of-databases-tp5954479.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread David G. Johnston
On Wed, Apr 5, 2017 at 2:19 PM, George Neuner  wrote:

>
> >So the question is, what is the problem we are trying to solve?
>
> How to support BOTH quick and dirty questions:answers AND complex
> technical discussions that require significant time from their
> participants.
>
>
​The problem of self-interest - people want to be able to ask their
questions and get responses to those questions without being inundated with
lots of other people asking questions that they don't have any intent to
answer.

Our bug reporting mechanism works pretty well in this regard - which is why
people choose to use it instead of -general.  Most (all) respondents on
-bugs will remember to Reply-All which is a critical element of making it
work in a two-way flow.  I don't know how much effort is spent moderating
that list...

A mailing list configuration that will automatically add on the OP to any
email in a message thread lacking the OP would work-around those list
respondents who would use "Reply" instead of "Reply All".  Keeping track of
all respondents and adding them would be something to consider as well.

The above would address the problem of our inability to provide a limited
engagement channel for people seeking help without forcing them onto the
-bugs list.

I don't think that "quick and dirty" vs. "complex" is necessarily addressed
here though likely the vast majority of posts would be of the former
style.  Maybe so much so that directing that traffic to a separate
"-questions" list would let people discriminate their participation between
the different traffic profiles there and on -general.

A second problem is how to easily allow people to join (both read-only and
read-write) an ongoing conversation that isn't in their inbox.  Asking for
a "forum" seems to be expressing a problem of this nature.  I'm deferring
consideration of this problem-area for some other time.

David J.
​


[GENERAL] Archiving data to another server using copy, psql with pipe

2017-04-05 Thread pinker
Hi,
I'm trying to write an archive manager which will first copy data from
tables with a WHERE clause and then, after a successful load into the second
server, delete them.
The simplest (and probably fastest) solution I came up with is to use copy:
psql -h localhost postgres -c "copy (SELECT * FROM a WHERE time < now()) to
stdout " | psql -h localhost  postgres   -c "copy b from stdin"

I have made a very simple test to check if I can be sure about "transactional"
safety. It's not two-phase commit of course, but it seems to throw an error
if something goes wrong, and it's atomic (I assume). The test was:

CREATE TABLE public.a
(
  id integer,
  k01 numeric (3)
);

CREATE TABLE public.b
(
  id integer,
  k01 numeric (1)
);

insert into a select n,n from generate_series(1,100) n;

and then:
psql -h localhost postgres -c "copy a to stdout "|psql -h localhost 
postgres   -c "copy b from stdin"

so psql threw an error and no rows were inserted into the b table - so it
seems to be OK.

Is there maybe something I'm missing?
Some specific condition under which something could go wrong and make the
process not atomic? (I don't care about data consistency in this particular case.)




--
View this message in context: 
http://www.postgresql-archive.org/Archiving-data-to-another-server-using-copy-psql-with-pipe-tp5954469.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread George Neuner
On Wed, 5 Apr 2017 10:57:23 -0700, "Joshua D. Drake"
 wrote:

>Stack Overflow (as an example) is a collaboration platform. Stack 
>understands the problem and is very, very good at solving it. It is why 
>they are successful.

Stack Overflow *is* successful ... at driving people away because any
complicated question that could lead to a lengthy discussion gets
closed by the moderators.

Hardly an example of "collaborative" behavior.


>Another example of a very good platform (that I can't stand) is Slack. 
>It has become so completely dominant in the growth space that even 
>Google is changing Hangouts because people were leaving in droves.

Slack is only slightly better.  IRC and other synchronous "rendezvous"
instant messaging methods are great for *simple* questions, but they
are *not* conducive to complex technical discussions.  

If you take time to craft a message [e.g., one lacking spelling or
grammatical errors], to gather information for someone trying to help,
or to try out someone's suggestion, very quickly you find yourself
inundated with "are you still there?" messages.


>So the question is, what is the problem we are trying to solve?

How to support BOTH quick and dirty questions:answers AND complex
technical discussions that require significant time from their
participants.


George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unexpected interval comparison

2017-04-05 Thread Tom Lane
I wrote:
> Looking at what we've got here, it's already a substantial fraction of
> what would be needed to provide a compiler-independent implementation
> of the int128-based aggregate logic in numeric.c.  With that in mind,
> I propose that we extract the relevant stuff into a new header file
> that is designed as general-purpose int128 support.  Something like the
> attached.  I also attach the test program I put together to verify it.

Here's a fleshed-out patch for the original problem based on that.
I found that direct int64-to-int128 coercions were also needed to
handle some of the steps in timestamp.c, so I added those to int128.h.

I think it would be reasonable to back-patch this, although it would
need some adjustments for the back branches since we only recently
got rid of the float-timestamp option.  Also I'd not be inclined to
depend on native int128 any further back than it was already in use.

regards, tom lane

diff --git a/src/backend/utils/adt/int128.c b/src/backend/utils/adt/int128.c
index ...8bc0663 .
*** a/src/backend/utils/adt/int128.c
--- b/src/backend/utils/adt/int128.c
***
*** 0 
--- 1,181 
+ /*-
+  *
+  * int128.c
+  *	  Testbed for roll-our-own 128-bit integer arithmetic.
+  *
+  * This file is not meant to be compiled into the backend; rather, it builds
+  * a standalone test program that compares the behavior of an implementation
+  * in int128.h to an (assumed correct) int128 native type.
+  *
+  * Copyright (c) 2017, PostgreSQL Global Development Group
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/int128.c
+  *
+  *-
+  */
+ 
+ #include "postgres.h"
+ 
+ /*
+  * By default, we test the non-native implementation in int128.h; but
+  * by predefining USE_NATIVE_INT128 to 1, you can test the native
+  * implementation, just to be sure.
+  */
+ #ifndef USE_NATIVE_INT128
+ #define USE_NATIVE_INT128 0
+ #endif
+ 
+ #include "utils/int128.h"
+ 
+ /*
+  * We assume the parts of this union are laid out compatibly.
+  */
+ typedef union
+ {
+ 	int128		i128;
+ 	INT128		I128;
+ 	union
+ 	{
+ #ifdef WORDS_BIGENDIAN
+ 		int64		hi;
+ 		uint64		lo;
+ #else
+ 		uint64		lo;
+ 		int64		hi;
+ #endif
+ 	}			hl;
+ } test128;
+ 
+ 
+ /*
+  * Control version of comparator.
+  */
+ static inline int
+ my_int128_compare(int128 x, int128 y)
+ {
+ 	if (x < y)
+ 		return -1;
+ 	if (x > y)
+ 		return 1;
+ 	return 0;
+ }
+ 
+ /*
+  * Get a random uint64 value.
+  * We don't assume random() is good for more than 16 bits.
+  */
+ static uint64
+ get_random_uint64(void)
+ {
+ 	uint64		x;
+ 
+ 	x = (uint64) (random() & 0xFFFF) << 48;
+ 	x |= (uint64) (random() & 0xFFFF) << 32;
+ 	x |= (uint64) (random() & 0xFFFF) << 16;
+ 	x |= (uint64) (random() & 0xFFFF);
+ 	return x;
+ }
+ 
+ /*
+  * Main program.
+  *
+  * You can give it a loop count if you don't like the default 1B iterations.
+  */
+ int
+ main(int argc, char **argv)
+ {
+ 	long		count;
+ 
+ 	if (argc >= 2)
+ 		count = strtol(argv[1], NULL, 0);
+ 	else
+ 		count = 1000000000;
+ 
+ 	while (count-- > 0)
+ 	{
+ 		int64		x = get_random_uint64();
+ 		int64		y = get_random_uint64();
+ 		int64		z = get_random_uint64();
+ 		test128		t1;
+ 		test128		t2;
+ 
+ 		/* check unsigned addition */
+ 		t1.hl.hi = x;
+ 		t1.hl.lo = y;
+ 		t2 = t1;
+ 		t1.i128 += (int128) (uint64) z;
+ 		int128_add_uint64(, (uint64) z);
+ 
+ 		if (t1.hl.hi != t2.hl.hi || t1.hl.lo != t2.hl.lo)
+ 		{
+ 			printf("%016lX%016lX + unsigned %lX\n", x, y, z);
+ 			printf("native = %016lX%016lX\n", t1.hl.hi, t1.hl.lo);
+ 			printf("result = %016lX%016lX\n", t2.hl.hi, t2.hl.lo);
+ 			return 1;
+ 		}
+ 
+ 		/* check signed addition */
+ 		t1.hl.hi = x;
+ 		t1.hl.lo = y;
+ 		t2 = t1;
+ 		t1.i128 += (int128) z;
+ 		int128_add_int64(, z);
+ 
+ 		if (t1.hl.hi != t2.hl.hi || t1.hl.lo != t2.hl.lo)
+ 		{
+ 			printf("%016lX%016lX + signed %lX\n", x, y, z);
+ 			printf("native = %016lX%016lX\n", t1.hl.hi, t1.hl.lo);
+ 			printf("result = %016lX%016lX\n", t2.hl.hi, t2.hl.lo);
+ 			return 1;
+ 		}
+ 
+ 		/* check multiplication */
+ 		t1.i128 = (int128) x *(int128) y;
+ 
+ 		t2.hl.hi = t2.hl.lo = 0;
+ 		int128_add_int64_mul_int64(, x, y);
+ 
+ 		if (t1.hl.hi != t2.hl.hi || t1.hl.lo != t2.hl.lo)
+ 		{
+ 			printf("%lX * %lX\n", x, y);
+ 			printf("native = %016lX%016lX\n", t1.hl.hi, t1.hl.lo);
+ 			printf("result = %016lX%016lX\n", t2.hl.hi, t2.hl.lo);
+ 			return 1;
+ 		}
+ 
+ 		/* check comparison */
+ 		t1.hl.hi = x;
+ 		t1.hl.lo = y;
+ 		t2.hl.hi = z;
+ 		t2.hl.lo = get_random_uint64();
+ 
+ 		if (my_int128_compare(t1.i128, t2.i128) !=
+ 			int128_compare(t1.I128, t2.I128))
+ 		{
+ 			printf("comparison failure: %d vs %d\n",
+    my_int128_compare(t1.i128, t2.i128),
+    int128_compare(t1.I128, t2.I128));
+ 			printf("arg1 = %016lX%016lX\n", t1.hl.hi, t1.hl.lo);
+ 			printf("arg2 = %016lX%016lX\n", 

Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread George Neuner
On Wed, 5 Apr 2017 11:39:17 -0700, John R Pierce 
wrote:

>On 4/5/2017 11:30 AM, George Neuner wrote:
>> This makes it difficult to follow a discussion via email, and Google's
>> list handling is flawed - it sometimes breaks the underlying list
>> threading [while keeping its own GUI correct], and broken threads can
>> be hard to follow even with a decent news reader.
>
>near as I can tell, gmail ignores the threading headers, and just 
>threads based on subjects.

In my experience it isn't consistent - I suspect Google's server
configurations are not uniform.  I follow a number of lists routinely,
but I see the threading problems only occasionally, and it seems to
follow certain participants.

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there a point to having both a normal gist index and an exclude index?

2017-04-05 Thread Tom Lane
Bruno Wolff III  writes:
> P.S. Using spgist with version 10 for the exclude index is much faster 
> than using gist in 9.6. I have run the index creation for as long as 
> 6 hours and it hasn't completed with 9.6. It took less than 10 minutes 
> to create it in 10. For this project using 10 isn't a problem and I'll 
> be doing that.

Interesting.  That probably traces back to Emre Hasegeli's work from
last year (commit 77e290682).

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unexpected interval comparison

2017-04-05 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> The attached patch is the revised version.

Hmm, this still isn't right --- testing shows that you had the comparison
rule right the first time.

Looking at what we've got here, it's already a substantial fraction of
what would be needed to provide a compiler-independent implementation
of the int128-based aggregate logic in numeric.c.  With that in mind,
I propose that we extract the relevant stuff into a new header file
that is designed as general-purpose int128 support.  Something like the
attached.  I also attach the test program I put together to verify it.

On my Fedora 25 laptop, it appears that the hand-rolled implementation
is actually respectably fast compared to gcc's "native" functionality;
the test program runs in ~2m for 1B iterations with the native logic,
and ~2.5m with the hand-rolled logic.  Allowing for overhead and the
fact that we're doing the arithmetic twice, we're probably within 2X
of the native code.  Not bad at all.

I'm not entirely sure what to do with the test program:
1. discard it
2. commit it as utils/adt/int128.c, as suggested in its comment
3. commit it somewhere else, maybe src/tools/.

Thoughts?

regards, tom lane

/*-
 *
 * int128.h
 *Roll-our-own 128-bit integer arithmetic.
 *
 * We make use of the native int128 type if there is one, otherwise
 * implement things the hard way based on two int64 halves.
 *
 * Copyright (c) 2017, PostgreSQL Global Development Group
 *
 * src/include/utils/int128.h
 *
 *-
 */
#ifndef INT128_H
#define INT128_H

/*
 * For testing purposes, use of native int128 can be switched on/off by
 * predefining USE_NATIVE_INT128.
 */
#ifndef USE_NATIVE_INT128
#ifdef HAVE_INT128
#define USE_NATIVE_INT128 1
#else
#define USE_NATIVE_INT128 0
#endif
#endif


#if USE_NATIVE_INT128

typedef int128 INT128;

/*
 * Add an unsigned int64 value into an INT128 variable.
 */
static inline void
int128_add_uint64(INT128 *i128, uint64 v)
{
*i128 += v;
}

/*
 * Add a signed int64 value into an INT128 variable.
 */
static inline void
int128_add_int64(INT128 *i128, int64 v)
{
*i128 += v;
}

/*
 * Add the 128-bit product of two int64 values into an INT128 variable.
 *
 * XXX with a stupid compiler, this could actually be less efficient than
 * the other implementation; maybe we should do it by hand always?
 */
static inline void
int128_add_int64_mul_int64(INT128 *i128, int64 x, int64 y)
{
*i128 += (int128) x *(int128) y;
}

/*
 * Compare two INT128 values, return -1, 0, or +1.
 */
static inline int
int128_compare(INT128 x, INT128 y)
{
if (x < y)
return -1;
if (x > y)
return 1;
return 0;
}

#else   /* !USE_NATIVE_INT128 */

/*
 * We lay out the INT128 structure with the same content and byte ordering
 * that a native int128 type would (probably) have.  This makes no difference
 * for ordinary use of INT128, but allows union'ing INT128 with int128 for
 * testing purposes.
 */
typedef struct
{
#ifdef WORDS_BIGENDIAN
int64   hi; /* most significant 64 
bits, including sign */
uint64  lo; /* least significant 64 
bits, without sign */
#else
uint64  lo; /* least significant 64 
bits, without sign */
int64   hi; /* most significant 64 
bits, including sign */
#endif
} INT128;

/*
 * Add an unsigned int64 value into an INT128 variable.
 */
static inline void
int128_add_uint64(INT128 *i128, uint64 v)
{
/*
 * First add the value to the .lo part, then check to see if a carry 
needs
 * to be propagated into the .hi part.  A carry is needed if both inputs
 * have high bits set, or if just one input has high bit set while the 
new
 * .lo part doesn't.  Remember that .lo part is unsigned; we cast to
 * signed here just as a cheap way to check the high bit.
 */
uint64  oldlo = i128->lo;

i128->lo += v;
if (((int64) v < 0 && (int64) oldlo < 0) ||
(((int64) v < 0 || (int64) oldlo < 0) && (int64) i128->lo >= 0))
i128->hi++;
}

/*
 * Add a signed int64 value into an INT128 variable.
 */
static inline void
int128_add_int64(INT128 *i128, int64 v)
{
/*
 * This is much like the above except that the carry logic differs for
 * negative v.  Ordinarily we'd need to subtract 1 from the .hi part
 * (corresponding to adding the sign-extended bits of v to it); but if
 * there is a carry out of the .lo part, that cancels and we do nothing.
 */
uint64  oldlo = i128->lo;

i128->lo += 

Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Joshua D. Drake

On 04/05/2017 11:46 AM, Magnus Hagander wrote:

On Wed, Apr 5, 2017 at 7:22 PM, Joshua D. Drake > wrote:




Stackoverflow gives back by providing an interface people want to
use. It is free (as in beer) and is hugely popular.


I think one of the greatest things that Stackoverflow brings isn't
actually the interface (I for one can't stand it, but I'm clearly not
the target group here), but it's the fact that they have the *userbase*
of people. We have a userbase of "people already using postgres and many
of them having done so for some time because there's a threshold to get
over to join this mailinglist thing". Stackoverflow has a userbase that
is orders of magnitude higher, because they provide a venue for people
to ask questions about *anything* -- so they can use the same venue to
ask about their programming language, their framework-du-jour, their
database, their operating system etc etc.


I would agree with that.



This is one reason why I don't think having PostgreSQL dedicated web
forums would actually be very interesting today. Those people who prefer
to use the web as their media are more likely to already be using other
platforms which bring them *more value* than a PostgreSQL dedicated
forum ever would. And they don't have to sign up for Yet Another
Account. And they can work on whatever credit-style-kickback their
favorite platform does.


Which is a reasonable opinion and why my point is more about interfacing 
with those external communities in some positive fashion (vs propping 
our own infrastructure).




We need to be embracing these external communities because it is
where our growth is. I run into people every single week that
absolutely refuse to join these lists. They want nothing to do with
email and they have good reason.



Fully agreed. And I think we're better off doing that than to try to
rebuild our own version of those communities.


+1

Thanks,

JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Magnus Hagander
On Wed, Apr 5, 2017 at 7:22 PM, Joshua D. Drake 
wrote:

> On 03/24/2017 11:45 PM, Tom Lane wrote:
>
>> John R Pierce  writes:
>>
>>> On 3/24/2017 9:49 PM, Yuri Budilov wrote:
>>>
>>
> They are uniformly unfriendly when viewed from this end of the
>> relationship.  nabble for instance reposts stuff into the mailing lists
>> that is missing critical portions.  stackoverflow doesn't seem to think
>> they have any responsibility to give back at all.
>>
>
> Stackoverflow gives back by providing an interface people want to use. It
> is free (as in beer) and is hugely popular.
>

I think one of the greatest things that Stackoverflow brings isn't actually
the interface (I for one can't stand it, but I'm clearly not the target
group here), but it's the fact that they have the *userbase* of people. We
have a userbase of "people already using postgres and many of them having
done so for some time because there's a threshold to get over to join this
mailinglist thing". Stackoverflow has a userbase that is orders of
magnitude higher, because they provide a venue for people to ask questions
about *anything* -- so they can use the same venue to ask about their
programming language, their framework-du-jour, their database, their
operating system etc etc.

This is one reason why I don't think having PostgreSQL dedicated web forums
would actually be very interesting today. Those people who prefer to use
the web as their media are more likely to already be using other platforms
which bring them *more value* than a PostgreSQL dedicated forum ever would.
And they don't have to sign up for Yet Another Account. And they can work
on whatever credit-style-kickback their favorite platform does.



> We need to be embracing these external communities because it is where our
> growth is. I run into people every single week that absolutely refuse to
> join these lists. They want nothing to do with email and they have good
> reason.
>


Fully agreed. And I think we're better off doing that than to try to
rebuild our own version of those communities.

Personally, I couldn't stand going through StackOverflow on a regular basis
trying to check if Postgres related questions are answered or not etc.
Luckily, we have other community members who *are* willing to do that, and
they make that platform work. So I'm very grateful for those people doing
it, even better that it's not me.


And also, if somebody wants to take another stab at trying to make web
forums for PostgreSQL, I say let them try. I don't think it would work, but
I'd be happy to be proven wrong.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
Daniel Westermann  writes: 
>> Thank you, Merlin. As said I know that "not in" is not a good choice in this
>> case but I still do not get what is going on here. Why does the query
>> repeatedly search for NULL values when I decrease work_mem and why not when
>> increasing work_mem?

>The core point is that one plan is using a hashed subplan and the other is 
>not, because the planner estimated that the hashtable wouldn't fit into 
>work_mem. With a hashtable you'll have one probe into the hashtable per 
>outer row, and each probe is O(1) unless you are unlucky about data 
>distributions, so the runtime is more or less linear. Without a 
>hashtable, the inner table is rescanned for each outer row, so the 
>runtime is O(N^2) which gets pretty bad pretty fast. "Materializing" 
>the inner table doesn't really help: it gets rid of per-inner-row 
>visibility checks and some buffer locking overhead, so it cuts the 
>constant factor some, but the big-O situation is still disastrous. 
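
(For illustration, a minimal sketch of the two query shapes being discussed --
table and column names are invented; NOT EXISTS gives the planner an anti-join
regardless of work_mem, though note it treats NULLs differently than NOT IN:)

-- hashed subplan only if the subquery result fits in work_mem:
SELECT * FROM orders o
WHERE o.customer_id NOT IN (SELECT b.id FROM blocked_customers b);

-- same intent, planned as an anti-join:
SELECT * FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM blocked_customers b
                  WHERE b.id = o.customer_id);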

Thanks, Tom 


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread John R Pierce

On 4/5/2017 11:30 AM, George Neuner wrote:

This makes it difficult to follow a discussion via email, and Google's
list handling is flawed - it sometimes breaks the underlying list
threading [while keeping its own GUI correct], and broken threads can
be hard to follow even with a decent news reader.


near as I can tell, gmail ignores the threading headers, and just 
threads based on subjects.


--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread George Neuner
On Wed, 5 Apr 2017 09:31:59 -0700, Adrian Klaver
 wrote:

>On 04/05/2017 09:17 AM, Magnus Hagander wrote:
>
>> This has been tried a number of times. I'ts been a couple of years since
>> I last saw one, but multiple people have set up forums, either mirrored
>> or not. They have all died because of either lack of usage or because
>> the person who did it disappeared.
>
>Mostly, because they did not work well and the folks on this end of the 
>process had to do more work to get the information necessary to answer 
>the question. I know I eventually stopped responding to the questions 
>from those sources because it was difficult to follow the information 
>flow. Namely you had to crawl back up to the forum to get information 
>and then the email thread had mix of information that made it through on 
>its own and some subset of information that dedicated people pulled in 
>from the forum. That mix depended on dedication level and time available.

That's the same observation I made about list participants who
subscribe through Google Groups  ... they often don't [think to] make
the effort to quote or attribute properly because *they* can simply
look back up the thread to see what was written and by whom.  

This makes it difficult to follow a discussion via email, and Google's
list handling is flawed - it sometimes breaks the underlying list
threading [while keeping its own GUI correct], and broken threads can
be hard to follow even with a decent news reader.
[Postgresql lists are available through NNTP: e.g., at Gmane.org].

YMMV,
George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there a point to having both a normal gist index and an exclude index?

2017-04-05 Thread Bruno Wolff III

On Wed, Apr 05, 2017 at 12:11:09 -0600,
 Rob Sargent  wrote:



On 04/05/2017 12:04 PM, Bruno Wolff III wrote:

On Wed, Apr 05, 2017 at 00:05:31 -0400,
Tom Lane  wrote:

Bruno Wolff III  writes:

... I create both a normal gist index and an exclude index using the
following:
CREATE INDEX contains ON iplocation USING gist (network inet_ops);
ALTER TABLE iplocation
 ADD CONSTRAINT overlap EXCLUDE USING gist (network inet_ops WITH &&);



But I am wondering if it is useful to have the normal gist index for
finding netblocks containing a specific IP address, as it seems 
like the

exclude index should be usable for that as well.


No, that manually-created index is completely redundant with the
constraint index.


Thanks.

P.S. Using spgist with version 10 for the exclude index is much 
faster than using gist in 9.6. I have run the index creation for as 
long as 6 hours and it hasn't completed with 9.6. It took less than 
10 minutes to create it in 10. For this project using 10 isn't a 
problem and I'll be doing that.




That's an incredible difference.  Is it believable? Same resource, etc?


Same data, same load scripts other than spgist replacing gist and pointing 
to the 10 server instead of the 9.6 server.


If gist is scaling at n^2 because of bad splits, then with 3.5M records 
I could see that big of a difference if spgist is n log n. I don't know for 
sure if that was what is really going on. The index creation seems to 
be CPU bound rather than I/O bound as it is pegging a CPU.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Adrian Klaver

On 04/05/2017 10:57 AM, Joshua D. Drake wrote:

On 04/05/2017 10:45 AM, Adrian Klaver wrote:

On 04/05/2017 10:26 AM, Tim Clarke wrote:



+1 Joshua, that's the best reason I've heard so far and it seems very
powerful to me. The more readers we have and the easier they can
communicate with us (doesn't matter if they are "wrong") then  the
better all round for Postgres.


This implies that ease of communication = quality of communication and I
am not buying it. Exhibit A, Twitter.


Adrian,

I am afraid that you misunderstand the problem. The idea that you would
use Twitter as the example is a perfect illustration of this. Twitter is
*not* a collaboration platform. It is a promotion platform and it does
it very well.


More a commentary on ease of use trumping content.



Stack Overflow (as an example) is a collaboration platform. Stack
understands the problem and is very, very good at solving it. It is why
they are successful.

Another example of a very good platform (that I can't stand) is Slack.
It has become so completely dominant in the growth space that even
Google is changing Hangouts because people were leaving in droves.

So the question is, what is the problem we are trying to solve?

I posed that the problem is that we are hostile toward communities that
don't communicate and collaborate in the way we feel is "correct".


Not being hostile to other communities, just pointing out the reality. 
When push comes to shove and the other methods of collaboration fail, 
said communities then refer the OP to this list or others in the 
Postgres mailing list galaxy to get an answer. Seems to me folks would 
save a lot of time and effort just joining the list instead of hopping 
around looking for a quick answer. Reminds me of college where I saw 
people spend more time looking for the 'cheats' instead of just studying 
the material. The material is here, the people that know the material 
are here, get over your bias and use the resource.




You have proved that the way I posed the problem is accurate.

Thank you for your participation,


JD






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there a point to having both a normal gist index and an exclude index?

2017-04-05 Thread Rob Sargent



On 04/05/2017 12:04 PM, Bruno Wolff III wrote:

On Wed, Apr 05, 2017 at 00:05:31 -0400,
 Tom Lane  wrote:

Bruno Wolff III  writes:

... I create both a normal gist index and an exclude index using the
following:
CREATE INDEX contains ON iplocation USING gist (network inet_ops);
ALTER TABLE iplocation
  ADD CONSTRAINT overlap EXCLUDE USING gist (network inet_ops WITH &&);



But I am wondering if it is useful to have the normal gist index for
finding netblocks containing a specific IP address, as it seems like 
the

exclude index should be usable for that as well.


No, that manually-created index is completely redundant with the
constraint index.


Thanks.

P.S. Using spgist with version 10 for the exclude index is much faster 
than using gist in 9.6. I have run the index creation for as long as 6 
hours and it hasn't completed with 9.6. It took less than 10 minutes 
to create it in 10. For this project using 10 isn't a problem and I'll 
be doing that.




That's an incredible difference.  Is it believable? Same resource, etc?



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is there a point to having both a normal gist index and an exclude index?

2017-04-05 Thread Bruno Wolff III

On Wed, Apr 05, 2017 at 00:05:31 -0400,
 Tom Lane  wrote:

Bruno Wolff III  writes:

... I create both a normal gist index and an exclude index using the
following:
CREATE INDEX contains ON iplocation USING gist (network inet_ops);
ALTER TABLE iplocation
  ADD CONSTRAINT overlap EXCLUDE USING gist (network inet_ops WITH &&);



But I am wondering if it is useful to have the normal gist index for
finding netblocks containing a specific IP address, as it seems like the
exclude index should be usable for that as well.


No, that manually-created index is completely redundant with the
constraint index.


Thanks.

P.S. Using spgist with version 10 for the exclude index is much faster 
than using gist in 9.6. I have run the index creation for as long as 
6 hours and it hasn't completed with 9.6. It took less than 10 minutes 
to create it in 10. For this project using 10 isn't a problem and I'll 
be doing that.
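
(For reference, a sketch of the spgist-based exclusion constraint being
described, assuming the same table and opclass names as the CREATE statements
quoted above; SP-GiST support for inet_ops is what's new in version 10:)

ALTER TABLE iplocation
  ADD CONSTRAINT overlap EXCLUDE USING spgist (network inet_ops WITH &&);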



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Joshua D. Drake

On 04/05/2017 10:45 AM, Adrian Klaver wrote:

On 04/05/2017 10:26 AM, Tim Clarke wrote:



+1 Joshua, that's the best reason I've heard so far and it seems very
powerful to me. The more readers we have and the easier they can
communicate with us (it doesn't matter if they are "wrong"), the
better all round for Postgres.


This implies that ease of communication = quality of communication and I
am not buying it. Exhibit A, Twitter.


Adrian,

I am afraid that you misunderstand the problem. The idea that you would 
use Twitter as the example is a perfect illustration of this. Twitter is 
*not* a collaboration platform. It is a promotion platform and it does 
it very well.


Stack Overflow (as an example) is a collaboration platform. Stack 
understands the problem and is very, very good at solving it. It is why 
they are successful.


Another example of a very good platform (that I can't stand) is Slack. 
It has become so completely dominant in the growth space that even 
Google is changing Hangouts because people were leaving in droves.


So the question is, what is the problem we are trying to solve?

I posed that the problem is that we are hostile toward communities that 
don't communicate and collaborate in the way we feel is "correct".


You have proved that the way I posed the problem is accurate.

Thank you for your participation,


JD



--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Vincent Elschot



Op 05/04/2017 om 19:26 schreef Tim Clarke:

On 05/04/17 18:22, Joshua D. Drake wrote:

Stackoverflow gives back by providing an interface people want to use.
It is free (as in beer) and is hugely popular.

We need to be embracing these external communities because it is where
our growth is. I run into people every single week that absolutely
refuse to join these lists. They want nothing to do with email and
they have good reason.

JD


+1 Joshua, that's the best reason I've heard so far and it seems very
powerful to me. The more readers we have and the easier they can
communicate with us (it doesn't matter if they are "wrong"), the
better all round for Postgres.

Tim



+1 This is what I was getting at before. I need to learn to phrase 
things better :-)




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Adrian Klaver

On 04/05/2017 10:26 AM, Tim Clarke wrote:


On 05/04/17 18:22, Joshua D. Drake wrote:

Stackoverflow gives back by providing an interface people want to use.
It is free (as in beer) and is hugely popular.

We need to be embracing these external communities because it is where
our growth is. I run into people every single week that absolutely
refuse to join these lists. They want nothing to do with email and
they have good reason.

JD



+1 Joshua, that's the best reason I've heard so far and it seems very
powerful to me. The more readers we have and the easier they can
communicate with us (it doesn't matter if they are "wrong"), the
better all round for Postgres.


This implies that ease of communication = quality of communication and I 
am not buying it. Exhibit A, Twitter.




Tim




--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Joshua D. Drake

On 03/24/2017 09:49 PM, Yuri Budilov wrote:

Hello everyone
Can these forums be moved to internet ?
All these emails is so 1990s.
So hard to follow, so hard to search for historical answers.
We really need to be able to post via browser.

best regards to everyone



You are going to find that the .Org community is generally hostile 
toward non-email centric communication. My recommendation is to look 
toward these very useful external communities:


* https://plus.google.com/communities/116371937400081693174
* https://www.facebook.com/groups/postgres/
* http://www.stackoverflow.com/
* http://reddit.com/r/postgresql

If you would like a highly dynamic environment, you may try IRC. Yes it 
is an old school protocol but over 1000 people hangout on that channel 
and there are a lot of them that try to help.


IRC:
* Server: irc.freenode.net
* Channel: #postgresql

Other collaborative platforms:

* https://gitter.im/postgresmen/postgresql

If you are looking for an awesome interface to the IRC channel, you can 
try Riot:


* https://riot.im/app/#/room/#freenode_#postgresql:matrix.org


Thanks for trying to participate,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Advise on primary key for detail tables (OS: Raspberry Pi)

2017-04-05 Thread George Neuner
On Wed, 5 Apr 2017 18:04:39 +0300, Ertan Küçükoğlu
 wrote:

>I have a project which will be mainly built on Raspberry Pi and some parts
>on Windows.
>
>I must have a PostgreSQL 9.4.10 running on Raspberry Pi and another
>PostgreSQL running on Windows. Though, there is still a possibility that
>Windows database server will be something else that is not known to me, yet.
>Since Raspberry Pi is running on a SD Card, data saved on Raspberry Pi will
>be copied over to Windows database system for a proper backup & disaster
>recovery.
>
>I need to keep database server overhead as low as possible on the Raspberry Pi
>system. That is because the software that will be running is going to do some
>time-critical sensor communication.

Umm ... in my opinion, Postgresql is not a good choice for an embedded
database, and particularly for a low power platform like Pi.

Postgresql is a heavyweight DBMS: it uses process based parallelism
[not threads], its memory requirements are higher than some competing
platforms, and it requires (at least periodic) administration to keep
it running well.

You should examine whether you *really* need a DBMS at all or only
need some kind of structured storage.  Then consider whether you
really need a *server* based solution, or if you really only need
shared files.

If you want to stay with SQL for a common access language, then see if
SQLite, MySQL Embedded, or Firebird Embedded will do the job.

If you only need structured storage and are willing to give up SQL,
then there are many other choices available.


>I am about to start table designs on Raspberry Pi. There is one
>master-detail-detail-detail structure I should implement. Master having
>serial, uuid and some varchar fields. Uuid field being primary key. Details
>have serial, uuid and some smallint fields.
>
>I recall that it is "generally" advised to have a primary key on any table
>used on a database server.

You don't need a "primary" key column per se ... but for almost all
purposes you *DO* need a unique key - which may be a combination of
columns - that can distinguish individual rows in the table.


>My question is: will reading performance be faster if I remove the primary
>key on the serial fields of the detail tables and use a regular index on the
>master table link fields only? In other words, is it advisable *not* to have
>a primary key on a PostgreSQL table?

In Postgresql, there is nothing special about a "primary" key index.
And having one won't necessarily improve speed - Postgresql decides
whether or not to even use an index based on many factors such as the
size of the table, whether the index is applicable to the query, how
many rows are expected to be returned [or changed], how much memory is
available, etc.
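
For illustration, a minimal sketch of a detail table that relies on a
composite unique key instead of a surrogate primary key (table and column
names here are invented; the master table's uuid primary key is as described
above, and whether this is actually faster depends on the factors listed):

CREATE TABLE detail (
    master_uuid uuid     NOT NULL REFERENCES master (uuid),
    seq         integer  NOT NULL,
    reading     smallint,
    UNIQUE (master_uuid, seq)   -- composite unique key, no separate PRIMARY KEY
);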


>If the answer changes according to the underlying OS, I'd appreciate replies
>indicating so.

Windows is somewhat less efficient than Unix/Linux at sharing library
code (DLLs) between/among processes.  Postgresql starts several admin
processes to begin with, and then starts a new process for each client
connection.


>Thanks & regards,
>Ertan Küçükoğlu

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Tim Clarke

On 05/04/17 18:22, Joshua D. Drake wrote:
> Stackoverflow gives back by providing an interface people want to use.
> It is free (as in beer) and is hugely popular.
>
> We need to be embracing these external communities because it is where
> our growth is. I run into people every single week that absolutely
> refuse to join these lists. They want nothing to do with email and
> they have good reason.
>
> JD
>

+1 Joshua, that's the best reason I've heard so far and it seems very
powerful to me. The more readers we have and the easier they can
communicate with us (it doesn't matter if they are "wrong"), the
better all round for Postgres.

Tim





Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Joshua D. Drake

On 03/24/2017 11:45 PM, Tom Lane wrote:

John R Pierce  writes:

On 3/24/2017 9:49 PM, Yuri Budilov wrote:



They are uniformly unfriendly when viewed from this end of the
relationship.  nabble for instance reposts stuff into the mailing lists
that is missing critical portions.  stackoverflow doesn't seem to think
they have any responsibility to give back at all.


Stackoverflow gives back by providing an interface people want to use. 
It is free (as in beer) and is hugely popular.


We need to be embracing these external communities because it is where 
our growth is. I run into people every single week that absolutely 
refuse to join these lists. They want nothing to do with email and they 
have good reason.


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread John McKown
On Wed, Apr 5, 2017 at 11:51 AM, Steve Litt 
wrote:

> ​
>
>
> In addition, once you subscribe to a mailing list, all info comes to
> you. No password necessary. Read, reply, lightning quick.
>
> Contrast this with forums, where you have to remember to go out to each
> and every forum you're interested in, put in the password, and then
> operate within the work-flow of the forum.
>
> I'm subscribed to mailing lists of 20 LUGs. Can you imagine the
> inconvenience if I had to go out to each one and put in a password just
> to see if there's anything new? With mailing lists, the information
> comes to you, instead of making you go out to it.
>

​This is exactly my thought on the subject, but phrased better that I could
have done it. I don't mind an "archive" web site which records all of the
emails. And if it properly threads them, that is even better. I have that
on a number of my lists. And you can even post through them. The post goes
directly to the web site, which then "fakes up" an "email" which looks like
it came in via the regular email channel and sends it back out via the
normal email channel.​ But there are some which don't do this "echoing". In
those, I end up just ignoring people because it is too difficult for me to
bother with going to the site to reply.



>
> SteveT
>
> Steve Litt
> April 2017 featured book: Troubleshooting Techniques
>  of the Successful Technologist
> http://www.troubleshooters.com/techniques
>
>

-- 
"Irrigation of the land with seawater desalinated by fusion power is
ancient. It's called 'rain'." -- Michael McClary, in alt.fusion

Maranatha! <><
John McKown


Re: [GENERAL] How does hot_standby_feedback work

2017-04-05 Thread Scott Marlowe
On Wed, Apr 5, 2017 at 3:37 AM, Александр Киселев  wrote:
> Hello!
>
> My name is Alexander. I am an administarator PostgreSQL.
> I am studying PostgreSQL's 9.6 documentation.
> I am interested in parameter hot_standby_feedback in postgresql.conf
> Can you explain more detail than in documentation for what this parameter is
> used?
> In what situations the parameter can show itself?
> How to test influence of the parameter on database's behavour.

It's possible for a long running query on a slave to live past the
time the tuples it needs exist on the master. This is because the
transaction the read only query is running under cannot "lock" those
tuples against vacuum on the master. The master gets vacuumed, those
tuples go away, the streaming / wal log replication gets relayed to
the read only slave, it loses the tuples and your query fails because
it can no longer access the snapshot of those tuples it started under.

With feedback, the slave can tell the master "I'm using these tuples,
please don't delete them just yet" and the master will skip vacuuming
them up.

It's darned useful for long running queries on a slave.
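
If you want to try it out, a minimal sketch (the parameter is set on the
standby and only needs a reload, not a restart):

    # in the standby's postgresql.conf
    hot_standby_feedback = on

    -- then, in psql on the standby:
    SELECT pg_reload_conf();
    SHOW hot_standby_feedback;

    -- on the master you can watch the xmin the standby reports back:
    SELECT application_name, backend_xmin FROM pg_stat_replication;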


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Steve Litt
On Wed, 5 Apr 2017 09:31:59 -0700
Adrian Klaver  wrote:

> On 04/05/2017 09:17 AM, Magnus Hagander wrote:
> 
> >
> > This has been tried a number of times. I'ts been a couple of years
> > since I last saw one, but multiple people have set up forums,
> > either mirrored or not. They have all died because of either lack
> > of usage or because the person who did it disappeared.  
> 
> Mostly, because they did not work well and the folks on this end of
> the process had to do more work to get the information necessary to
> answer the question. I know I eventually stopped responding to the
> questions from those sources because it was difficult to follow the
> information flow. Namely you had to crawl back up to the forum to get
> information and then the email thread had mix of information that
> made it through on its own and some subset of information that
> dedicated people pulled in from the forum. That mix depended on
> dedication level and time available.

In addition, once you subscribe to a mailing list, all info comes to
you. No password necessary. Read, reply, lightning quick.

Contrast this with forums, where you have to remember to go out to each
and every forum you're interested in, put in the password, and then
operate within the work-flow of the forum.

I'm subscribed to mailing lists of 20 LUGs. Can you imagine the
inconvenience if I had to go out to each one and put in a password just
to see if there's anything new? With mailing lists, the information
comes to you, instead of making you go out to it.

SteveT

Steve Litt 
April 2017 featured book: Troubleshooting Techniques
 of the Successful Technologist
http://www.troubleshooters.com/techniques


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Magnus Hagander
On Wed, Apr 5, 2017 at 6:31 PM, Adrian Klaver 
wrote:

> On 04/05/2017 09:17 AM, Magnus Hagander wrote:
>
>
>> This has been tried a number of times. I'ts been a couple of years since
>> I last saw one, but multiple people have set up forums, either mirrored
>> or not. They have all died because of either lack of usage or because
>> the person who did it disappeared.
>>
>
> Mostly, because they did not work well and the folks on this end of the
> process had to do more work to get the information necessary to answer the
> question. I know I eventually stopped responding to the questions from
> those sources because it was difficult to follow the information flow.
> Namely you had to crawl back up to the forum to get information and then
> the email thread had mix of information that made it through on its own and
> some subset of information that dedicated people pulled in from the forum.
> That mix depended on dedication level and time available.
>
>
Most likely things like that, yes -- which indicates that there really weren't
enough people who preferred that format to actually reach critical mass.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Adrian Klaver

On 04/05/2017 09:17 AM, Magnus Hagander wrote:



This has been tried a number of times. I'ts been a couple of years since
I last saw one, but multiple people have set up forums, either mirrored
or not. They have all died because of either lack of usage or because
the person who did it disappeared.


Mostly, because they did not work well and the folks on this end of the 
process had to do more work to get the information necessary to answer 
the question. I know I eventually stopped responding to the questions 
from those sources because it was difficult to follow the information 
flow. Namely you had to crawl back up to the forum to get information 
and then the email thread had mix of information that made it through on 
its own and some subset of information that dedicated people pulled in 
from the forum. That mix depended on dedication level and time available.




Not saying it can't be tried again, but in the previous attempts it
certainly hasn't been "many users". It could be that whomever set them
up did a bad job of course, I can't judge that as I didn't personally
use them more than take a look every now and then.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Magnus Hagander
On Wed, Apr 5, 2017 at 4:14 PM, vinny  wrote:

> On 2017-04-05 15:11, Vincent Veyron wrote:
>
>> On Tue, 04 Apr 2017 12:01:24 +0200
>> vinny  wrote:
>>
>>>
>>> Every time I tell someone about the mailinglists I then have to explain
>>> how they can subscribe, how to create folders, filters etc. And more
>>> often than not
>>> they just say forget it and go to some forum.
>>>
>>
>> On forums, all you see is the header for the discussion, and the
>> number of messages attached to it.
>>
>> It makes it much more difficult to follow discussions, because you
>> don't know if there are new messages or not, unless you memorized how
>> many were there the last time you looked at it. And even then, you
>> can't tell whether you even read them previously or not, which a
>> mailing list will tell you, because the messages are marked.
>>
>
> It depends entirely on which forum software you use.
> If keeping track of read messages is a requirement then you would obviously
> use a forum that does that for you.
>
> But again, I'm not saying the mailinglist should be replaced by a forum.
> What I'm saying is that many users find forums a lot easier to use and
> give the choice,
> they will opt for the forum. Hence it makes sense to provide something for
> those users,
> if there is the manpower to do so.


This has been tried a number of times. It's been a couple of years since I
last saw one, but multiple people have set up forums, either mirrored or
not. They have all died because of either lack of usage or because the
person who did it disappeared.

Not saying it can't be tried again, but in the previous attempts it
certainly hasn't been "many users". It could be that whoever set them up
did a bad job of course, I can't judge that as I didn't personally use them
more than take a look every now and then.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [GENERAL] Advise on primary key for detail tables (OS: Raspberry Pi)

2017-04-05 Thread Adrian Klaver

On 04/05/2017 08:04 AM, Ertan Küçükoğlu wrote:

Hello,

I have a project which will be mainly built on Raspberry Pi and some parts
on Windows.

I must have a PostgreSQL 9.4.10 running on Raspberry Pi and another
PostgreSQL running on Windows. Though, there is still a possibility that
Windows database server will be something else that is not known to me, yet.
Since Raspberry Pi is running on a SD Card, data saved on Raspberry Pi will
be copied over to Windows database system for a proper backup & disaster
recovery.

I need to keep database server overhead as low as possible on Raspberry Pi
system. That is because software that will be a running is going to do some
time essential sensor communication.

I am about to start table designs on Raspberry Pi. There is one
master-detail-detail-detail structure I should implement. Master having
serial, uuid and some varchar fields. Uuid field being primary key. Details
have serial, uuid and some smallint fields.


So what is the serial column in the master table for?



I recall that it is "generally" advised to have a primary key on any table
used on a database server.



What is advised is to have some way of determining uniqueness for a row. 
A PK is the simplest way of doing that; also, many ORMs will not work 
without one. Now a PK can be a single value such as the serial column in 
your details tables or it can be over multiple columns that determine 
uniqueness. Again you have to be aware of what the application/interface 
that is using the tables is capable of. In the case of ORMs, they often 
do not understand multi-column PKs. This is why PKs on an 
auto-incrementing (serial) integer are often recommended.
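
A small sketch of the two styles (names invented for illustration):

    -- surrogate key: simple, and what most ORMs expect
    CREATE TABLE detail_a (
        id        serial PRIMARY KEY,
        master_id uuid NOT NULL,
        value     smallint
    );

    -- natural multi-column key: no extra column, but some ORMs choke on it
    CREATE TABLE detail_b (
        master_id uuid    NOT NULL,
        seq       integer NOT NULL,
        value     smallint,
        PRIMARY KEY (master_id, seq)
    );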



My question is: Is reading performance will be faster, if I remove primary
key on serial fields of detail tables and use a regular index put on master
table link fields only? In another words, is it advisable *not* to have a
primary key on PostgreSQL table?

If answer changes according to OS underlying, I appreciate replies indicates
so.

Thanks & regards,
Ertan Küçükoğlu






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] expensive function in select list vs limit clause

2017-04-05 Thread Chris Mair




ORDER BY can only be processed after all rows have been fetched, this
includes the expensive result column.

You can easily avoid that by applying the LIMIT first:

  SELECT r, expensive()
  FROM (SELECT r
        FROM big
        ORDER BY r
        LIMIT 10
       ) AS sub;

I don't know how hard it would be to only fetch the necessary columns before
the ORDER BY and fetch the others after the LIMIT has been applied, but it
is probably nontrivial and would require processing time for *everybody*
who runs a query with ORDER BY to solve a rare problem that can easily be
worked around.


Hi,

Tom Lane just pointed out that 9.6 is able to optimise this (at least
the synthetic example).

Anyway, my real problem could be beautifully improved by subselect-trick!

Thanks a lot!

Bye,
Chris.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] expensive function in select list vs limit clause

2017-04-05 Thread Chris Mair



https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9118d03a8


Hi,

thanks!

I've just tested with 9.6 and the test runs fast with or without expensive().

So the above patch does indeed improve this case a lot!

Bye,
Chris.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Advise on primary key for detail tables (OS: Raspberry Pi)

2017-04-05 Thread Ertan Küçükoğlu
Hello,

I have a project which will be mainly built on Raspberry Pi and some parts
on Windows.

I must have a PostgreSQL 9.4.10 running on Raspberry Pi and another
PostgreSQL running on Windows. Though, there is still a possibility that
Windows database server will be something else that is not known to me, yet.
Since Raspberry Pi is running on a SD Card, data saved on Raspberry Pi will
be copied over to Windows database system for a proper backup & disaster
recovery.

I need to keep database server overhead as low as possible on Raspberry Pi
system. That is because software that will be a running is going to do some
time essential sensor communication.

I am about to start table designs on Raspberry Pi. There is one
master-detail-detail-detail structure I should implement. Master having
serial, uuid and some varchar fields. Uuid field being primary key. Details
have serial, uuid and some smallint fields.

I recall that it is "generally" advised to have a primary key on any table
used on a database server.

My question is: Is reading performance will be faster, if I remove primary
key on serial fields of detail tables and use a regular index put on master
table link fields only? In another words, is it advisable *not* to have a
primary key on PostgreSQL table?

If answer changes according to OS underlying, I appreciate replies indicates
so.

Thanks & regards,
Ertan Küçükoğlu



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Adrian Klaver

On 04/05/2017 07:14 AM, vinny wrote:

On 2017-04-05 15:11, Vincent Veyron wrote:

On Tue, 04 Apr 2017 12:01:24 +0200
vinny  wrote:


Every time I tell someone about the mailinglists I then have to explain
how they can subscribe, how to create folders, filters etc. And more
often than not
they just say forget it and go to some forum.


On forums, all you see is the header for the discussion, and the
number of messages attached to it.

It makes it much more difficult to follow discussions, because you
don't know if there are new messages or not, unless you memorized how
many were there the last time you looked at it. And even then, you
can't tell whether you even read them previously or not, which a
mailing list will tell you, because the messages are marked.


It depends entirely on which forum software you use.
If keeping track of read messages is a requirement then you would obviously
use a forum that does that for you.

But again, I'm not saying the mailinglist should be replaced by a forum.
What I'm saying is that many users find forums a lot easier to use and
give the choice,
they will opt for the forum. Hence it makes sense to provide something
for those users,
if there is the manpower to do so.


Can you expect Joe
Average to do something like that
if they want to get more involved in PgSQL?



How hard is it to subscribe, create a folder and a filter? If that is
too involved, I don't see how they can get involved in postgres
anyway.


That might be true if you are talking about contributors, sure, but
we're not.
Or at least, I'm not, and I guess that's where I'm mistaking. Perhaps
the mailinglists
are the way they are to encourage the more serious users to use them,
and keep everyday questions out a little.


Everyday questions appear here all the time, so that is not the 
motivation. It is more that Postgres is a complex piece of software and 
questions/answers work better in the mailing list work flow than in a 
forum. The added benefit is that the list is a tremendous resource and 
you will find many of your questions answered by just monitoring the 
list. There are days that the --general list is really busy, but most times 
it is very manageable, especially if you apply the eyeball filter to 
topics:) If someone wants to just drop in for a single question there is 
the IRC channel:


https://www.postgresql.org/community/irc/

or Stackoverflow:

http://stackoverflow.com/questions/tagged/postgresql



That would be fine too, but don't put it like "if you this is too much
work, you shouldn't be using postgresql".





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] expensive function in select list vs limit clause

2017-04-05 Thread Tom Lane
Chris Mair  writes:
>  From the timings it appears that in the second explain analyze query a 
> function
> call in the select list (expensive()) is evaluated in the sequential scan node
> *for each* row in big, despite the use of limit.

According to the SQL standard, the SELECT list is evaluated before ORDER
BY, so if you need an explicit sort step the function is going to get
calculated first.  This is obviously necessary if the function is used
as the sort key, but otherwise it's possible to be smarter.  We were not
smarter before 9.6 though.  You might find this commit message informative:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9118d03a8

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread vinny

On 2017-04-05 15:11, Vincent Veyron wrote:

On Tue, 04 Apr 2017 12:01:24 +0200
vinny  wrote:


Every time I tell someone about the mailinglists I then have to 
explain

how they can subscribe, how to create folders, filters etc. And more
often than not
they just say forget it and go to some forum.


On forums, all you see is the header for the discussion, and the
number of messages attached to it.

It makes it much more difficult to follow discussions, because you
don't know if there are new messages or not, unless you memorized how
many were there the last time you looked at it. And even then, you
can't tell whether you even read them previously or not, which a
mailing list will tell you, because the messages are marked.


It depends entirely on which forum software you use.
If keeping track of read messages is a requirement then you would 
obviously

use a forum that does that for you.

But again, I'm not saying the mailinglist should be replaced by a forum.
What I'm saying is that many users find forums a lot easier to use and 
give the choice,
they will opt for the forum. Hence it makes sense to provide something 
for those users,

if there is the manpower to do so.


Can you expect Joe
Average to do something like that
if they want to get more involved in PgSQL?



How hard is it to subscribe, create a folder and a filter? If that is
too involved, I don't see how they can get involved in postgres
anyway.


That might be true if you are talking about contributors, sure, but 
we're not.
Or at least, I'm not, and I guess that's where I'm mistaken. Perhaps 
the mailinglists

are the way they are to encourage the more serious users to use them,
and keep everyday questions out a little.

That would be fine too, but don't put it like "if this is too much 
work, you shouldn't be using postgresql".



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Tom Lane
Daniel Westermann  writes:
> Thank you, Merlin. As said I know that "not in" is not a good choice in this 
> case but I still do not get what is going here. Why does the server 
> repeatedly search for NULL values when I decrease work_mem and why not when 
> increasing work_mem? 

The core point is that one plan is using a hashed subplan and the other is
not, because the planner estimated that the hashtable wouldn't fit into
work_mem.  With a hashtable you'll have one probe into the hashtable per
outer row, and each probe is O(1) unless you are unlucky about data
distributions, so the runtime is more or less linear.  Without a
hashtable, the inner table is rescanned for each outer row, so the
runtime is O(N^2) which gets pretty bad pretty fast.  "Materializing"
the inner table doesn't really help: it gets rid of per-inner-row
visibility checks and some buffer locking overhead, so it cuts the
constant factor some, but the big-O situation is still disastrous.
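
For the record, the usual way around it is the NOT EXISTS form, which the
planner can execute as an anti-join whose runtime stays roughly linear
whatever work_mem is.  A sketch using the table names from this thread
(note the semantics differ from NOT IN when ids.id can be NULL, which is
part of why NOT IN is hard to plan):

    SELECT count(user_id)
    FROM users u
    WHERE NOT EXISTS (SELECT 1 FROM ids i WHERE i.id = u.user_id);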

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] browser interface to forums please?

2017-04-05 Thread Vincent Veyron
On Tue, 04 Apr 2017 12:01:24 +0200
vinny  wrote:
> 
> Every time I tell someone about the mailinglists I then have to explain
> how they can subscribe, how to create folders, filters etc. And more 
> often than not
> they just say forget it and go to some forum.

On forums, all you see is the header for the discussion, and the number of 
messages attached to it. 

It makes it much more difficult to follow discussions, because you don't know 
if there are new messages or not, unless you memorized how many were there the 
last time you looked at it. And even then, you can't tell whether you even read 
them previously or not, which a mailing list will tell you, because the 
messages are marked.

> Can you expect Joe 
> Average to do something like that
> if they want to get more involved in PgSQL?
> 

How hard is it to subscribe, create a folder and a filter? If that is too 
involved, I don't see how they can get involved in postgres anyway.


-- 
Bien à vous, Vincent Veyron 

https://marica.fr
Logiciel de gestion des contentieux juridiques, des contrats et des sinistres 
d'assurance


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Keycloak and Postgres

2017-04-05 Thread Bill Moran
On Wed, 5 Apr 2017 07:24:32 +
Marc Tempelmeier  wrote:
> 
> Can you elaborate a bit on this part:
> " Because of how Postgres caches changes, you may find that a failover 
> requires some time in recovery mode."

https://www.postgresql.org/docs/9.6/static/wal-intro.html

The WAL requires that any unexpected shutdown of Postgres (where it doesn't get 
to explicitly
flush data pages to disk) go through a recovery cycle to fix anything in the 
WAL that is not
yet in the data pages.

Doing disk level replication and using that as a failover essentially 
duplicates a crash
on the PostgreSQL end when you failover.

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Unexpected interval comparison

2017-04-05 Thread Kyotaro HORIGUCHI
Mmm. It's shameful.

At Tue, 04 Apr 2017 18:06:38 -0400, Tom Lane  wrote in 
<5084.1491343...@sss.pgh.pa.us>
> Kyotaro HORIGUCHI  writes:
> > The first attached is the revised patch and the second is
> > temporary sanity check code for non-128bit environment code. (but
> > works only on 128 bit environment)
> 
> This seemed to me to be probably even less correct, so I extracted
> the addition and multiplication logic into a standalone test program
> (attached), which compares the result of a multiplication to that
> of native int128 arithmetic.  I changed the order of the LinearInterval
> fields to be LS-first so that I could overlay them onto an int128
> result (on a little-endian machine); this is just for testing purposes
> not something we must do in the finished code.  I soon found cases
> where it indeed fails, eg
> 
> $ ./a.out 0x7 0x7
> 7 * 7
> result = 62 18446744004990074881
> result = 3E FFF1
> MISMATCH!
> result = 63 18446744004990074881
> result = 3F FFF1

I admit that I was careless about that.

> After fooling with it for awhile, I decided that the cause of the
> problems was basically not thinking carefully about the lower half
> of the value being unsigned: that affects when to do carries in
> the addition macro, and we also have to be careful about whether
> or not to sign-extend the partial product terms.  The second
> attached file is a version that I can't break anymore, though I'm
> not quite sure it's bug-free.

In the first version, I converted all operands to positive
numbers and finally corrected the sign of the result. This was
straightforward but didn't look clean, so I tried direct
handling of signed values. It was the beginning of this mess.

There was a time when the lower bits were in uint64, but I was
confused by arithmetic mixing signed and unsigned. I realized
that I had misunderstood composing a 64 bit value from the
decomposition of a negative int64.


I reconsidered this based on Tom's test program and try to give
reasoning for the algorithm of the attached new version.

1. Now INT64_AL32 returns a value explicitly cast to uint64 (for
  safety).  The upper and lower 32 bit values surely make the
  original value just by INT64_AU32 << 32 + INT64_AL32 because

  1.1. The arithmetic is done assuming that both operands of the
 addition are in signed int64, but the MSB of the right
 operand is always 0, so there is no difference reading it as
 signed.

- As mentioned in added comment, all terms (stored in the
  variable tmp) are the products of two signed/unsigned 32-bit
  values expanded to signed int64 variables. This is safe.

- The second and third terms should be left-shifted by 32 bit in
  virtually-128-bit storage then added to the existing 128 bit
  value. This can be said as adding INT128_AU64(tmp<<32) into hi
  part. If INT128_AU64(tmp<<32) is equivalent to
  INT64_AU32(tmp)>>32, "span.hi += INT64_AU32(tmp)" is safe.

   INT128_AU64(tmp << 32)  ; tmp is assumed signed int128 here
= ((tmp << 32) >> 64)
= tmp >>32
= INT64_AU32(tmp)  ; here, tmp is safe even if signed int64

  Similary,

   INT128_AL64(tmp << 32)   
= (uint128)(tmp << 32) & 0xFFFFFFFF_FFFFFFFF  (lower 32 bits are always 0)
= ((uint64)(tmp) & 0x) << 32
= INT64_AL32(tmp) << 32

- The last thing I should confirm is that
  LINEARINTERVAL_ADD_UINT64 is correct. This is analogous to 1
  above.

(int128)x + (uint64)y
  = (int128)x + (int128)y   ; safely extended without change
  =  (INT128AU64(x) << 64 + INT128AL64(x))
   + (INT128AU64(y) << 64 + INT128AL64(y))
  =  (INT128AU64(x) + INT128AU64(y)) << 64  ; performed as signed
   + (INT128AL64(x) + INT128AL64(y)); performed as unsigned

  Where (INT128AL64(x) + INT128AL64(y)) is performed as unsigned
  and can be overflow, and the carry can be just pushed to the
  upper 64bit.

  2. Adding two values with MSB of 0 doesn't overflow.

  3. If at least one of the two has MSB of 1, it can be overflow.

  3.1. Both have MSB of 1, it must overflow.

  3.2. If one of the two has MSB of 1, the maximum overflowed
 value is made by 0xFFF...FF + 0x7FF...FF and result is
 0x(1)7FF..FE so "MSB of the result is 0" and "overflowed" is
 equivalent for the case.

In addition to all of the above, dayfraction should be positive so
that LINEARINTERVAL_ADD_UINT64 can be used.

The attached patch is the revised version.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 2461ef9b3711e5e72368deb18ad92a5989be4f19 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 5 Apr 2017 20:01:01 +0900
Subject: [PATCH] Fix overflow during interval comparison.

The values in interval are compared by TimeOffset results of
interval_cmp_value but it is so narrow that it overflows quite easily.
This patch widens the output of the function to 128 bits. For platforms
without 128 bit arithmetic, 

Re: [GENERAL] expensive function in select list vs limit clause

2017-04-05 Thread Albe Laurenz
Chris Mair wrote:
> I've found a (simple) situation where the planner does something I don't 
> understand.
> 
> Below is a complete test case followed by output.
> 
>  From the timings it appears that in the second explain analyze query a 
> function
> call in the select list (expensive()) is evaluated in the sequential scan node
> *for each* row in big, despite the use of limit.
> 
> I would have expected expensive() to be evaluated only for the ten rows
> in the result set. Hence the second explain analyze query shouldn't be more
> expensive than the first one.
> 
> My trust in Postgres' planner goes so far as I feel the planner is right and 
> there
> must be a reason for this :)
> 
> Could someone help me understand this behaviour?
[...]
> create function expensive() returns double precision as
> $$
>  begin
>  for i in 1 .. 15000 loop
>  end loop;
>  return random();
>  end;
> $$ language 'plpgsql';

This is unrelated, but you should set COST for an expensive function
to help the planner.
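
For example (the number is only illustrative; the default for a PL/pgSQL
function is COST 100, in units of cpu_operator_cost):

    ALTER FUNCTION expensive() COST 10000;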

[...]
> -- now do the same, but add an expensive() column to the result:
> -- takes ~ 29s => WHY?
> 
> explain analyze select r, expensive() from big order by r offset 0 limit 10;
[...]
>QUERY PLAN
> --
> 
>   Limit  (cost=286034.64..286034.67 rows=10 width=8) (actual 
> time=28932.311..28932.314
> rows=10 loops=1)
> ->  Sort  (cost=286034.64..288534.64 rows=100 width=8) (actual
> time=28932.309..28932.310 rows=10 loops=1)
>   Sort Key: r
>   Sort Method: top-N heapsort  Memory: 25kB
>   ->  Seq Scan on big  (cost=0.00..264425.00 rows=100 width=8) 
> (actual
> time=0.062..28822.520 rows=100 loops=1)
>   Planning time: 0.038 ms
>   Execution time: 28932.339 ms
> (7 rows)

ORDER BY can only be processed after all rows have been fetched, this
includes the expensive result column.

You can easily avoid that by applying the LIMIT first:

  SELECT r, expensive()
  FROM (SELECT r
        FROM big
        ORDER BY r
        LIMIT 10
       ) AS sub;

I don't know how hard it would be to only fetch the necessary columns before
the ORDER BY and fetch the others after the LIMIT has been applied, but it
is probably nontrivial and would require processing time for *everybody*
who runs a query with ORDER BY to solve a rare problem that can easily be
worked around.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [ADMIN] calculating table and index size

2017-04-05 Thread Steven Chang
Hello,

try pgstattuple() and pgstatindex(), I think you will figure it out.
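
For example (a sketch -- it needs the contrib extension installed, and the
object names below are placeholders):

    CREATE EXTENSION IF NOT EXISTS pgstattuple;
    SELECT * FROM pgstattuple('my_table');   -- live/dead tuples, free space
    SELECT * FROM pgstatindex('my_index');   -- btree depth, leaf pages, fragmentation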

Steven

2017-04-05 16:56 GMT+08:00 Guillaume Lelarge :

> Hi,
>
> 2017-04-05 9:44 GMT+02:00 Günce Kaya :
>
>> Hi all,
>>
>> I have some questions about calculating table and index size.
>>
>> I have a dummy table which has an integer column and its index. The table
>> has 140 rows and all of rows are same thats value is 2000. Table
>> size is 50MB and index size is 31MB. Why there is too much size difference
>> between table and its index? what happen on data files when we add index?
>>
>>
> You have metadata informations in the table datafiles that you don't have
> on the index datafiles. For example, all the system columns for each line.
>
>
>> Second question is that after created table, table size was 0 byte. I
>> inserted a row as 120 then table size was 8192 byte. I inserted five times
>> same value to the table and table size is still 8192 bytes. Table size
>> changed after inserted lots of rows. Table size was stabile till first few
>> hundred rows. why table size didn't change when I inserted lots of rows?
>>
>>
> PostgreSQL works with 8KB blocks. When you insert a line, it puts it on a
> block, but this block may contain many lines. So your next new lines still
> fit in the first block... until it doesn't, and you'll see a new block
> coming, making your table datafile grows to 16KB. And so on and so on.
>
>
> --
> Guillaume.
>   http://blog.guillaume.lelarge.info
>   http://www.dalibo.com
>


[GENERAL] How does hot_standby_feedback work

2017-04-05 Thread Александр Киселев
Hello!

My name is Alexander. I am a PostgreSQL administrator.
I am studying PostgreSQL's 9.6 documentation.
I am interested in the parameter hot_standby_feedback in postgresql.conf.
Can you explain in more detail than the documentation what this parameter is used for?
In what situations can the parameter show itself?
How can I test the influence of the parameter on the database's behaviour?

Regards, Alexander



Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
2017-04-05 10:33 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 



2017-04-05 10:13 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 


2017-04-05 9:28 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 


>>what is result of EXPLAIN statement for slow and fast cases? 
>> 
>>regards 
>> 
>>Pavel 

For work_mem=32MB 

explain (analyze,verbose,buffers) select count(user_id) from users where 
user_id not in ( select id from ids); 
QUERY PLAN 
--
 
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual 
time=7945.592..7945.593 rows=1 loops=1) 
Output: count(users.user_id) 
Buffers: shared read=29425 
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=150 width=4) 
(actual time=1928.665..5888.645 rows=249 loops=1) 
Output: users.user_id, users.username 
Filter: (NOT (hashed SubPlan 1)) 
Rows Removed by Filter: 51 
Buffers: shared read=29425 
SubPlan 1 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) (actual 
time=11.111..868.382 rows=101 loops=1) 
Output: ids.id 
Buffers: shared read=4425 
Planning time: 187.396 ms 
Execution time: 7948.108 ms 
(14 rows) 

Time: 8244.493 ms 

For work_mem='16MB' it does not complete with analyze in on hour. For explain 
only: 

explain (verbose) select count(user_id) from users where user_id not in ( 
select id from ids); 
QUERY PLAN 

 
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8) 
Output: count(users.user_id) 
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=150 width=4) 
Output: users.user_id, users.username 
Filter: (NOT (SubPlan 1)) 
SubPlan 1 
-> Materialize (cost=0.00..23332.01 rows=101 width=4) 
Output: ids.id 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) 
Output: ids.id 
(10 rows) 




>There is a materialize op more 
> 
>do you have a index on ids.id ? 

Yes: 

\d ids 
Table "public.ids" 
Column | Type | Modifiers 
+-+--- 
id | integer | 
Indexes: 
"i_ids" UNIQUE, btree (id) 





>>hmm .. NOT IN is just bad :( 
>> 
>>The second is slow becase table ids is stored in temp file. and it is 
>>repeatedly read from file. In first case, ids table is stored in memory. 
>> 
>>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids 
>>where id = user_id) 
>> 

Yes, really bad :) ... and I still do not get it. Even when reading from the 
tempfile all the time it should at least complete within one hour, shouldn't 
it? The tables are not so big: 

select * from pg_size_pretty ( pg_relation_size ('ids' )); 
pg_size_pretty 
 
35 MB 
(1 row) 
select * from pg_size_pretty ( pg_relation_size ('users' )); 
pg_size_pretty 
 
195 MB 
(1 row) 



>> 150 * few ms ~ big time 

Ok got it 
Thanks 
Pavel 



Re: [GENERAL] [ADMIN] calculating table and index size

2017-04-05 Thread Guillaume Lelarge
Hi,

2017-04-05 9:44 GMT+02:00 Günce Kaya :

> Hi all,
>
> I have some questions about calculating table and index size.
>
> I have a dummy table which has an integer column and its index. The table
> has 140 rows and all of rows are same thats value is 2000. Table
> size is 50MB and index size is 31MB. Why there is too much size difference
> between table and its index? what happen on data files when we add index?
>
>
You have metadata information in the table datafiles that you don't have
on the index datafiles. For example, all the system columns for each line.


> Second question is that after created table, table size was 0 byte. I
> inserted a row as 120 then table size was 8192 byte. I inserted five times
> same value to the table and table size is still 8192 bytes. Table size
> changed after inserted lots of rows. Table size was stabile till first few
> hundred rows. why table size didn't change when I inserted lots of rows?
>
>
PostgreSQL works with 8KB blocks. When you insert a line, it puts it on a
block, but this block may contain many lines. So your next new lines still
fit in the first block... until it doesn't, and you'll see a new block
coming, making your table datafile grow to 16KB. And so on and so on.
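
You can watch this happen for yourself (a tiny sketch, table name made up):

    CREATE TABLE demo (v integer);
    SELECT pg_relation_size('demo');   -- 0: no block allocated yet
    INSERT INTO demo VALUES (120);
    SELECT pg_relation_size('demo');   -- 8192: one 8KB block
    INSERT INTO demo SELECT 120 FROM generate_series(1, 5);
    SELECT pg_relation_size('demo');   -- still 8192: the new rows fit in the same block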


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Pavel Stehule
2017-04-05 10:33 GMT+02:00 Daniel Westermann <
daniel.westerm...@dbi-services.com>:

> 2017-04-05 10:13 GMT+02:00 Daniel Westermann  services.com>:
>
>> 2017-04-05 9:28 GMT+02:00 Daniel Westermann > services.com>:
>>
>>> >>what is result of EXPLAIN statement for slow and fast cases?
>>> >>
>>> >>regards
>>> >>
>>> >>Pavel
>>>
>>> For work_mem=32MB
>>>
>>> explain (analyze,verbose,buffers)  select count(user_id) from users
>>> where user_id not in ( select id from ids);
>>>   QUERY
>>> PLAN
>>> 
>>> 
>>> --
>>>  Aggregate  (cost=83175.01..83175.02 rows=1 width=8) (actual
>>> time=7945.592..7945.593 rows=1 loops=1)
>>>Output: count(users.user_id)
>>>Buffers: shared read=29425
>>>->  Seq Scan on public.users  (cost=16925.01..79425.01 rows=150
>>> width=4) (actual time=1928.665..5888.645 rows=249 loops=1)
>>>  Output: users.user_id, users.username
>>>  Filter: (NOT (hashed SubPlan 1))
>>>  Rows Removed by Filter: 51
>>>  Buffers: shared read=29425
>>>  SubPlan 1
>>>->  Seq Scan on public.ids  (cost=0.00..14425.01 rows=101
>>> width=4) (actual time=11.111..868.382 rows=101 loops=1)
>>>  Output: ids.id
>>>  Buffers: shared read=4425
>>>  Planning time: 187.396 ms
>>>  Execution time: 7948.108 ms
>>> (14 rows)
>>>
>>> Time: 8244.493 ms
>>>
>>> For work_mem='16MB' it does not complete with analyze in on hour. For
>>> explain only:
>>>
>>> explain (verbose)  select count(user_id) from users where user_id not in
>>> ( select id from ids);
>>>QUERY
>>> PLAN
>>> 
>>> 
>>>  Aggregate  (cost=38748092500.00..38748092500.01 rows=1 width=8)
>>>Output: count(users.user_id)
>>>->  Seq Scan on public.users  (cost=0.00..38748088750.00 rows=150
>>> width=4)
>>>  Output: users.user_id, users.username
>>>  Filter: (NOT (SubPlan 1))
>>>  SubPlan 1
>>>->  Materialize  (cost=0.00..23332.01 rows=101 width=4)
>>>  Output: ids.id
>>>  ->  Seq Scan on public.ids  (cost=0.00..14425.01
>>> rows=101 width=4)
>>>Output: ids.id
>>> (10 rows)
>>>
>>
>> >There is a materialize op more
>> >
>> >do you have a index on ids.id?
>>
>> Yes:
>>
>> \d ids
>>   Table "public.ids"
>>  Column |  Type   | Modifiers
>> +-+---
>>  id | integer |
>> Indexes:
>> "i_ids" UNIQUE, btree (id)
>>
>>
>
> >>hmm .. NOT IN is just bad :(
> >>
> >>The second is slow becase table ids is stored in temp file. and it is
> repeatedly read from file. In first case, ids table is stored in memory.
> >>
> >>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from
> ids where id = user_id)
> >>
>
> Yes, really bad :) ... and I still do not get it. Even when reading from
> the tempfile all the time it should at least complete within one hour,
> shouldn't it? The tables are not so big:
>
> select * from pg_size_pretty ( pg_relation_size ('ids' ));
>  pg_size_pretty
> 
>  35 MB
> (1 row)
> select * from pg_size_pretty ( pg_relation_size ('users' ));
>  pg_size_pretty
> 
>  195 MB
> (1 row)
>
>

150 * few ms ~ big time


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
2017-04-05 10:13 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 



2017-04-05 9:28 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 


>>what is result of EXPLAIN statement for slow and fast cases? 
>> 
>>regards 
>> 
>>Pavel 

For work_mem=32MB 

explain (analyze,verbose,buffers) select count(user_id) from users where 
user_id not in ( select id from ids); 
QUERY PLAN 
--
 
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual 
time=7945.592..7945.593 rows=1 loops=1) 
Output: count(users.user_id) 
Buffers: shared read=29425 
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=150 width=4) 
(actual time=1928.665..5888.645 rows=249 loops=1) 
Output: users.user_id, users.username 
Filter: (NOT (hashed SubPlan 1)) 
Rows Removed by Filter: 51 
Buffers: shared read=29425 
SubPlan 1 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) (actual 
time=11.111..868.382 rows=101 loops=1) 
Output: ids.id 
Buffers: shared read=4425 
Planning time: 187.396 ms 
Execution time: 7948.108 ms 
(14 rows) 

Time: 8244.493 ms 

For work_mem='16MB' it does not complete with analyze in on hour. For explain 
only: 

explain (verbose) select count(user_id) from users where user_id not in ( 
select id from ids); 
QUERY PLAN 

 
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8) 
Output: count(users.user_id) 
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=150 width=4) 
Output: users.user_id, users.username 
Filter: (NOT (SubPlan 1)) 
SubPlan 1 
-> Materialize (cost=0.00..23332.01 rows=101 width=4) 
Output: ids.id 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) 
Output: ids.id 
(10 rows) 




>There is a materialize op more 
> 
>do you have a index on ids.id ? 

Yes: 

\d ids 
Table "public.ids" 
Column | Type | Modifiers 
+-+--- 
id | integer | 
Indexes: 
"i_ids" UNIQUE, btree (id) 





>>hmm .. NOT IN is just bad :( 
>> 
>>The second is slow becase table ids is stored in temp file. and it is 
>>repeatedly read from file. In first case, ids table is stored in memory. 
>> 
>>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids 
>>where id = user_id) 
>> 

Yes, really bad :) ... and I still do not get it. Even when reading from the 
tempfile all the time it should at least complete within one hour, shouldn't 
it? The tables are not so big: 

select * from pg_size_pretty ( pg_relation_size ('ids' )); 
pg_size_pretty 
 
35 MB 
(1 row) 
select * from pg_size_pretty ( pg_relation_size ('users' )); 
pg_size_pretty 
 
195 MB 
(1 row) 




Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Pavel Stehule
2017-04-05 10:13 GMT+02:00 Daniel Westermann <
daniel.westerm...@dbi-services.com>:

> 2017-04-05 9:28 GMT+02:00 Daniel Westermann  services.com>:
>
>> >>what is result of EXPLAIN statement for slow and fast cases?
>> >>
>> >>regards
>> >>
>> >>Pavel
>>
>> For work_mem=32MB
>>
>> explain (analyze,verbose,buffers)  select count(user_id) from users where
>> user_id not in ( select id from ids);
>>   QUERY
>> PLAN
>> 
>> 
>> --
>>  Aggregate  (cost=83175.01..83175.02 rows=1 width=8) (actual
>> time=7945.592..7945.593 rows=1 loops=1)
>>Output: count(users.user_id)
>>Buffers: shared read=29425
>>->  Seq Scan on public.users  (cost=16925.01..79425.01 rows=150
>> width=4) (actual time=1928.665..5888.645 rows=249 loops=1)
>>  Output: users.user_id, users.username
>>  Filter: (NOT (hashed SubPlan 1))
>>  Rows Removed by Filter: 51
>>  Buffers: shared read=29425
>>  SubPlan 1
>>->  Seq Scan on public.ids  (cost=0.00..14425.01 rows=101
>> width=4) (actual time=11.111..868.382 rows=101 loops=1)
>>  Output: ids.id
>>  Buffers: shared read=4425
>>  Planning time: 187.396 ms
>>  Execution time: 7948.108 ms
>> (14 rows)
>>
>> Time: 8244.493 ms
>>
>> For work_mem='16MB' it does not complete with analyze in on hour. For
>> explain only:
>>
>> explain (verbose)  select count(user_id) from users where user_id not in
>> ( select id from ids);
>>QUERY
>> PLAN
>> 
>> 
>>  Aggregate  (cost=38748092500.00..38748092500.01 rows=1 width=8)
>>Output: count(users.user_id)
>>->  Seq Scan on public.users  (cost=0.00..38748088750.00 rows=150
>> width=4)
>>  Output: users.user_id, users.username
>>  Filter: (NOT (SubPlan 1))
>>  SubPlan 1
>>->  Materialize  (cost=0.00..23332.01 rows=101 width=4)
>>  Output: ids.id
>>  ->  Seq Scan on public.ids  (cost=0.00..14425.01
>> rows=101 width=4)
>>Output: ids.id
>> (10 rows)
>>
>
> >There is a materialize op more
> >
> >do you have a index on ids.id?
>
> Yes:
>
> \d ids
>   Table "public.ids"
>  Column |  Type   | Modifiers
> +-+---
>  id | integer |
> Indexes:
> "i_ids" UNIQUE, btree (id)
>
>

hmm .. NOT IN is just bad :(

The second is slow because the ids table is stored in a temp file, and it is
repeatedly read from the file. In the first case, the ids table is stored in memory.

SELECT count(user_id) FROM users WHERE NOT EXISTS (SELECT 1 FROM ids
WHERE ids.id = users.user_id)

Regards

Pavel


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
2017-04-05 9:28 GMT+02:00 Daniel Westermann < 
daniel.westerm...@dbi-services.com > : 



>>what is result of EXPLAIN statement for slow and fast cases? 
>> 
>>regards 
>> 
>>Pavel 

For work_mem=32MB 

explain (analyze,verbose,buffers) select count(user_id) from users where 
user_id not in ( select id from ids); 
QUERY PLAN 
--
 
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual 
time=7945.592..7945.593 rows=1 loops=1) 
Output: count(users.user_id) 
Buffers: shared read=29425 
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=150 width=4) 
(actual time=1928.665..5888.645 rows=249 loops=1) 
Output: users.user_id, users.username 
Filter: (NOT (hashed SubPlan 1)) 
Rows Removed by Filter: 51 
Buffers: shared read=29425 
SubPlan 1 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) (actual 
time=11.111..868.382 rows=101 loops=1) 
Output: ids.id 
Buffers: shared read=4425 
Planning time: 187.396 ms 
Execution time: 7948.108 ms 
(14 rows) 

Time: 8244.493 ms 

For work_mem='16MB' it does not complete with analyze in on hour. For explain 
only: 

explain (verbose) select count(user_id) from users where user_id not in ( 
select id from ids); 
QUERY PLAN 

 
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8) 
Output: count(users.user_id) 
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=150 width=4) 
Output: users.user_id, users.username 
Filter: (NOT (SubPlan 1)) 
SubPlan 1 
-> Materialize (cost=0.00..23332.01 rows=101 width=4) 
Output: ids.id 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) 
Output: ids.id 
(10 rows) 




>There is a materialize op more 
> 
>do you have a index on ids.id ? 

Yes: 

\d ids 
Table "public.ids" 
Column | Type | Modifiers 
+-+--- 
id | integer | 
Indexes: 
"i_ids" UNIQUE, btree (id) 



Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Pavel Stehule
2017-04-05 9:28 GMT+02:00 Daniel Westermann <
daniel.westerm...@dbi-services.com>:

> >>what is result of EXPLAIN statement for slow and fast cases?
> >>
> >>regards
> >>
> >>Pavel
>
> For work_mem=32MB
>
> explain (analyze,verbose,buffers)  select count(user_id) from users where
> user_id not in ( select id from ids);
>   QUERY
> PLAN
> 
> --
>  Aggregate  (cost=83175.01..83175.02 rows=1 width=8) (actual
> time=7945.592..7945.593 rows=1 loops=1)
>Output: count(users.user_id)
>Buffers: shared read=29425
>->  Seq Scan on public.users  (cost=16925.01..79425.01 rows=150
> width=4) (actual time=1928.665..5888.645 rows=249 loops=1)
>  Output: users.user_id, users.username
>  Filter: (NOT (hashed SubPlan 1))
>  Rows Removed by Filter: 51
>  Buffers: shared read=29425
>  SubPlan 1
>->  Seq Scan on public.ids  (cost=0.00..14425.01 rows=101
> width=4) (actual time=11.111..868.382 rows=101 loops=1)
>  Output: ids.id
>  Buffers: shared read=4425
>  Planning time: 187.396 ms
>  Execution time: 7948.108 ms
> (14 rows)
>
> Time: 8244.493 ms
>
> For work_mem='16MB' it does not complete with analyze in on hour. For
> explain only:
>
> explain (verbose)  select count(user_id) from users where user_id not in (
> select id from ids);
>QUERY
> PLAN
> 
> 
>  Aggregate  (cost=38748092500.00..38748092500.01 rows=1 width=8)
>Output: count(users.user_id)
>->  Seq Scan on public.users  (cost=0.00..38748088750.00 rows=150
> width=4)
>  Output: users.user_id, users.username
>  Filter: (NOT (SubPlan 1))
>  SubPlan 1
>->  Materialize  (cost=0.00..23332.01 rows=101 width=4)
>  Output: ids.id
>  ->  Seq Scan on public.ids  (cost=0.00..14425.01
> rows=101 width=4)
>Output: ids.id
> (10 rows)
>

There is one more materialize op.

do you have an index on ids.id?

Pavel


[GENERAL] expensive function in select list vs limit clause

2017-04-05 Thread Chris Mair

Hi,

I've found a (simple) situation where the planner does something I don't 
understand.

Below is a complete test case followed by output.

From the timings it appears that in the second explain analyze query a function
call in the select list (expensive()) is evaluated in the sequential scan node
*for each* row in big, despite the use of limit.

I would have expected expensive() to be evaluated only for the ten rows
in the result set. Hence the second explain analyze query shouldn't be more
expensive than the first one.

My trust in Postgres' planner goes so far as I feel the planner is right and 
there
must be a reason for this :)

Could someone help me understand this behaviour?

Thanks & Bye,
Chris.



-- ***

select version();

-- setup: create a time wasting function and a table with 1M rows

create function expensive() returns double precision as
$$
begin
for i in 1 .. 15000 loop
end loop;
return random();
end;
$$ language 'plpgsql';

create table big as select random() as r from generate_series(1, 100);
analyze big;

\timing on

-- benchmark expensive(): one call to expensive takes about 0.3 ms => OK

do $$ begin for i in 1 .. 1000 loop perform expensive(); end loop; end; $$;

-- find the ten smallest values in big: takes ~ 0.18s => OK

explain analyze select r from big order by r offset 0 limit 10;

-- now do the same, but add an expensive() column to the result:
-- takes ~ 29s => WHY?

explain analyze select r, expensive() from big order by r offset 0 limit 10;

-- clean up :)

\timing off
drop function expensive();
drop table big;

-- ***

   version
--
 PostgreSQL 9.5.4 on x86_64-apple-darwin14.5.0, compiled by Apple LLVM version 
7.0.2 (clang-700.1.81), 64-bit
(1 row)

Time: 0.386 ms
CREATE FUNCTION
Time: 0.456 ms
SELECT 100
Time: 466.814 ms
ANALYZE
Time: 73.770 ms
Timing is on.
DO
Time: 33.922 ms
QUERY PLAN
--
 Limit  (cost=36034.64..36034.67 rows=10 width=8) (actual time=182.361..182.363 
rows=10 loops=1)
   ->  Sort  (cost=36034.64..38534.64 rows=100 width=8) (actual 
time=182.360..182.361 rows=10 loops=1)
 Sort Key: r
 Sort Method: top-N heapsort  Memory: 25kB
 ->  Seq Scan on big  (cost=0.00..14425.00 rows=100 width=8) 
(actual time=0.022..99.777 rows=100 loops=1)
 Planning time: 0.070 ms
 Execution time: 182.377 ms
(7 rows)

Time: 182.689 ms
  QUERY PLAN
--
 Limit  (cost=286034.64..286034.67 rows=10 width=8) (actual 
time=28932.311..28932.314 rows=10 loops=1)
   ->  Sort  (cost=286034.64..288534.64 rows=100 width=8) (actual 
time=28932.309..28932.310 rows=10 loops=1)
 Sort Key: r
 Sort Method: top-N heapsort  Memory: 25kB
 ->  Seq Scan on big  (cost=0.00..264425.00 rows=100 width=8) 
(actual time=0.062..28822.520 rows=100 loops=1)
 Planning time: 0.038 ms
 Execution time: 28932.339 ms
(7 rows)

Time: 28932.908 ms
Timing is off.
DROP FUNCTION
DROP TABLE

-- ***



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] calculating table and index size

2017-04-05 Thread Günce Kaya
Hi all,

I have some questions about calculating table and index size.

I have a dummy table which has an integer column and its index. The table
has 140 rows and all of the rows have the same value, 2000. Table
size is 50MB and index size is 31MB. Why is there so much size difference
between the table and its index? What happens on the data files when we add an index?

Second question is that after creating the table, the table size was 0 bytes. I
inserted a row with the value 120 and then the table size was 8192 bytes. I inserted the
same value five more times and the table size is still 8192 bytes. The table size only
changed after inserting lots of rows. The table size was stable for the first few
hundred rows. Why didn't the table size change when I inserted lots of rows?

Can anyone explain these two questions?

Best regards
-- 
Gunce Kaya

Linkedin  - Twitter
 - Blog



Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
>>what is result of EXPLAIN statement for slow and fast cases? 
>> 
>>regards 
>> 
>>Pavel 

For work_mem=32MB 

explain (analyze,verbose,buffers) select count(user_id) from users where 
user_id not in ( select id from ids); 
QUERY PLAN 
--
 
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual 
time=7945.592..7945.593 rows=1 loops=1) 
Output: count(users.user_id) 
Buffers: shared read=29425 
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=150 width=4) 
(actual time=1928.665..5888.645 rows=249 loops=1) 
Output: users.user_id, users.username 
Filter: (NOT (hashed SubPlan 1)) 
Rows Removed by Filter: 51 
Buffers: shared read=29425 
SubPlan 1 
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=101 width=4) (actual 
time=11.111..868.382 rows=101 loops=1) 
Output: ids.id 
Buffers: shared read=4425 
Planning time: 187.396 ms 
Execution time: 7948.108 ms 
(14 rows) 

Time: 8244.493 ms 

For work_mem='16MB' it does not complete with analyze in one hour. For explain 
only: 

explain (verbose) select count(user_id) from users where user_id not in ( 
select id from ids); 
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=38748092500.00..38748092500.01 rows=1 width=8)
   Output: count(users.user_id)
   ->  Seq Scan on public.users  (cost=0.00..38748088750.00 rows=150 width=4)
         Output: users.user_id, users.username
         Filter: (NOT (SubPlan 1))
         SubPlan 1
           ->  Materialize  (cost=0.00..23332.01 rows=101 width=4)
                 Output: ids.id
                 ->  Seq Scan on public.ids  (cost=0.00..14425.01 rows=101 width=4)
                       Output: ids.id
(10 rows)
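The two plans show where the time goes: with enough work_mem the subquery
result is hashed once (Filter: (NOT (hashed SubPlan 1))), while below the
threshold the planner falls back to a plain SubPlan whose Materialize node is
rescanned for every row of users. A NOT EXISTS rewrite, as suggested in this
thread, avoids the NULL semantics of NOT IN and lets the planner use an
anti-join instead; a minimal sketch, using the table and column names from
this thread:

select count(u.user_id)
from users u
where not exists (select 1 from ids i where i.id = u.user_id);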


Re: [GENERAL] Keycloak and Postgres

2017-04-05 Thread Marc Tempelmeier
Hi,

Thanks for your answer!

Can you elaborate a bit on this part:
" Because of how Postgres caches changes, you may find that a failover requires 
some time in recovery mode."

Thanks!

-----Original Message-----
From: Bill Moran [mailto:wmo...@potentialtech.com]
Sent: Saturday, April 1, 2017 12:57 PM
To: Marc Tempelmeier 
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Keycloak and Postgres

On Thu, 30 Mar 2017 13:58:36 +
Marc Tempelmeier  wrote:

> Hi,
> 
> I have a replication question. We have some big Cisco UCS VM thingy, where 
> VMs are snapshotted, the drives are abstracted, etc. If a VM crashes it will 
> be resumed in 1 min from another rack. What would master-slave replication, 
> or some other kind of replication, bring us in this setup? Should we do it 
> because of other failures?

Because of how Postgres caches changes, you may find that a failover requires 
some time in recovery mode. Those VM snapshot systems are great, but they 
aren't quite perfect if they don't know what is being done with the data on the 
drives.

Whether it's good enough depends heavily on what your expectation is.
Before trusting it to meet your needs, I would spend some time simulating 
failures and seeing what actually happens.
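For context on the recovery remark: a VM snapshot does not know about
Postgres' buffered-but-unflushed changes, so an instance resumed from a
snapshot behaves as if it had crashed and has to replay WAL from the last
checkpoint before it accepts connections again. Two standard checks while
running such failover simulations (a sketch only):

select pg_is_in_recovery();          -- false once the instance is fully open for writes
select pg_postmaster_start_time();   -- when this instance came back up after the failover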

--
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Pavel Stehule
2017-04-05 8:57 GMT+02:00 Daniel Westermann <
daniel.westerm...@dbi-services.com>:

> >> I have set work_mem to a very low value intentionally for demonstration
> >> purposes:
> >>
> >> postgres=# show work_mem;
> >>  work_mem
> >> --
> >>  16MB
> >> (1 row)
> >>
> >> postgres=# show shared_buffers ;
> >>  shared_buffers
> >> 
> >>  128MB
> >> (1 row)
> >>
> >>
> >> When I run the following query ( I know that "not in" is not a good
> choice
> >> here ):
> >>
> >> postgres=# select count(user_id) from users where user_id not in (
> select id
> >> from ids);
>
> >"NOT IN" where the predate is a table column can lead to very poor
> >query plans especially where the haystack is not provably known (at
> >plan time) to contain only not null values.  By reducing work_mem, the
> >server has decided has to repeatedly search the table to search for
> >the presence of null values.  Try converting the query to NOT EXISTS.
>
> Thank you, Merlin. As said, I know that "not in" is not a good choice in
> this case, but I still do not get what is going on here. Why does the server
> repeatedly search for NULL values when I decrease work_mem, and why not when
> increasing work_mem?
>

what is result of EXPLAIN statement for slow and fast cases?

regards

Pavel


>
>
> Regards
> Daniel
>


Re: [GENERAL] Query never completes with low work_mem (at least not within one hour)

2017-04-05 Thread Daniel Westermann
>> I have set work_mem to a very low value intentionally for demonstration 
>> purposes: 
>> 
>> postgres=# show work_mem; 
>> work_mem 
>> -- 
>> 16MB 
>> (1 row) 
>> 
>> postgres=# show shared_buffers ; 
>> shared_buffers 
>>  
>> 128MB 
>> (1 row) 
>> 
>> 
>> When I run the following query ( I know that "not in" is not a good choice 
>> here ): 
>> 
>> postgres=# select count(user_id) from users where user_id not in ( select id 
>> from ids); 

>"NOT IN" where the predate is a table column can lead to very poor 
>query plans especially where the haystack is not provably known (at 
>plan time) to contain only not null values. By reducing work_mem, the 
>server has decided has to repeatedly search the table to search for 
>the presence of null values. Try converting the query to NOT EXISTS. 

Thank you, Merlin. As said, I know that "not in" is not a good choice in this 
case, but I still do not get what is going on here. Why does the server 
repeatedly search for NULL values when I decrease work_mem, and why not when 
increasing work_mem? 
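The switch is visible in the plan itself: the hashed SubPlan is only chosen
when the planner estimates that the hashed result of "select id from ids" will
fit into work_mem; below that threshold it falls back to a plain SubPlan that
rescans the materialized ids for every outer row, which is what makes the
low-work_mem run so slow. A sketch of how to see the flip, using the same
query as above:

set work_mem = '32MB';
explain select count(user_id) from users where user_id not in (select id from ids);
-- expect: Filter: (NOT (hashed SubPlan 1))

set work_mem = '16MB';
explain select count(user_id) from users where user_id not in (select id from ids);
-- expect: Filter: (NOT (SubPlan 1)) over a Materialize node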


Regards 
Daniel 


Re: [GENERAL] keeping WAL after dropping replication slots

2017-04-05 Thread Tom DalPozzo
Hi,

2017-04-05 1:55 GMT+02:00 Adrian Klaver :

> On 04/04/2017 07:45 AM, Tom DalPozzo wrote:
>
> Postgres version?
>
9.6.1

>
> Hi,
>> I had two replication slots on my primary. Slaves off and (around 800)
>> WALs kept as expected.
>>
>
> Slaves off means?:
>

> Your replication is set up from the master to the slaves (how many?).
> Then you disconnected the slaves how?
>
> I have 2 slaves configured with async replication but they were down  when
I dropped the slots.

So the 800 WALs number means you have wal_keep_segments set to 800?
>
No, wal_keep_segments is commented out.
800 is the rough number of files I saw in the xlog dir before dropping the
slots.
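For what it's worth, which slots are still pinning WAL can be checked
directly, and old segments are normally only removed or recycled when a
checkpoint runs; a clean shutdown performs one, which is one possible
explanation for the files lingering until the restart. A sketch (the catalog
columns are as documented for 9.6):

select slot_name, active, restart_lsn from pg_replication_slots;
-- after dropping unused slots, a manual checkpoint can let old segments be
-- recycled or removed sooner than the next automatic one:
checkpoint;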


>
>
> I dropped those slots but over time, the system kept on adding new WALs
>> without reusing them or deleting them.
>> Only after shutdown and restart the system deleted those WAL files.
>> Is that ok?
>> regards
>> Pupillo
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>

Regards
Pupillo