Re: [GENERAL] Request to share information regarding deadlock in postgresql-8.1.18

2016-11-13 Thread Yogesh Sharma
Dear John,

Thanks for sharing information.
>this is a logic error in your application.
Ok, I will check.

>you can look those relation numbers up in the pg_catalog to see what they are. 
>  you can see what the processes are in pg_stat_activity.
Currently the machine is not available. Please suggest any other approach to 
identify the same.


>This has nothing to do with the growing WAL logs... something is blocking 
>checkpoints if a single WAL file keeps growing. Are you using some form of 
>WAL archiving, and is it working correctly? Could something 
>be preventing checkpoints? What are the related checkpoint and WAL settings?

Many WAL files are generated in the pg_xlog directory. I don't know how to check 
the checkpoint and WAL settings.
The default postgresql.conf settings are used.
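
One way to inspect them from psql (a minimal sketch; on 8.1 the relevant names
are checkpoint_segments, checkpoint_timeout and the wal_*/archive_command settings):

-- list everything checkpoint/WAL/archiving related
SELECT name, setting FROM pg_settings
WHERE name LIKE 'checkpoint%' OR name LIKE 'wal%' OR name LIKE 'archive%';

-- or individually
SHOW checkpoint_segments;
SHOW checkpoint_timeout;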

Regards,
Yogesh



-Original Message-
From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of John R Pierce
Sent: Monday, November 14, 2016 4:19 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Request to share information regarding deadlock in 
postgresql-8.1.18

On 11/13/2016 11:01 PM, Yogesh Sharma wrote:
> DETAIL:  Process 12345 waits for AccessShareLock on relation 16446 of 
> database 16385; blocked by process 23136.
> Process 23136 waits for ShareLock on relation 16482 of database 16385; 
> blocked by process 12345.

you have two processes that are each waiting for a lock the other one holds.

process 12345 is waiting for a lock on the relation (table) with OID 16446, and is 
blocked by process 23136,

and process 23136 is waiting for a lock on another table, OID 16482, which the 
first process has a lock on.

this is a logic error in your application.

you can look those relation numbers up in the pg_catalog to see what 
they are.   you can see what the processes are in pg_stat_activity.
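
when the machine is reachable again, a minimal sketch of those lookups (the OIDs 
and PIDs are the ones from the log; on 8.1 the pg_stat_activity columns are 
procpid/current_query, on newer releases pid/query):

-- map the relation OIDs from the deadlock message to table names
SELECT oid, relname FROM pg_class WHERE oid IN (16446, 16482);

-- see what the two backends were doing
SELECT * FROM pg_stat_activity WHERE procpid IN (12345, 23136);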

This has nothing to do with the growing WAL logs... something is 
blocking checkpoints if a single WAL file keeps growing. Are you using some 
form of WAL archiving, and is it working correctly? Could something 
be preventing checkpoints? What are the related checkpoint and WAL 
settings?



-- 
john r pierce, recycling bits in santa cruz



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Request to share information regarding deadlock in postgresql-8.1.18

2016-11-13 Thread John R Pierce

On 11/13/2016 11:01 PM, Yogesh Sharma wrote:

DETAIL:  Process 12345 waits for AccessShareLock on relation 16446 of database 
16385; blocked by process 23136.
Process 23136 waits for ShareLock on relation 16482 of database 16385; blocked 
by process 12345.


you have two processes that are each waiting for a lock the other one holds.

process 12345 is waiting for a lock on the relation (table) with OID 16446, and is 
blocked by process 23136,

and process 23136 is waiting for a lock on another table, OID 16482, 
which the first process has a lock on.


this is a logic error in your application.

you can look those relation numbers up in the pg_catalog to see what 
they are.   you can see what the processes are in pg_stat_activity.


This has nothing to do with the growing WAL logs... something is 
blocking checkpoints if a single WAL file keeps growing. Are you using 
some form of WAL archiving, and is it working correctly? Could something 
be preventing checkpoints? What are the related checkpoint and WAL 
settings?




--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Request to share information regarding deadlock in postgresql-8.1.18

2016-11-13 Thread Yogesh Sharma
Dear All,

Thanks in advance.

In my Linux system, the error below is found, and after 10~12 hours the WAL files 
in the pg_xlog directory increase continuously.
PostgreSQL:ERROR:  deadlock detected
DETAIL:  Process 12345 waits for AccessShareLock on relation 16446 of database 
16385; blocked by process 23136.
Process 23136 waits for ShareLock on relation 16482 of database 16385; blocked 
by process 12345.
CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."table name" x WHERE "id" 
= $1 AND "table_id" = $2 FOR SHARE OF x"

Is this deadlock error linked with the WAL files in pg_xlog?
Please share your opinion.

Regards,
Yogesh

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Aaron Lewis
Hey guys,

I'm trying to understand the performance impact of "Index Recheck". I
googled for Index Recheck but didn't find many details about it.
Where can I learn more about it?

And how did you know the performance is being significantly hurt by
inadequate work_mem?

I'm running PG 9.6.1, built from source.
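
(The pointer in the earlier plans is the "Heap Blocks: exact=... lossy=..." line: 
lossy pages mean the bitmap overflowed work_mem, so every row on those pages has 
to be rechecked. A quick sketch to experiment with, the value below is only an 
example:)

-- raise work_mem for this session only and re-check the plan; keep increasing
-- it until the "lossy" count in "Heap Blocks" disappears
SET work_mem = '512MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM mytable
WHERE to_tsvector('english', title) @@ 'x264'::tsquery
LIMIT 1000 OFFSET 10;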


On Mon, Nov 14, 2016 at 2:51 AM, Tom Lane  wrote:
> Oleg Bartunov  writes:
>> On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewis 
>>> It takes 500ms with 10m rows, could it be faster?
>
>> sure.  Recheck with a function call is pretty expensive, so I'd not recommend
>> creating a functional index; just create a separate column of type tsvector
>> (materialize to_tsvector) and create a gin index on it.  You will be surprised.
>
> I doubt it'll help that much --- more than half the time is going into the
> bitmap indexscan, and with over 1m candidate matches, there's no way
> that's going to be super cheap.
>
> I wonder whether a gist index would be better here, since it would support
> a plain indexscan which should require scanning much less of the index
> given the small LIMIT.
>
> (Materializing the tsvector would probably help for gist, too, by reducing
> the cost of lossy-index rechecks.)
>
> BTW, it still looks like the performance is being significantly hurt by
> inadequate work_mem.
>
> regards, tom lane



-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Wal files being delayed - Pgsql 9.2

2016-11-13 Thread Venkata B Nagothi
On Mon, Nov 14, 2016 at 1:22 PM, Patrick B  wrote:

> Hi guys,
>
> My current scenario is:
>
> master01 - Postgres 9.2 master DB
> slave01 - Postgres 9.2 streaming replication + wal_files slave server for
> read-only queries
> slave02 - Postgres 9.2 streaming replication + wal_files slave server @ AWS
>
> master01 sends wal_files to both slaves via ssh.
>
>
> *On the master:*
>
> select * from pg_current_xlog_location();
>
>  pg_current_xlog_location
>
> --
>
>  1A7C/14AEB2C0
>
> (1 row)
>
>
> *On the slaves:*
>
> ls -ltr /var/lib/pgsql/archive/
>
> -rw--- 1 postgres postgres 16777216 Nov 14 01:21
> 00021A7A00F9
>
>
> As you can see, the last wal_files on the slaves aren't the
> 00021A7C* ones. I think the SSH delivery is being delayed. Not sure
> why, though.
> How can I see how many files are behind?
>


You can identify the file name by using the function pg_xlogfile_name().

"select pg_xlogfile_name('1A7C/14AEB2C0');"


Regards,

Venkata B N
Database Consultant

Fujitsu Australia


[GENERAL] Wal files being delayed - Pgsql 9.2

2016-11-13 Thread Patrick B
Hi guys,

My current scenario is:

master01 - Postgres 9.2 master DB
slave01 - Postgres 9.2 streaming replication + wal_files slave server for
read-only queries
slave02 - Postgres 9.2 streaming replication + wal_files slave server @ AWS

master01 sends wal_files to both slaves via ssh.


*On the master:*

select * from pg_current_xlog_location();

 pg_current_xlog_location

--

 1A7C/14AEB2C0

(1 row)


*On the slaves:*

ls -ltr /var/lib/pgsql/archive/

-rw--- 1 postgres postgres 16777216 Nov 14 01:21
00021A7A00F9


As you can see, the last wal_files on the slaves aren't the
00021A7C* ones. I think the SSH delivery is being delayed. Not sure
why, though.
How can I see how many files are behind?

Cheers
Patrick.


Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread James Sewell
OK, looks like this question is answered by the wiki:

https://wiki.postgresql.org/wiki/Parallel_Query

Even when parallel query is enabled in general, the query planner will
> never generate a parallel plan if any of the following are true:
>
>- The query writes data. If a query contains a data-modifying
>operation either at the top level or within a CTE, no parallel plans for
>that query will be generated. This is a limitation of the current
>implementation which could be lifted in a future release.
>
>
Sad, but looks to hold from this testing!
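
One workaround sketch until that limitation is lifted: run the parallel-eligible
SELECT on its own and insert its single result row in a second statement (this
uses psql's \gset and the table names from this thread; adjust the INSERT column
list to match jirotech.test):

-- the bare aggregate can still get a parallel plan
SELECT count(*) AS cnt
FROM meter_read mr
WHERE timestamp_rounded >= timestamp '2016-03-30 00:00:00' - interval '2' day
  AND timestamp_rounded <  timestamp '2016-03-30 00:00:00'
\gset

-- then a trivial, non-parallel insert of the result
INSERT INTO jirotech.test VALUES (:cnt);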

James Sewell,
PostgreSQL Team Lead / Solutions Architect



Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
*P *(+61) 2 8099 9000  *W* www.jirotech.com  *F *
(+61) 2 8099 9099

On Mon, Nov 14, 2016 at 11:13 AM, James Sewell 
wrote:

> Hi,
>
> Yes, same result (non-parallel seq scan)
>
> Cheers,
>
> James Sewell,
> PostgreSQL Team Lead / Solutions Architect
>
>
>
> Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
> *P *(+61) 2 8099 9000  *W* www.jirotech.com
> *F *(+61) 2 8099 9099
>
> On Mon, Nov 14, 2016 at 10:43 AM, melvin6925  wrote:
>
>> Have you tried:
>> CREATE TABLE blah AS SELECT ...  ?
>>
>>
>>
>> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>>
>>  Original message 
>> From: James Sewell 
>> Date: 11/13/16 18:04 (GMT-05:00)
>> To: pgsql-general 
>> Subject: [GENERAL] Sequential parallel scan / agg
>>
>> Hi all,
>>
>> I have a customer who is making use of the new parallel features on a 64
>> core box - this is working really well for them. However one issue (which
>> I've since realised I've never seen an example of) is getting the data from
>> these queries into a table.
>>
>> I can't seem to create a plan which does this and still uses multiple
>> workers, probably best to just show by example.
>>
>> Below you can see:
>>
>>1. *INSERT ... SELECT *doesn't work
>>2. *WITH a AS (SELECT ...) INSERT FROM* a doesn't work
>>3. *SELECT .. INTO* doesn't work
>>4. *SELECT* with no insert does work as expected
>>
>>
>>
>> test=# explain
>> insert into jirotech.test
>> (select count(*)  FROM  meter_read mr
>>
>> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
>> day)
>>
>> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
>> ;
>>
>>QUERY PLAN
>> 
>> 
>> 
>>  Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
>>->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44
>> rows=1 width=438)
>>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00
>> rows=1 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>>  (cost=0.00..6201295.24 rows=186240748 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>>  (cost=0.00..6204006.88 rows=186384966 width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 00:00:00'::timestamp without time zone))
>> (10 rows)
>>
>> test=# explain
>> with a as
>> (select count(*)  FROM  meter_read mr
>>
>> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
>> day)
>>
>> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
>> INSERT INTO jirotech.test SELECT * FROM A
>> ;
>>
>>
>>   QUERY PLAN
>> 
>> 
>> 
>>  Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
>>CTE a
>>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
>> width=0)
>>Filter: ((timestamp_rounded >= '2016-03-28
>> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
>> '2016-03-30 

Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread James Sewell
Hi,

Yes, same result (non-parallel seq scan)

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect



Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009
*P *(+61) 2 8099 9000  *W* www.jirotech.com  *F *
(+61) 2 8099 9099

On Mon, Nov 14, 2016 at 10:43 AM, melvin6925  wrote:

> Have you tried:
> CREATE TABLE blah AS SELECT ...  ?
>
>
>
> Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
>
>  Original message 
> From: James Sewell 
> Date: 11/13/16 18:04 (GMT-05:00)
> To: pgsql-general 
> Subject: [GENERAL] Sequential parallel scan / agg
>
> Hi all,
>
> I have a customer who is making use of the new parallel features on a 64
> core box - this is working really well for them. However one issue (which
> I've since realised I've never seen an example of) is getting the data from
> these queries into a table.
>
> I can't seem to create a plan which does this and still uses multiple
> workers, probably best to just show by example.
>
> Below you can see:
>
>1. *INSERT ... SELECT *doesn't work
>2. *WITH a AS (SELECT ...) INSERT FROM* a doesn't work
>3. *SELECT .. INTO* doesn't work
>4. *SELECT* with no insert does work as expected
>
>
>
> test=# explain
> insert into jirotech.test
> (select count(*)  FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
> ;
>
>QUERY PLAN
> 
> 
> 
>  Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
>->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44 rows=1
> width=438)
>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00
> rows=1 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>  (cost=0.00..6201295.24 rows=186240748 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>  (cost=0.00..6204006.88 rows=186384966 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
> (10 rows)
>
> test=# explain
> with a as
> (select count(*)  FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
> INSERT INTO jirotech.test SELECT * FROM A
> ;
>
>
>   QUERY PLAN
> 
> 
> 
>  Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
>CTE a
>  ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
>->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
>  ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
> width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_29 mr_1
>  (cost=0.00..6201295.24 rows=186240748 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>  ->  Seq Scan on meter_read_2016_03_28 mr_2
>  (cost=0.00..6204006.88 rows=186384966 width=0)
>Filter: ((timestamp_rounded >= '2016-03-28
> 00:00:00'::timestamp without time zone) AND (timestamp_rounded <
> '2016-03-30 00:00:00'::timestamp without time zone))
>->  CTE Scan on a  (cost=0.00..0.03 rows=1 width=438)
> (11 rows)
>
> test=# explain
> select count(*) into jirotech.jamestest FROM  meter_read mr
>
> where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
> day)
>
> and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'
> ;
>
>  QUERY PLAN
> 

Re: [GENERAL] Sequential parallel scan / agg

2016-11-13 Thread melvin6925
Have you tried:
CREATE TABLE blah AS SELECT ...  ?


Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone

 Original message 
From: James Sewell 
Date: 11/13/16 18:04 (GMT-05:00)
To: pgsql-general 
Subject: [GENERAL] Sequential parallel scan / agg

Hi all,

I have a customer who is making use of the new parallel features on a 64 core
box - this is working really well for them. However one issue (which I've since
realised I've never seen an example of) is getting the data from these queries
into a table.

I can't seem to create a plan which does this and still uses multiple workers,
probably best to just show by example.

Below you can see:

   1. INSERT ... SELECT doesn't work
   2. WITH a AS (SELECT ...) INSERT FROM a doesn't work
   3. SELECT .. INTO doesn't work
   4. SELECT with no insert does work as expected


test=# explain
insert into jirotech.test
(select count(*)  FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
;
                                                    QUERY PLAN
 Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
   ->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44 rows=1 width=438)
         ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
               ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
                     ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)
                           Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
                     ->  Seq Scan on meter_read_2016_03_29 mr_1  (cost=0.00..6201295.24 rows=186240748 width=0)
                           Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
                     ->  Seq Scan on meter_read_2016_03_28 mr_2  (cost=0.00..6204006.88 rows=186384966 width=0)
                           Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
(10 rows)

test=# explain
with a as
(select count(*)  FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
INSERT INTO jirotech.test SELECT * FROM A
;
                                                    QUERY PLAN
 Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
   CTE a
     ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
           ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
                 ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)
                       Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
                 ->  Seq Scan on meter_read_2016_03_29 mr_1  (cost=0.00..6201295.24 rows=186240748 width=0)
                       Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
                 ->  Seq Scan on meter_read_2016_03_28 mr_2  (cost=0.00..6204006.88 rows=186384966 width=0)
                       Filter: ((timestamp_rounded >= '2016-03-28 00:00:00'::timestamp without time zone) AND (timestamp_rounded < '2016-03-30 00:00:00'::timestamp without time zone))
   ->  CTE Scan on a  (cost=0.00..0.03 rows=1 width=438)
(11 rows)

test=# explain
select count(*) into jirotech.jamestest FROM  meter_read mr
where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2' day)
and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'
;
                                                    QUERY PLAN
 Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
   ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
         ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)

[GENERAL] Sequential parallel scan / agg

2016-11-13 Thread James Sewell
Hi all,

I have a customer who is making use of the new parallel features on a 64
core box - this is working really well for them. However one issue (which
I've since realised I've never seen an example of) is getting the data from
these queries into a table.

I can't seem to create a plan which does this and still uses multiple
workers, probably best to just show by example.

Below you can see:

   1. *INSERT ... SELECT *doesn't work
   2. *WITH a AS (SELECT ...) INSERT FROM* a doesn't work
   3. *SELECT .. INTO* doesn't work
   4. *SELECT* with no insert does work as expected



test=# explain
insert into jirotech.test
(select count(*)  FROM  meter_read mr

where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
day)

and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00' )
;

 QUERY PLAN

 Insert on test  (cost=13336866.41..13336866.44 rows=1 width=438)
   ->  Subquery Scan on "*SELECT*"  (cost=13336866.41..13336866.44 rows=1
width=438)
 ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
   ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
 ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_29 mr_1
 (cost=0.00..6201295.24 rows=186240748 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_28 mr_2
 (cost=0.00..6204006.88 rows=186384966 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
(10 rows)

test=# explain
with a as
(select count(*)  FROM  meter_read mr

where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
day)

and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'  )
INSERT INTO jirotech.test SELECT * FROM A
;


QUERY PLAN

 Insert on test  (cost=13336866.42..13336866.45 rows=1 width=438)
   CTE a
 ->  Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
   ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
 ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1
width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_29 mr_1
 (cost=0.00..6201295.24 rows=186240748 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_28 mr_2
 (cost=0.00..6204006.88 rows=186384966 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
   ->  CTE Scan on a  (cost=0.00..0.03 rows=1 width=438)
(11 rows)

test=# explain
select count(*) into jirotech.jamestest FROM  meter_read mr

where TIMESTAMP_ROUNDED >= (timestamp'2016-03-30 00:00:00' - interval '2'
day)

and TIMESTAMP_ROUNDED < timestamp'2016-03-30 00:00:00'
;

   QUERY PLAN

 Aggregate  (cost=13336866.41..13336866.42 rows=1 width=8)
   ->  Append  (cost=0.00..12405302.12 rows=372625715 width=0)
 ->  Seq Scan on meter_read mr  (cost=0.00..0.00 rows=1 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_29 mr_1  (cost=0.00..6201295.24
rows=186240748 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28
00:00:00'::timestamp without time zone) AND (timestamp_rounded <
'2016-03-30 00:00:00'::timestamp without time zone))
 ->  Seq Scan on meter_read_2016_03_28 mr_2  (cost=0.00..6204006.88
rows=186384966 width=0)
   Filter: ((timestamp_rounded >= '2016-03-28

Re: [GENERAL] Gin indexes on intarray is fast when value in array does not exists, and slow, when value exists

2016-11-13 Thread Jeff Janes
On Sat, Nov 12, 2016 at 5:33 PM, otar shavadze  wrote:

> Tried
>
>   OPERATOR(pg_catalog.@>)
>
>
>  as Tom mentioned, but still don't get fast performance when the value does
> not exist in any array.
>

Did you build the correct index?
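
For reference, the operator and the index opclass have to match; a sketch with
hypothetical table and column names (mytable, tags integer[]):

-- the core operator (what OPERATOR(pg_catalog.@>) forces) is served by the
-- default gin opclass for arrays:
CREATE INDEX tags_gin_idx ON mytable USING gin (tags);
SELECT * FROM mytable WHERE tags OPERATOR(pg_catalog.@>) ARRAY[42];

-- intarray's own @> needs the gin__int_ops opclass instead:
CREATE INDEX tags_intarray_idx ON mytable USING gin (tags gin__int_ops);
SELECT * FROM mytable WHERE tags @> ARRAY[42];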


>
> Also "played" with many   different ways, gin, gist indexes (gin with and
> without *gin__int_ops*)  but, always, there was some situation, where
> search in array was slow.
>

Yes.  There will always be some situation when the array search is slow.
Is that situation one that a specific person cares about?  Hard to tell,
since you have not given us any additional useful information.

> I don't know exactly, maybe I am wrong, but what I understood after
> several days of "trying" is that I will never use arrays with tables of more
> than 500,000-1,000,000 rows, because searching in such an array is
> somehow problematic.
>
> I rebuild my structure and added another table (instead of using array)
> and then used join's instead of searching in array.
>
> That works perfectly; joining works fast as hell, even for several
> million rows in each table.
>

"Properly" normalizing your data is a wonderful thing, no doubt about it,
if you are prepared to deal with the consequences of doing so.  But not
everyone has that luxury.  Which is why there is more than one way of doing
things.

Cheers,

Jeff


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread aws backup
thank you.
it listens to the local network, which is just my two other MacBooks. I reported 
the bug to DaVinci. Hopefully they have a fix; then I can put it back on md5.

thank you for your script.
will try it.

best
Robert


> On 13 Nov 2016, at 23:27, John R Pierce  wrote:
> 
> On 11/13/2016 1:01 PM, aws backup wrote:
>> thank you so much.
>> With your help I could solve all my problems.
>> DaVinci has a bug somewhere with the database configuration.
>> I installed everything new and set all auth method to trust instead of md5 
>> which was default from DaVinci.
> 
> trust is not very secure as it means any process on your system has full 
> access to any database user including the postgres database administrator.  
> hopefully this database server only has listen_addresses='localhost'   and 
> isn't listening on external network interfaces, otherwise trust means any 
> machine on your network can access it.
> 
> 
> 
>> Now everything is working as expected. No errors anymore with the backup and 
>> restore buttons in DaVinci which are just linked to the pg_dump and 
>> pg_restore scripts.
>> The pg_dumpall approach was from the example scripts which are posted in the 
>> DaVinci Forum.
>> But this approach doesn't make sense … ?
> 
> on a Linux system, my pg backup script looks like...
> 
> 
> #!/bin/bash
> ## postgres backup script
> DD=$(date +%a)
> PP=/backups/pgsql/
> 
> pg_dumpall -g -f $PP/pg_dumpall.$p.globals-$DD.sql
> for db in $(psql -tc "select datname from pg_database where not 
> datistemplate"); do
>   pg_dump -Fc -f $PP/pgdump.$p.$db.$DD.dump -d $db
> done
> 
> 
> this generates a globals backup file, and a compressed backup from each 
> non-system database.   these compressed backups are restorable with the 
> pg_restore command which has a lot of useful options (restore schema only, or 
> restore data only, etc etc).   It also puts the day-of-the-week in the 
> filename of each of these backups (thats what the DD value is for), so if you 
> run this daily, you end up with 7 sets of files, one for each day of the 
> week.   if you change %a to %F, then the filenames will instead contain like 
> 2016-11-13 ...
> 
> 
> -- 
> john r pierce, recycling bits in santa cruz
> 
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trigram is slow when 10m rows

2016-11-13 Thread Jeff Janes
On Sun, Nov 13, 2016 at 3:54 AM, Aaron Lewis 
wrote:

> I have a simple table with Trigram index,
>
> create table mytable(hash char(40), title text);
> create index title_trgm_idx on mytable using gin(title gin_trgm_ops);
>
> When I run a query with 10m rows, it uses the Trigram index, but takes
> 3s to execute, very slow.
> (I have 80m rows, but only inserted 10m for testing purpose)
>
> test=# select count(*) from mytable;
>   count
> --
>  13971887
> (1 row)
>
> test=# explain analyze select * from mytable where title ilike 'x264';
>   QUERY PLAN
> 
> --
>  Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
> width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
>Recheck Cond: (title ~~* 'x264'::text)
>Rows Removed by Index Recheck: 11402855
>Heap Blocks: exact=39557 lossy=158010
>->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
>  Index Cond: (title ~~* 'x264'::text)
>  Planning time: 0.611 ms
>  Execution time: 2937.729 ms
> (8 rows)
>
> Any ideas to speed things up?
>


What version of postgresql and pg_trgm are you using?  It might work better
under 9.6/1.3

Since your query doesn't use wildcards, it is probably better suited to
a regular btree index, perhaps with citext.
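
A sketch of that alternative against the mytable definition from this thread
(two variants, pick one):

-- variant 1: expression btree plus equality on lower()
CREATE INDEX title_lower_idx ON mytable (lower(title));
SELECT * FROM mytable WHERE lower(title) = lower('x264');

-- variant 2: a citext column, so plain equality is case-insensitive
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE mytable ALTER COLUMN title TYPE citext;
CREATE INDEX title_citext_idx ON mytable (title);
SELECT * FROM mytable WHERE title = 'x264';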

Cheers,

Jeff


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread aws backup
Thank you, I understand. Nobody else can access the database.
Since the database backup button in DaVinci started to work, I could see 
that DaVinci is only making a pg_dump of the database. The restore works fine 
from one DaVinci installation to another; I tested this. I can still make an 
automatic daily shell script with pg_dumpall. We'll see.

Thank you.


> On 13 Nov 2016, at 23:12, Adrian Klaver  wrote:
> 
> On 11/13/2016 01:01 PM, aws backup wrote:
>> Hi,
>> 
>> thank you so much.
>> With your help I could solve all my problems.
>> DaVinci has a bug somewhere with the database configuration.
>> I installed everything new and set all auth method to trust instead of md5 
>> which was default from DaVinci.
> 
> Just remember that trust is just that, trust. If someone knows a valid 
> username they have access, subject to other conditions set in pg_hba.conf.
> 
>> Now everything is working as expected. No errors anymore with the backup and 
>> restore buttons in DaVinci which are just linked to the pg_dump and 
>> pg_restore scripts.
>> The pg_dumpall approach was from the example scripts which are posted in the 
>> DaVinci Forum.
>> But this approach doesn't make sense … ?
> 
> It does if you want to backup the state of the entire cluster. Could be that 
> DaVinci is including things in the template1 and postgres databases that are 
> needed. The other thing that pg_dumpall gets you is data global to the 
> cluster:
> 
> https://www.postgresql.org/docs/9.5/static/app-pg-dumpall.html 
> 
> 
> " ...  pg_dumpall also dumps global objects that are common to all databases. 
> (pg_dump does not save these objects.) ..."
> 
> Though you can have your cake and eat it to by using pg_dump for individual 
> databases and then
> 
> pg_dumpall -g
> 
> where -g is:
> 
> "-g
> --globals-only
> 
>Dump only global objects (roles and tablespaces), no databases.
> "
> 
>> 
>> Best Regards,
>> Robert
>> 
>> 
>> 
>>> On 13 Nov 2016, at 19:13, Adrian Klaver  wrote:
>>> 
>>> On 11/13/2016 09:04 AM, aws backup wrote:
 Hi Adrian,
 
 thank you for the explanation.
 I will look into you links.
 
 I am doing this because I want to make backups from the
 database. Ideally automatic backups every night.
 The Blackmagic support says they can't help me with this. The Backup and
 Restore button in DaVinci does not work.
 Every database related question I ask the Blackmagic support stays
 unanswered.
>>> 
>>> Guessing they are treating the Postgres database as an embedded one that 
>>> should not be touched by the end user.
>>> 
 For example: How can I restart the SQL server? Oh sorry we can't help
 you with this … ?
>>> 
>>> https://www.postgresql.org/docs/9.5/static/app-pg-ctl.html
>>> 
 
 Thank you.
 Robert
 
 
 
 
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
>> 
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com 


Re: [GENERAL] Exclude pg_largeobject form pg_dump

2016-11-13 Thread Guillaume Lelarge
2016-11-08 6:01 GMT+01:00 amul sul :

> On Tue, Nov 8, 2016 at 5:36 AM, Andreas Joseph Krogh 
> wrote:
> >
> >
> > I don't see what you mean. It forces dump of Blobs if we didn't use -B
> and
> > if we include everything in the dump, which seems good to me. What did
> you
> > try that didn't work as expected?
> >
> >
> > I guess what he means is that if -B is given, the following code sets
> > dopt.outputBlobs = false
> >
> > +case 'B':/* Don't dump blobs */
> > +dopt.outputBlobs = false;
> > +break;
> >
> >
> > Then this IF sets it back to TRUE:
> >
> > +if (dopt.include_everything && !dopt.schemaOnly &&
> !dopt.outputBlobs)
> >  dopt.outputBlobs = true;
> >
> >
> > ...making it impossible to turn off dumping of blobs.
> >
>
> Yes, that's the reason the v4 patch was not as expected.
>
>
It took me some time, but I finally understand.

Behaviour is fixed in v6.
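
With it applied, skipping large objects would look like this (file and database
names are only examples):

pg_dump -B -Fc -f mydb_noblobs.dump mydb
pg_dump --no-blobs -Fc -f mydb_noblobs.dump mydb   # long-option spelling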


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 371a614..a891b6e 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -145,6 +145,21 @@ PostgreSQL documentation
  
 
  
+  -B
+  --no-blobs
+  
+   
+Exclude large objects in the dump.
+   
+
+   
+When both -b and -B are given, the behavior
+is to output large objects.
+   
+  
+ 
+
+ 
   -c
   --clean
   
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 0a28124..00ecde3 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -158,6 +158,7 @@ typedef struct _dumpOptions
 	int			outputClean;
 	int			outputCreateDB;
 	bool		outputBlobs;
+	bool		dontOutputBlobs;
 	int			outputNoOwner;
 	char	   *outputSuperuser;
 } DumpOptions;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 4da297f..a91cbd9 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -291,6 +291,7 @@ main(int argc, char **argv)
 	static struct option long_options[] = {
 		{"data-only", no_argument, NULL, 'a'},
 		{"blobs", no_argument, NULL, 'b'},
+		{"no-blobs", no_argument, NULL, 'B'},
 		{"clean", no_argument, NULL, 'c'},
 		{"create", no_argument, NULL, 'C'},
 		{"dbname", required_argument, NULL, 'd'},
@@ -379,7 +380,7 @@ main(int argc, char **argv)
 
 	InitDumpOptions(&dopt);
 
-	while ((c = getopt_long(argc, argv, "abcCd:E:f:F:h:j:n:N:oOp:RsS:t:T:U:vwWxZ:",
+	while ((c = getopt_long(argc, argv, "abBcCd:E:f:F:h:j:n:N:oOp:RsS:t:T:U:vwWxZ:",
 			long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -392,6 +393,10 @@ main(int argc, char **argv)
 dopt.outputBlobs = true;
 break;
 
+			case 'B':			/* Don't dump blobs */
+dopt.dontOutputBlobs = true;
+break;
+
 			case 'c':			/* clean (i.e., drop) schema prior to create */
 dopt.outputClean = 1;
 break;
@@ -708,7 +713,7 @@ main(int argc, char **argv)
 	 * Dumping blobs is now default unless we saw an inclusion switch or -s
 	 * ... but even if we did see one of these, -b turns it back on.
 	 */
-	if (dopt.include_everything && !dopt.schemaOnly)
+	if (dopt.include_everything && !dopt.schemaOnly && !dopt.dontOutputBlobs)
 		dopt.outputBlobs = true;
 
 	/*
@@ -864,6 +869,7 @@ help(const char *progname)
 	printf(_("\nOptions controlling the output content:\n"));
 	printf(_("  -a, --data-only  dump only the data, not the schema\n"));
 	printf(_("  -b, --blobs  include large objects in dump\n"));
+	printf(_("  -B, --no-blobs   exclude large objects in dump\n"));
 	printf(_("  -c, --clean  clean (drop) database objects before recreating\n"));
 	printf(_("  -C, --create include commands to create database in dump\n"));
 	printf(_("  -E, --encoding=ENCODING  dump the data in encoding ENCODING\n"));

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread Adrian Klaver

On 11/13/2016 01:01 PM, aws backup wrote:

Hi,

thank you so much.
With your help I could solve all my problems.
DaVinci has a bug somewhere with the database configuration.
I installed everything new and set all auth method to trust instead of md5 
which was default from DaVinci.


Just remember that trust is just that, trust. If someone knows a valid 
username they have access, subject to other conditions set in pg_hba.conf.
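
For example, pg_hba.conf lines like these trust every local/loopback connection 
as any role (a sketch; tighten the USER and ADDRESS columns to narrow that down):

# TYPE  DATABASE  USER  ADDRESS        METHOD
local   all       all                  trust
host    all       all   127.0.0.1/32   trust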



Now everything is working as expected. No errors anymore with the backup and 
restore buttons in DaVinci which are just linked to the pg_dump and pg_restore 
scripts.
The pg_dumpall approach was from the example scripts which are posted in the 
DaVinci Forum.
But this approach doesn't make sense … ?


It does if you want to back up the state of the entire cluster. Could be 
that DaVinci is including things in the template1 and postgres databases 
that are needed. The other thing that pg_dumpall gets you is data global 
to the cluster:


https://www.postgresql.org/docs/9.5/static/app-pg-dumpall.html

" ...  pg_dumpall also dumps global objects that are common to all 
databases. (pg_dump does not save these objects.) ..."


Though you can have your cake and eat it too by using pg_dump for 
individual databases and then


pg_dumpall -g

where -g is:

"-g
--globals-only

Dump only global objects (roles and tablespaces), no databases.
"



Best Regards,
Robert




On 13 Nov 2016, at 19:13, Adrian Klaver  wrote:

On 11/13/2016 09:04 AM, aws backup wrote:

Hi Adrian,

thank you for the explanation.
I will look into you links.

I am doing this because I want to make backups from the
database. Ideally automatic backups every night.
The Blackmagic support says they can't help me with this. The Backup and
Restore button in DaVinci does not work.
Every database related question I ask the Blackmagic support stays
unanswered.


Guessing they are treating the Postgres database as an embedded one that should 
not be touched by the end user.


For example: How can I restart the SQL server? Oh sorry we can't help
you with this … ?


https://www.postgresql.org/docs/9.5/static/app-pg-ctl.html
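
A minimal sketch, run as the user that owns the data directory (replace the -D 
path with the actual data directory of the DaVinci install):

pg_ctl -D /path/to/data/directory status
pg_ctl -D /path/to/data/directory restart -m fast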



Thank you.
Robert





--
Adrian Klaver
adrian.kla...@aklaver.com






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trigram is slow when 10m rows

2016-11-13 Thread Oleg Bartunov
On Sun, Nov 13, 2016 at 5:25 PM, Aaron Lewis 
wrote:

> Thanks Oleg.
>
> I've increased work_mem to 128MB, now the query falls down to 1.7s,
> faster but still not good enough.
>
> Is there any other thing I can do about it?
>

your query 'x264' is short in terms of the number of trigrams, so a trigram
index isn't good. Did you try text_pattern_ops for btree? Something like
create index title_btree_idx on mytable using btree(title text_pattern_ops);

>
> test=# explain analyze select * from mytable where title ilike 'x264';
>   QUERY PLAN
> 
> --
>  Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
> width=83) (actual time=1754.656..1754.656 rows=0 loops=1)
>Recheck Cond: (title ~~* 'x264'::text)
>Rows Removed by Index Recheck: 1220793
>Heap Blocks: exact=197567
>->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
> rows=1380 width=0) (actual time=346.663..346.663 rows=1220793 loops=1)
>  Index Cond: (title ~~* 'x264'::text)
>  Planning time: 1.168 ms
>  Execution time: 1755.944 ms
>
>
> On Sun, Nov 13, 2016 at 10:04 PM, Oleg Bartunov 
> wrote:
> >
> >
> > On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis  >
> > wrote:
> >>
> >> I have a simple table with Trigram index,
> >>
> >> create table mytable(hash char(40), title text);
> >> create index title_trgm_idx on mytable using gin(title gin_trgm_ops);
> >>
> >> When I run a query with 10m rows, it uses the Trigram index, but takes
> >> 3s to execute, very slow.
> >> (I have 80m rows, but only inserted 10m for testing purpose)
> >>
> >> test=# select count(*) from mytable;
> >>   count
> >> --
> >>  13971887
> >> (1 row)
> >>
> >> test=# explain analyze select * from mytable where title ilike 'x264';
> >>   QUERY PLAN
> >>
> >> 
> --
> >>  Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
> >> width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
> >>Recheck Cond: (title ~~* 'x264'::text)
> >>Rows Removed by Index Recheck: 11402855
> >>Heap Blocks: exact=39557 lossy=158010
> >>->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
> >> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
> >>  Index Cond: (title ~~* 'x264'::text)
> >>  Planning time: 0.611 ms
> >>  Execution time: 2937.729 ms
> >> (8 rows)
> >>
> >> Any ideas to speed things up?
> >
> >
> >Rows Removed by Index Recheck: 11402855
> >Heap Blocks: exact=39557 lossy=158010
> >
> > You need to increase work_mem
> >>
> >>
> >> --
> >> Best Regards,
> >> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> >> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
> >>
> >>
> >> --
> >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> >> To make changes to your subscription:
> >> http://www.postgresql.org/mailpref/pgsql-general
> >
> >
>
>
>
> --
> Best Regards,
> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>


Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Tom Lane
Oleg Bartunov  writes:
> On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewis 
>> It takes 500ms with 10m rows, could it be faster?

> sure.  Recheck with a function call is pretty expensive, so I'd not recommend
> creating a functional index; just create a separate column of type tsvector
> (materialize to_tsvector) and create a gin index on it.  You will be surprised.

I doubt it'll help that much --- more than half the time is going into the
bitmap indexscan, and with over 1m candidate matches, there's no way
that's going to be super cheap.

I wonder whether a gist index would be better here, since it would support
a plain indexscan which should require scanning much less of the index
given the small LIMIT.

(Materializing the tsvector would probably help for gist, too, by reducing
the cost of lossy-index rechecks.)

BTW, it still looks like the performance is being significantly hurt by
inadequate work_mem.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Oleg Bartunov
On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewis 
wrote:

> Sigh, didn't notice that. Thanks for the heads up.
>
> It takes 500ms with 10m rows, could it be faster?
>

sure.  Recheck with a function call is pretty expensive, so I'd not recommend
creating a functional index; just create a separate column of type tsvector
(materialize to_tsvector) and create a gin index on it.  You will be surprised.
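
a sketch of that materialization for the mytable from this thread (the trigger
keeps the column current on writes):

ALTER TABLE mytable ADD COLUMN title_tsv tsvector;
UPDATE mytable SET title_tsv = to_tsvector('english', title);
CREATE INDEX title_tsv_idx ON mytable USING gin (title_tsv);

CREATE TRIGGER mytable_tsv_update
  BEFORE INSERT OR UPDATE ON mytable
  FOR EACH ROW
  EXECUTE PROCEDURE tsvector_update_trigger(title_tsv, 'pg_catalog.english', title);

-- queries then hit the column (and its index) directly, no function recheck
SELECT * FROM mytable WHERE title_tsv @@ 'x264'::tsquery LIMIT 1000 OFFSET 10;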


> I've increased work_mem to 256MB
>
> test=# explain analyze select * from mytable where
> to_tsvector('english', title) @@ 'x264'::tsquery limit 1000 offset 10;
>   QUERY PLAN
> 
> 
> ---
>  Limit  (cost=684.06..2949.42 rows=1000 width=83) (actual
> time=348.506..536.483 rows=1000 loops=1)
>->  Bitmap Heap Scan on mytable  (cost=661.41..158917.22 rows=69859
> width=83) (actual time=345.354..536.199 rows=1010 loops=1)
>  Recheck Cond: (to_tsvector('english'::regconfig, title) @@
> '''x264'''::tsquery)
>  Rows Removed by Index Recheck: 12242
>  Heap Blocks: exact=20 lossy=186
>  ->  Bitmap Index Scan on name_fts  (cost=0.00..643.95
> rows=69859 width=0) (actual time=333.703..333.703 rows=1044673
> loops=1)
>Index Cond: (to_tsvector('english'::regconfig, title)
> @@ '''x264'''::tsquery)
>  Planning time: 0.144 ms
>  Execution time: 537.212 ms
> (9 rows)
>
> On Sun, Nov 13, 2016 at 10:33 PM, Julien Rouhaud
>  wrote:
> > On 13/11/2016 15:26, Aaron Lewis wrote:
> >> Hi Oleg,
> >>
> >> Can you elaborate on the title column? I don't get it.
> >>
> >
>  create table mytable(hash char(40), title varchar(500));
>  create index name_fts on mytable using gin(to_tsvector('english',
>  'title'));
> >
> > You created an index on the text 'title', not on the title column, so
> > the index is useless.
> >
> > Drop the existing index and create this one instead:
> >
> > create index name_fts on mytable using gin(to_tsvector('english',
> title));
> >
> >> On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov 
> wrote:
> >>>
> >>>
> >>> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis <
> the.warl0ck.1...@gmail.com>
> >>> wrote:
> 
>  I have a simple table, and a gin index,
> 
>  create table mytable(hash char(40), title varchar(500));
>  create index name_fts on mytable using gin(to_tsvector('english',
>  'title'));
> >>>
> >>>
> >>>
> >>> ^
> >>>
> 
>  create unique index md5_uniq_idx on mytable(hash);
> 
>  When I execute a query with tsquery, the GIN index was not in use:
> 
>  test=# explain analyze select * from mytable where
>  to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
>   QUERY PLAN
> 
>  
> 
>   Limit  (cost=0.00..277.35 rows=10 width=83) (actual
>  time=0.111..75.549 rows=10 loops=1)
> ->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
>  (actual time=0.110..75.546 rows=10 loops=1)
>   Filter: (to_tsvector('english'::regconfig, (title)::text) @@
>  '''abc'' | ''def'''::tsquery)
>   Rows Removed by Filter: 10221
>   Planning time: 0.176 ms
>   Execution time: 75.564 ms
>  (6 rows)
> 
>  Any ideas?
> 
> >
> > --
> > Julien Rouhaud
> > http://dalibo.com - http://dalibo.org
>
>
>
> --
> Best Regards,
> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread John R Pierce

On 11/13/2016 8:51 AM, aws backup wrote:
I assume that the postgres database password is the one which is shown 
in the DaVinci connect database window.


user: postgres
password: DaVinci


there are no database passwords; that's the password for the postgres 
database role/user.




But if I work in the Terminal and change the user "sudo su - postgres" 
I have to use my admin password.


sudo expects the password of the user running sudo.  this has nothing to 
do with postgres.



That is optional and I sort of doubt the application using one.


You think DaVinci is not using any password … ?


he meant that he doubts DaVinci is using a .pgpass file. The .pgpass 
file is another way to pass your passwords to postgres, but it's NOT 
where postgres stores its passwords; those are stored as hashes in the 
pg_catalog.






--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread aws backup
Hi Adrian,

thank you for the explanation.
I will look into your links.

I am doing this because I want to make backups of the database, ideally 
automatic backups every night.
The Blackmagic support says they can't help me with this. The Backup and 
Restore button in DaVinci does not work.
Every database related question I ask the Blackmagic support stays unanswered. 
For example: How can I restart the SQL server? Oh sorry we can't help you with 
this … ?

Thank you.
Robert



> On 13 Nov 2016, at 18:09, Adrian Klaver  wrote:
> 
> On 11/13/2016 05:51 AM, aws backup wrote:
>> Hi,
>> 
>> now I have another problem. Sorry I am an absolute beginner.
>> When I restore the dumpall backup with
>> 
>> "psql -f infile postgres"
>> 
>> I get lot of errors > "already exists" and the database is not restored
>> to the point of the backup.
> 
> Yes, because it seems you are restoring back over existing databases in the 
> cluster. An explanation of terms is in order. When Postgres is installed it 
> init(ialize)s a cluster of databases, template0, template1(the one you had 
> issues with) and postgres. The user then can create additional databases to 
> serve their needs. When you do pg_dumpall you are doing a plain text dump of 
> the entire cluster, which may or may not be what you want. When you do the 
> psql -f above you are trying to restore all the databases in the cluster to 
> the cluster, again something you may or may not want. For finer grained 
> control over the dump/restore cycle you probably want pg_dump.
> 
>> I mean after I made the backup I changed something and I expected that
>> this change is undone after I restore the backup which did not happen.
>> Looks for me like nothing was restored … ?
> 
> You probably should spend some time here:
> 
> https://www.postgresql.org/docs/9.5/static/app-pg-dumpall.html 
> 
> 
> https://www.postgresql.org/docs/9.5/static/app-pgdump.html 
> 
> 
> In the meantime do you really want to start over with the database(s)?
> 
> In other words what is the goal of your dump/restore process?
> 
>> 
>> Thank you for your help.
>> 
>> Regards,
>> Robert
>> 
>> 
>>> On 13 Nov 2016, at 00:37, Scott Marlowe >> 
>>> >> wrote:
>>> 
>>> On Sat, Nov 12, 2016 at 2:31 PM, Adrian Klaver
>>>  
>>> >> 
>>> wrote:
 On 11/12/2016 01:20 PM, aws backup wrote:
> 
> Hi,
> 
> I try to make pg_dumpall backups from a PostgreSQL 9.5 server which is
> part of the DaVinci Resolve 12.5.3 App on a Mac OS X 10.11.6 system.
> 
> Unfortunately I get following failure message:
> 
> pg_dumpall: could not connect to database "template1": FATAL: password
> authentication failed for user "postgres"
> 
> Maybe you can help me to solve this problem.
 
 
 Two choices:
 
 1) Determine what the password is for the postgres user and provide
 it when
 you connect.
 
 2) If you have access to the pg_hba.conf file create a access line
 that uses
 trust as the auth method for user postgres connect that way.
 
 https://www.postgresql.org/docs/9.5/static/auth-methods.html#AUTH-TRUST
>>> 
>>> OR you could use the -l switch and specify another db.
>>> 
>>> pg_dumpall -l mydbnamehere
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com 


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread Adrian Klaver

On 11/13/2016 05:51 AM, aws backup wrote:

Hi,

now I have another problem. Sorry I am an absolute beginner.
When I restore the dumpall backup with

"psql -f infile postgres"

I get a lot of errors > "already exists" and the database is not restored
to the point of the backup.


Yes, because it seems you are restoring back over existing databases in 
the cluster. An explanation of terms is in order. When Postgres is 
installed it init(ialize)s a cluster of databases, template0, 
template1(the one you had issues with) and postgres. The user then can 
create additional databases to serve their needs. When you do pg_dumpall 
you are doing a plain text dump of the entire cluster, which may or may 
not be what you want. When you do the psql -f above you are trying to 
restore all the databases in the cluster to the cluster, again something 
you may or may not want. For finer grained control over the dump/restore 
cycle you probably want pg_dump.



I mean after I made the backup I changed something and I expected that
this change is undone after I restore the backup which did not happen.
Looks for me like nothing was restored … ?


You probably should spend some time here:

https://www.postgresql.org/docs/9.5/static/app-pg-dumpall.html

https://www.postgresql.org/docs/9.5/static/app-pgdump.html

In the meantime do you really want to start over with the database(s)?

In other words what is the goal of your dump/restore process?



Thank you for your help.

Regards,
Robert



On 13 Nov 2016, at 00:37, Scott Marlowe > wrote:

On Sat, Nov 12, 2016 at 2:31 PM, Adrian Klaver
> wrote:

On 11/12/2016 01:20 PM, aws backup wrote:


Hi,

I try to make pg_dumpall backups from a PostgreSQL 9.5 server which is
part of the DaVinci Resolve 12.5.3 App on a Mac OS X 10.11.6 system.

Unfortunately I get following failure message:

pg_dumpall: could not connect to database "template1": FATAL: password
authentication failed for user "postgres"

Maybe you can help me to solve this problem.



Two choices:

1) Determine what the password is for the postgres user and provide
it when
you connect.

2) If you have access to the pg_hba.conf file, create an access line that uses
trust as the auth method for user postgres and connect that way.

https://www.postgresql.org/docs/9.5/static/auth-methods.html#AUTH-TRUST


OR you could use the -l switch and specify another db.

pg_dumpall -l mydbnamehere





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Aaron Lewis
Sigh, didn't notice that. Thanks for the heads up.

It takes 500ms with 10m rows, could it be faster?
I've increased work_mem to 256MB

test=# explain analyze select * from mytable where
to_tsvector('english', title) @@ 'x264'::tsquery limit 1000 offset 10;
  QUERY PLAN
---
 Limit  (cost=684.06..2949.42 rows=1000 width=83) (actual
time=348.506..536.483 rows=1000 loops=1)
   ->  Bitmap Heap Scan on mytable  (cost=661.41..158917.22 rows=69859
width=83) (actual time=345.354..536.199 rows=1010 loops=1)
 Recheck Cond: (to_tsvector('english'::regconfig, title) @@
'''x264'''::tsquery)
 Rows Removed by Index Recheck: 12242
 Heap Blocks: exact=20 lossy=186
 ->  Bitmap Index Scan on name_fts  (cost=0.00..643.95
rows=69859 width=0) (actual time=333.703..333.703 rows=1044673
loops=1)
   Index Cond: (to_tsvector('english'::regconfig, title)
@@ '''x264'''::tsquery)
 Planning time: 0.144 ms
 Execution time: 537.212 ms
(9 rows)

On Sun, Nov 13, 2016 at 10:33 PM, Julien Rouhaud
 wrote:
> On 13/11/2016 15:26, Aaron Lewis wrote:
>> Hi Oleg,
>>
>> Can you elaborate on the title column? I don't get it.
>>
>
 create table mytable(hash char(40), title varchar(500));
 create index name_fts on mytable using gin(to_tsvector('english',
 'title'));
>
> You created an index on the text 'title', not on the title column, so
> the index is useless.
>
> Drop the existing index and create this one instead:
>
> create index name_fts on mytable using gin(to_tsvector('english', title));
>
>> On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov  wrote:
>>>
>>>
>>> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis 
>>> wrote:

 I have a simple table, and a gin index,

 create table mytable(hash char(40), title varchar(500));
 create index name_fts on mytable using gin(to_tsvector('english',
 'title'));
>>>
>>>
>>>
>>> ^
>>>

 create unique index md5_uniq_idx on mytable(hash);

 When I execute a query with tsquery, the GIN index was not in use:

 test=# explain analyze select * from mytable where
 to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
  QUERY PLAN

 
  Limit  (cost=0.00..277.35 rows=10 width=83) (actual
 time=0.111..75.549 rows=10 loops=1)
->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
 (actual time=0.110..75.546 rows=10 loops=1)
  Filter: (to_tsvector('english'::regconfig, (title)::text) @@
 '''abc'' | ''def'''::tsquery)
  Rows Removed by Filter: 10221
  Planning time: 0.176 ms
  Execution time: 75.564 ms
 (6 rows)

 Any ideas?

>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org



-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33




Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Julien Rouhaud
On 13/11/2016 15:26, Aaron Lewis wrote:
> Hi Oleg,
> 
> Can you elaborate on the title column? I don't get it.
> 

>>> create table mytable(hash char(40), title varchar(500));
>>> create index name_fts on mytable using gin(to_tsvector('english',
>>> 'title'));

You created an index on the text 'title', not on the title column, so
the index is useless.

Drop the existing index and create this one instead:

create index name_fts on mytable using gin(to_tsvector('english', title));
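
(Spelled out as a full sequence, with the DROP that the advice above implies:)

DROP INDEX name_fts;
CREATE INDEX name_fts ON mytable USING gin(to_tsvector('english', title));
-- re-running the EXPLAIN should then show a Bitmap Index Scan on name_fts
-- instead of a Seq Scan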

> On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov  wrote:
>>
>>
>> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis wrote:
>>>
>>> I have a simple table, and a gin index,
>>>
>>> create table mytable(hash char(40), title varchar(500));
>>> create index name_fts on mytable using gin(to_tsvector('english',
>>> 'title'));
>>
>>
>>
>> ^
>>
>>>
>>> create unique index md5_uniq_idx on mytable(hash);
>>>
>>> When I execute a query with tsquery, the GIN index was not in use:
>>>
>>> test=# explain analyze select * from mytable where
>>> to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
>>>  QUERY PLAN
>>>
>>> 
>>>  Limit  (cost=0.00..277.35 rows=10 width=83) (actual
>>> time=0.111..75.549 rows=10 loops=1)
>>>->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
>>> (actual time=0.110..75.546 rows=10 loops=1)
>>>  Filter: (to_tsvector('english'::regconfig, (title)::text) @@
>>> '''abc'' | ''def'''::tsquery)
>>>  Rows Removed by Filter: 10221
>>>  Planning time: 0.176 ms
>>>  Execution time: 75.564 ms
>>> (6 rows)
>>>
>>> Any ideas?
>>> 

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Aaron Lewis
Hi Oleg,

Can you elaborate on the title column? I don't get it.

On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov  wrote:
>
>
> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis wrote:
>>
>> I have a simple table, and a gin index,
>>
>> create table mytable(hash char(40), title varchar(500));
>> create index name_fts on mytable using gin(to_tsvector('english',
>> 'title'));
>
>
>
> ^
>
>>
>> create unique index md5_uniq_idx on mytable(hash);
>>
>> When I execute a query with tsquery, the GIN index was not in use:
>>
>> test=# explain analyze select * from mytable where
>> to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
>>  QUERY PLAN
>>
>> 
>>  Limit  (cost=0.00..277.35 rows=10 width=83) (actual
>> time=0.111..75.549 rows=10 loops=1)
>>->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
>> (actual time=0.110..75.546 rows=10 loops=1)
>>  Filter: (to_tsvector('english'::regconfig, (title)::text) @@
>> '''abc'' | ''def'''::tsquery)
>>  Rows Removed by Filter: 10221
>>  Planning time: 0.176 ms
>>  Execution time: 75.564 ms
>> (6 rows)
>>
>> Any ideas?
>>
>>
>> --
>> Best Regards,
>> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
>> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>>
>>
>
>



-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33




Re: [GENERAL] Trigram is slow when 10m rows

2016-11-13 Thread Aaron Lewis
Thanks Oleg.

I've increased work_mem to 128MB, and now the query drops to 1.7s:
faster, but still not good enough.

Is there anything else I can do about it?

test=# explain analyze select * from mytable where title ilike 'x264';
  QUERY PLAN
--
 Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
width=83) (actual time=1754.656..1754.656 rows=0 loops=1)
   Recheck Cond: (title ~~* 'x264'::text)
   Rows Removed by Index Recheck: 1220793
   Heap Blocks: exact=197567
   ->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
rows=1380 width=0) (actual time=346.663..346.663 rows=1220793 loops=1)
 Index Cond: (title ~~* 'x264'::text)
 Planning time: 1.168 ms
 Execution time: 1755.944 ms
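
(One more thing worth checking, as an observation about ILIKE semantics rather
than advice from the thread: the pattern 'x264' contains no wildcards, so it only
matches rows whose entire title is exactly "x264", which is why the recheck
removes every candidate row. A substring search would be written as:)

select * from mytable where title ilike '%x264%';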


On Sun, Nov 13, 2016 at 10:04 PM, Oleg Bartunov  wrote:
>
>
>> On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis wrote:
>>
>> I have a simple table with Trigram index,
>>
>> create table mytable(hash char(40), title text);
>> create index title_trgm_idx on mytable using gin(title gin_trgm_ops);
>>
>> When I run a query with 10m rows, it uses the Trigram index, but takes
>> 3s to execute, very slow.
>> (I have 80m rows, but only inserted 10m for testing purpose)
>>
>> test=# select count(*) from mytable;
>>   count
>> --
>>  13971887
>> (1 row)
>>
>> test=# explain analyze select * from mytable where title ilike 'x264';
>>   QUERY PLAN
>>
>> --
>>  Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
>> width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
>>Recheck Cond: (title ~~* 'x264'::text)
>>Rows Removed by Index Recheck: 11402855
>>Heap Blocks: exact=39557 lossy=158010
>>->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
>> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
>>  Index Cond: (title ~~* 'x264'::text)
>>  Planning time: 0.611 ms
>>  Execution time: 2937.729 ms
>> (8 rows)
>>
>> Any ideas to speed things up?
>
>
>Rows Removed by Index Recheck: 11402855
>Heap Blocks: exact=39557 lossy=158010
>
> You need to increase work_mem
>>
>>
>> --
>> Best Regards,
>> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
>> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>>
>>
>
>



-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33




Re: [GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Oleg Bartunov
On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis wrote:

> I have a simple table, and a gin index,
>
> create table mytable(hash char(40), title varchar(500));
> create index name_fts on mytable using gin(to_tsvector('english',
> 'title'));
>


^


> create unique index md5_uniq_idx on mytable(hash);
>
> When I execute a query with tsquery, the GIN index was not in use:
>
> test=# explain analyze select * from mytable where
> to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
>  QUERY PLAN
> 
> 
>  Limit  (cost=0.00..277.35 rows=10 width=83) (actual
> time=0.111..75.549 rows=10 loops=1)
>->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
> (actual time=0.110..75.546 rows=10 loops=1)
>  Filter: (to_tsvector('english'::regconfig, (title)::text) @@
> '''abc'' | ''def'''::tsquery)
>  Rows Removed by Filter: 10221
>  Planning time: 0.176 ms
>  Execution time: 75.564 ms
> (6 rows)
>
> Any ideas?
>
>
> --
> Best Regards,
> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>
>
>


Re: [GENERAL] Trigram is slow when 10m rows

2016-11-13 Thread Oleg Bartunov
On Sun, Nov 13, 2016 at 2:54 PM, Aaron Lewis wrote:

> I have a simple table with Trigram index,
>
> create table mytable(hash char(40), title text);
> create index title_trgm_idx on mytable using gin(title gin_trgm_ops);
>
> When I run a query with 10m rows, it uses the Trigram index, but takes
> 3s to execute, very slow.
> (I have 80m rows, but only inserted 10m for testing purpose)
>
> test=# select count(*) from mytable;
>   count
> --
>  13971887
> (1 row)
>
> test=# explain analyze select * from mytable where title ilike 'x264';
>   QUERY PLAN
> 
> --
>  Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
> width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
>Recheck Cond: (title ~~* 'x264'::text)
>Rows Removed by Index Recheck: 11402855
>Heap Blocks: exact=39557 lossy=158010
>->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
> rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
>  Index Cond: (title ~~* 'x264'::text)
>  Planning time: 0.611 ms
>  Execution time: 2937.729 ms
> (8 rows)
>
> Any ideas to speed things up?
>

   Rows Removed by Index Recheck: 11402855
   Heap Blocks: exact=39557 lossy=158010

You need to increase work_mem
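
(A minimal sketch of trying that for the current session only; the value is just
an example, and the setting can also be raised permanently in postgresql.conf:)

SET work_mem = '256MB';
EXPLAIN ANALYZE SELECT * FROM mytable WHERE title ILIKE 'x264';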

>
> --
> Best Regards,
> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
> Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33
>
>
>


Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread aws backup
Hi,

Now I have another problem. Sorry, I am an absolute beginner.
When I restore the dumpall backup with

"psql -f infile postgres"

I get a lot of errors saying "already exists", and the database is not restored
to the point of the backup.
I mean that after I made the backup I changed something, and I expected that
change to be undone after restoring the backup, which did not happen. It looks
to me like nothing was restored … ?

Thank you for your help.

Regards,
Robert


> On 13 Nov 2016, at 00:37, Scott Marlowe wrote:
> 
> On Sat, Nov 12, 2016 at 2:31 PM, Adrian Klaver wrote:
>> On 11/12/2016 01:20 PM, aws backup wrote:
>>> 
>>> Hi,
>>> 
>>> I try to make pg_dumpall backups from a PostgreSQL 9.5 server which is
>>> part of the DaVinci Resolve 12.5.3 App on a Mac OS X 10.11.6 system.
>>> 
>>> Unfortunately I get following failure message:
>>> 
>>> pg_dumpall: could not connect to database "template1": FATAL: password
>>> authentication failed for user "postgres"
>>> 
>>> Maybe you can help me to solve this problem.
>> 
>> 
>> Two choices:
>> 
>> 1) Determine what the password is for the postgres user and provide it when
>> you connect.
>> 
>> 2) If you have access to the pg_hba.conf file, create an access line that uses
>> trust as the auth method for the user postgres and connect that way.
>> 
>> https://www.postgresql.org/docs/9.5/static/auth-methods.html#AUTH-TRUST
> 
> OR you could use the -l switch and specify another db.
> 
> pg_dumpall -l mydbnamehere



Re: [GENERAL] pg_dumpall: could not connect to database "template1": FATAL:

2016-11-13 Thread aws backup
Hi Adrian,

thank you for the answer.
There is one password for the postgres database and one for the postgres user.
Somehow neither of them works. Is there a way to look up the passwords? I saw in
the documentation that there is a .pgpass file, but I can't find it.
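
(As background, not something stated in the thread: passwords stored by the
server are kept hashed, so they cannot be read back. The .pgpass file is not
created automatically either; it is a file you create yourself, normally
~/.pgpass with 0600 permissions and one connection per line. The entry below is
only a placeholder:)

# hostname:port:database:username:password
localhost:5432:*:postgres:mysecretpassword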

I changed the auth method to trust for all users. This worked for now.
Thank you.
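
(For reference, a sketch of what such pg_hba.conf entries can look like; the
actual lines in this installation are not known, so treat these as placeholders:)

# TYPE  DATABASE  USER  ADDRESS        METHOD
local   all       all                  trust
host    all       all   127.0.0.1/32   trust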

Best Regards,
Robert


> On 12 Nov 2016, at 23:31, Adrian Klaver  wrote:
> 
> On 11/12/2016 01:20 PM, aws backup wrote:
>> Hi,
>> 
>> I try to make pg_dumpall backups from a PostgreSQL 9.5 server which is part 
>> of the DaVinci Resolve 12.5.3 App on a Mac OS X 10.11.6 system.
>> 
>> Unfortunately I get following failure message:
>> 
>> pg_dumpall: could not connect to database "template1": FATAL: password 
>> authentication failed for user "postgres"
>> 
>> Maybe you can help me to solve this problem.
> 
> Two choices:
> 
> 1) Determine what the password is for the postgres user and provide it when 
> you connect.
> 
> 2) If you have access to the pg_hba.conf file, create an access line that uses 
> trust as the auth method for the user postgres and connect that way.
> 
> https://www.postgresql.org/docs/9.5/static/auth-methods.html#AUTH-TRUST
> 
>> 
>> Thank you.
>> 
>> Best Regards,
>> Robert
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com
> 
> 





[GENERAL] Trigram is slow when 10m rows

2016-11-13 Thread Aaron Lewis
I have a simple table with Trigram index,

create table mytable(hash char(40), title text);
create index title_trgm_idx on mytable using gin(title gin_trgm_ops);

When I run a query with 10m rows, it uses the Trigram index, but takes
3s to execute, very slow.
(I have 80m rows, but only inserted 10m for testing purpose)

test=# select count(*) from mytable;
  count
--
 13971887
(1 row)

test=# explain analyze select * from mytable where title ilike 'x264';
  QUERY PLAN
--
 Bitmap Heap Scan on mytable  (cost=462.69..5639.67 rows=1380
width=83) (actual time=2937.308..2937.308 rows=0 loops=1)
   Recheck Cond: (title ~~* 'x264'::text)
   Rows Removed by Index Recheck: 11402855
   Heap Blocks: exact=39557 lossy=158010
   ->  Bitmap Index Scan on title_trgm_idx  (cost=0.00..462.35
rows=1380 width=0) (actual time=342.440..342.440 rows=1220793 loops=1)
 Index Cond: (title ~~* 'x264'::text)
 Planning time: 0.611 ms
 Execution time: 2937.729 ms
(8 rows)

Any ideas to speed things up?

-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33




[GENERAL] Why is this query not using GIN index?

2016-11-13 Thread Aaron Lewis
I have a simple table, and a gin index,

create table mytable(hash char(40), title varchar(500));
create index name_fts on mytable using gin(to_tsvector('english', 'title'));
create unique index md5_uniq_idx on mytable(hash);

When I execute a query with tsquery, the GIN index was not in use:

test=# explain analyze select * from mytable where
to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10;
 QUERY PLAN

 Limit  (cost=0.00..277.35 rows=10 width=83) (actual
time=0.111..75.549 rows=10 loops=1)
   ->  Seq Scan on mytable  (cost=0.00..381187.45 rows=13744 width=83)
(actual time=0.110..75.546 rows=10 loops=1)
 Filter: (to_tsvector('english'::regconfig, (title)::text) @@
'''abc'' | ''def'''::tsquery)
 Rows Removed by Filter: 10221
 Planning time: 0.176 ms
 Execution time: 75.564 ms
(6 rows)

Any ideas?


-- 
Best Regards,
Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/
Finger Print:   9F67 391B B770 8FF6 99DC  D92D 87F6 2602 1371 4D33

