Re: [GENERAL] PostgreSQL Database performance

2016-09-06 Thread Ilya Kazakevich
Hi.

 

"shared_buffers" should be set to 30-40% of your system RAM.

This param controls how much memory database may use.

 

Please see
https://www.postgresql.org/docs/9.1/static/runtime-config-resource.html
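
For example, a minimal sketch (the 2GB value is only an illustration; ALTER SYSTEM needs 9.4+, on older versions edit postgresql.conf instead):

ALTER SYSTEM SET shared_buffers = '2GB';  -- takes effect only after a server restart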

 

 

 

Ilya Kazakevich

 

JetBrains

http://www.jetbrains.com

The Drive to Develop

 

From: pgsql-general-ow...@postgresql.org On Behalf Of Pradeep
Sent: Saturday, September 03, 2016 6:39 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] PostgreSQL Database performance 



Re: [GENERAL] Understanding Postgres Memory Usage

2016-08-25 Thread Ilya Kazakevich
$ free -h  # Before the query
             total       used       free     shared    buffers     cached
Mem:          7.8G       5.2G       2.6G       212M        90M       4.9G
-/+ buffers/cache:        248M       7.6G
Swap:           0B         0B         0B

$ free -h  # After the query
             total       used       free     shared    buffers     cached
Mem:          7.8G       5.3G       2.5G       212M        90M       4.9G
-/+ buffers/cache:        312M       7.5G
Swap:           0B         0B         0B

 

[I.K >> ] Are you asking about the buffers/cache increase? AFAIK that memory is used by the kernel as a cache in front of block devices (an HDD, for example).

Postgres does not use this memory directly: it simply reads data from the block device, and the kernel caches it. A process cannot be OOM-killed because of it.

 

 

You should configure your Postgres to NEVER exceed the available RAM. You may use tools like pgtune (http://pgtune.leopard.in.ua/) or calculate it manually.

I do not remember the exact formula, but it is something like "work_mem * max_connections + shared_buffers", and the result should stay around 80% of your machine's RAM (minus the RAM used by other processes and the kernel).

This will save you from the OOM killer.
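
For illustration only, on an 8GB server the arithmetic might look like this (all values are assumptions, not recommendations):

  shared_buffers                       2GB
+ work_mem * max_connections   40MB * 100 = 4GB
= total                               ~6GB, roughly 75% of 8GB RAM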

 

If you still face a performance bottleneck after that, you can diagnose it using tools like "log_min_duration_statement", "track_io_timing", and system-provided tools.
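
A minimal sketch of turning those on (the 500ms threshold is arbitrary):

ALTER SYSTEM SET log_min_duration_statement = '500ms';  -- log every statement slower than 0.5s
ALTER SYSTEM SET track_io_timing = on;                  -- record block read/write timings
SELECT pg_reload_conf();                                -- both settings take effect on reload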

 

 

 

 

Ilya Kazakevich

 

JetBrains

http://www.jetbrains.com

The Drive to Develop

 

 



Re: [GENERAL] pg_hba.conf : bad entry for ADDRESS

2016-08-25 Thread Ilya Kazakevich

% psql --host=127.0.0.1/32 --dbname=mattermost --username=mmuser
psql: could not translate host name "127.0.0.1/32" to address: Name or service not known
% psql --host=127.0.0.1/24 --dbname=mattermost --username=mmuser
psql: could not translate host name "127.0.0.1/24" to address: Name or service not known


[I.K >> ] "127.0.0.1/32" is a network in CIDR notation that contains only the single address "127.0.0.1", while "127.0.0.1" by itself is an address.

You specify a network in pg_hba.conf, but you pass a plain address as the --host argument to psql.
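
Schematically (reusing the database and user names from your example; the md5 auth method is just an illustration):

host    mattermost    mmuser    127.0.0.1/32    md5    # pg_hba.conf: ADDRESS is a network
% psql --host=127.0.0.1 --dbname=mattermost --username=mmuser    # psql: host is an address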



Re: [GENERAL] pg_hba.conf : bad entry for ADDRESS

2016-08-25 Thread Ilya Kazakevich
>How can I verify ?
Can you connect as postgres (superuser)?
If yes, connect and type "show hba_file;"
If no, try adding "local all postgres peer" or even "local all postgres trust" 
to this file and restart postgres. Check again.






Re: [GENERAL] pg_hba.conf : bad entry for ADDRESS

2016-08-25 Thread Ilya Kazakevich
>I entered this line in pg_hab.conf:
Are you sure your file name is correct and that it is really the file used by Postgres?


Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] question on error during COPY FROM

2016-08-23 Thread Ilya Kazakevich
Hi.

>does that mean that I should always execute a VACUUM to recover the
>wasted space when an error is triggered or will the auto-vacuum mechanism
>do the job by itself ?


If you have autovacuum enabled, it will clean up the dead rows. However, the space will not be returned to the filesystem; it will be reused by the database.
You may run VACUUM FULL manually to return it to the filesystem.

https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html
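
A sketch (the table name is hypothetical; note that VACUUM FULL rewrites the table under an exclusive lock, so run it in a maintenance window):

VACUUM VERBOSE mytable;   -- marks dead rows reusable inside the table's files
VACUUM FULL mytable;      -- rewrites the table and returns the space to the filesystem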


Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Ilya Kazakevich
>Thank you. So if that is the reason changing the fillfactor parameter should
>help?

Fillfactor is not about rebalancing, but about page splits. If you have many insertions, you may decrease the fillfactor to reduce page splits; I am not sure it will help in your case, but it is worth a try.
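
For example (the table, column, and the 70% value are all illustrative):

CREATE INDEX mytable_id_idx ON mytable (id) WITH (fillfactor = 70);
-- leaves ~30% of each leaf page free for future insertions, delaying page splits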
A better approach is to create the index _after_ the insertions, but that is not always possible.


Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] Sequential vs. random values - number of pages in B-tree

2016-08-18 Thread Ilya Kazakevich
Hi, 

>What's the reason that postgres needs more index pages to store random
>data
>than sequential ones?

I assume it is because a B-tree is a self-balancing tree, so pages must be split and entries moved whenever an insertion lands on a page with no free space left.
Sequential keys always go to the rightmost leaf page, while random keys land on arbitrary pages, causing far more page splits and data movement.
https://en.wikipedia.org/wiki/B-tree#Insertions_and_deletions
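
You can observe the effect with a quick experiment like this (a sketch; exact sizes will vary):

CREATE TABLE seq_t (id int);
CREATE TABLE rnd_t (id int);
CREATE INDEX ON seq_t (id);   -- default name: seq_t_id_idx
CREATE INDEX ON rnd_t (id);   -- default name: rnd_t_id_idx
INSERT INTO seq_t SELECT g FROM generate_series(1, 1000000) g;                        -- ordered keys
INSERT INTO rnd_t SELECT (random() * 1000000)::int FROM generate_series(1, 1000000);  -- random keys
SELECT pg_size_pretty(pg_relation_size('seq_t_id_idx')) AS seq_idx_size,
       pg_size_pretty(pg_relation_size('rnd_t_id_idx')) AS rnd_idx_size;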



Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] Permissions pg_dump / import

2016-08-17 Thread Ilya Kazakevich
>> Owned by: public.accounts.id 

This is not the owner, but the table column this sequence depends on. See
http://stackoverflow.com/questions/6941043/get-table-and-column-owning-a-sequence

 

Use the query provided on SO to get the real owner.
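
Something along these lines shows the owning role, which is what the "must be owner" error cares about (a sketch; the SO answers have variants):

SELECT c.relname AS sequence_name, r.rolname AS owner
FROM pg_class c
JOIN pg_roles r ON r.oid = c.relowner
WHERE c.relkind = 'S'            -- sequences only
  AND c.relname = 'accounts_id_seq';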

 

Ilya Kazakevich

 

JetBrains

http://www.jetbrains.com

The Drive to Develop

 

From: pgsql-general-ow...@postgresql.org On Behalf Of Patrick B
Sent: Thursday, August 18, 2016 1:17 AM
To: pgsql-general
Subject: [GENERAL] Permissions pg_dump / import

 

Hi guys,

 

I'm running a pg_dump and then importing the dump into a test server. I'm using 
PostgreSQL 9.5.

 

pg_dump:

pg_dump --dbname=prod1 --host=servername --port=5432 --username=user1 -v -f test1_NEW.sql

Steps into the new database (test1):

CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO user1;
psql -h servername -U master --port=5432 --dbname=test1 -f test1_NEW.sql


I get lots of errors like:

psql:test1_NEW.sql:212: ERROR:  must be owner of relation accounts_id_seq

 

prod1=> \d+ accounts_id_seq
          Sequence "public.accounts_id_seq"
    Column     |  Type   |        Value        | Storage
---------------+---------+---------------------+---------
 sequence_name | name    | accounts_id_seq     | plain
 last_value    | bigint  | 33                  | plain
 start_value   | bigint  | 1                   | plain
 increment_by  | bigint  | 1                   | plain
 max_value     | bigint  | 9223372036854775807 | plain
 min_value     | bigint  | 1                   | plain
 cache_value   | bigint  | 1                   | plain
 log_cnt       | bigint  | 32                  | plain
 is_cycled     | boolean | f                   | plain
 is_called     | boolean | t                   | plain
Owned by: public.accounts.id

 

What do I have to do? Should I revoke the permissions on the prod1 database 
before performing the dump?

 

Cheers;

Patrick 



Re: [GENERAL] schema advice for event stream with tagging and filtering

2016-08-16 Thread Ilya Kazakevich
>>> An event is a row with a primary key along the lines of (colo, host,
>>> category) and an associated set of tags, where each tag has a type
>>> and a value
>>> (eg: {"color": "red", "owner": "fred", "status": "open"...}).
>>
>> What about a simple table with several columns and an hstore field for tags?

BTW, "pure SQL" approach here is to use separate tables: Tags(TagId, TagName) 
and TagValues(EventId,TagId,Value). 
But in this case it will be painful to filter events by tag values directly, so 
only separate denormalized OLAP table should be used in "pure SQL":)
PostgreSQL, however, supports key-value based hstore.
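
A sketch of that normalized layout (the types are assumptions):

CREATE TABLE Tags (
    TagId   serial PRIMARY KEY,
    TagName text   NOT NULL UNIQUE
);
CREATE TABLE TagValues (
    EventId bigint NOT NULL,
    TagId   int    NOT NULL REFERENCES Tags (TagId),
    Value   text,
    PRIMARY KEY (EventId, TagId)
);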

>> You may also normalize it (move hosts and categories to separate tables).
>
>Why? These form part of the primary key for the event...

Host and category could be part of the PK, but it may be better to have a "HostId" field
and a "Hosts(HostId, Host)" table than a text field repeating values like "www.foo.bar"
many times.

>What kind of index is recommended here? The kind of queries would be:
>
>- show me a list of tag types and the count of the number of events of that
>type
>
>- show me all events that have tag1=x, tag2=y and does not have tag3

Hstore supports GIN and GiST indexes
(https://www.postgresql.org/docs/9.1/static/textsearch-indexes.html). I'd start
with GIN.
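
For example (a sketch; the "events" table and its "tags" hstore column are assumptions):

CREATE EXTENSION IF NOT EXISTS hstore;
CREATE INDEX events_tags_gin ON events USING GIN (tags);
-- "has tag1=x and tag2=y, and does not have tag3":
SELECT * FROM events
WHERE tags @> 'tag1=>x, tag2=>y'::hstore   -- contains both pairs (can use the GIN index)
  AND NOT tags ? 'tag3';                   -- and lacks the tag3 key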

Do you need real-time data, or is, say, one-day-old data OK? In the latter case it is
better to load the data into a denormalized table to speed up queries, and to keep the
normalized (OLTP) table unindexed, because index updates are not free.


Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop





Re: [GENERAL] schema advice for event stream with tagging and filtering

2016-08-16 Thread Ilya Kazakevich
Hi,

>An event is a row with a primary key along the lines of (colo, host,
>category) and an associated set of tags, where each tag has a type and a value
>(eg: {"color": "red", "owner": "fred", "status": "open"...}).

What about a simple table with several columns and an hstore field for the tags?
You may also normalize it (move hosts and categories to separate tables).

Indexes should help you with fast filtering; alternatively, you may load data from this
table into a denormalized OLAP table once a day and build the index there to speed up
queries.
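
For example, a sketch guessed from your description (adjust the columns and the key to your real data):

CREATE TABLE events (
    colo     text NOT NULL,
    host     text NOT NULL,
    category text NOT NULL,
    tags     hstore,
    PRIMARY KEY (colo, host, category)
);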

Ilya Kazakevich

JetBrains
http://www.jetbrains.com
The Drive to Develop






Re: [GENERAL] Looking for software to 'enqueue' connections

2016-08-15 Thread Ilya Kazakevich
Hello.

 

From:

http://www.pgpool.net/

pgpool-II also has a limit on the maximum number of connections, but extra 
connections will be queued instead of returning an error immediately.
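
The relevant knob is roughly this in pgpool.conf (a sketch; check the name and semantics against your pgpool-II version):

num_init_children = 100   # max concurrent client sessions; clients beyond this wait in a queue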

 

But your configuration does not look optimal to me. Here are some things you may try:

1)  Get rid of the indexes: use this table as OLTP, then denormalize the data, load it
into an OLAP table, build the indexes there, and analyze it.

2)  Find the bottleneck using your OS tools (is it I/O or CPU?) and improve the
appropriate subsystem.

3)  Use several servers (a multimaster configuration like
https://wiki.postgresql.org/wiki/Bucardo).

 

Ilya Kazakevich

 

JetBrains

http://www.jetbrains.com

The Drive to Develop

 

From: pgsql-general-ow...@postgresql.org On Behalf Of Edmundo Robles
Sent: Monday, August 15, 2016 11:30 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Looking for software to 'enqueue' connections

 

Hi!  

I want to find software to 'enqueue' client connections to the database, so that if I
reach the max limit, queries are held in a queue until a connection is released.

I have many devices (100+) saving their state to the database each minute, but the table
is too large (more than 13,000,000 records) and has many indexes, so inserting one record
takes 3 or more minutes.

 

Then there comes a moment when the connection limit is reached :( and information is
lost.

I tried pgbouncer to 'enqueue' the connections but had no success; maybe I am missing
something...

 

By the way:

I use postgres 9.4 with max_connections = 100,

and pgbouncer with max_connections set to 100 and reserve_pool_size = 50.

 

I hope you can help me...

 

 thanks.