Re: [GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Pavel Stehule
2016-12-30 8:04 GMT+01:00 Guyren Howe :

>
> > On Dec 29, 2016, at 23:01 , Regina Obe  wrote:
> >
> >
> >> As an aside from my last question about my LYDB effort:
> >
> >> https://medium.com/@gisborne/love-your-database-lydb-
> 23c69f480a1d#.4jngp2rcb
> >
> >> I would like to find a book or other resource about SQL server-side
> programming (stored procedures etc) best practices in general and for
> Postgres in particular.
> >
> > Shameless plug
> >
> > Have you checked out our book?  The 2nd edition covered PostgreSQL 9.2 -
> 9.4.
> > http://shop.oreilly.com/product/0636920052715.do
> >
> > We are working on the 3rd edition which is a bit fatter (probably will
> be about 50 pages fatter when we are done) than the 2nd edition.
> >
> > http://shop.oreilly.com/product/0636920052715.do
> >
> > The 3rd focuses on PostgreSQL 9.5-9.6 (and is in prerelease sale at
> moment).  By the time of release, we'll probably have some PostgreSQL 10
> content in there as well.
> >
> > It covers fancy SQL constructs and data types (both ANSI ones and ones
> unique to PostgreSQL), general administration, and writing stored functions
> with SQL, PLPGSQL and PL/V8.
> >
> > In the 3rd we are adding an additional PL/V8 example how to build a
> window function in PL/V8 (aka PL/JavaScript)
>
> I’m sure the book is great. But it looks like much of the material I can
> find about Postgres: how to write a function, how to write a query, etc.
>
> What I’m more looking for is “System Design with Postgres”: *when* to
> write a function, *when* to use a stored procedure over a client-side
> function.
>

Many Oracle books related to this topic are valid for PostgreSQL too. The
design of stored procedures in PostgreSQL is conceptually similar to
Oracle's.

The theme "stored procedures" is strongly controversial -  from "stored
procedures are evil" to "do all in procedures".

I like this strategy: what you can do easily in the database, do it there -
the client should just get the results. But don't do outside communication
from the database. PostgreSQL is ACID, stored procedures are ACID. The outer
world is not ACID - and the ACID/non-ACID interface is better implemented
outside the database.
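
A minimal sketch of that boundary (an illustrative example only; the outbox
and orders names are hypothetical, and it assumes an orders table with an id
column): the transaction only queues the outgoing work, and a separate,
non-ACID worker process outside the database does the actual delivery.

CREATE TABLE outbox (
    id      bigserial PRIMARY KEY,
    payload jsonb     NOT NULL,
    sent    boolean   NOT NULL DEFAULT false
);

CREATE FUNCTION queue_event() RETURNS trigger AS $$
BEGIN
    -- queued atomically with the triggering insert: the ACID side
    INSERT INTO outbox (payload) VALUES (to_jsonb(NEW));
    -- wake the external worker; delivery itself stays outside: the non-ACID side
    PERFORM pg_notify('outbox', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_outbox
    AFTER INSERT ON orders
    FOR EACH ROW EXECUTE PROCEDURE queue_event();

If the transaction rolls back, the queued row and the notification disappear
with it; the worker marks rows sent only after the outside call succeeds.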

Regards

Pavel



> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Guyren Howe

> On Dec 29, 2016, at 23:01 , Regina Obe  wrote:
> 
> 
>> As an aside from my last question about my LYDB effort:
> 
>> https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb
> 
>> I would like to find a book or other resource about SQL server-side 
>> programming (stored procedures etc) best practices in general and for 
>> Postgres in particular.
> 
> Shameless plug
> 
> Have you checked out our book?  The 2nd edition covered PostgreSQL 9.2 - 9.4.
> http://shop.oreilly.com/product/0636920052715.do
> 
> We are working on the 3rd edition which is a bit fatter (probably will be 
> about 50 pages fatter when we are done) than the 2nd edition.  
> 
> http://shop.oreilly.com/product/0636920052715.do
> 
> The 3rd focuses on PostgreSQL 9.5-9.6 (and is in prerelease sale at moment).  
> By the time of release, we'll probably have some PostgreSQL 10 content in 
> there as well.
> 
> It covers fancy SQL constructs and data types (both ANSI ones and ones unique 
> to PostgreSQL), general administration, and writing stored functions with 
> SQL, PLPGSQL and PL/V8.
> 
> In the 3rd we are adding an additional PL/V8 example how to build a window 
> function in PL/V8 (aka PL/JavaScript)

I’m sure the book is great. But it looks like much of the material I can find 
about Postgres: how to write a function, how to write a query, etc.

What I’m more looking for is “System Design with Postgres”: *when* to write a 
function, *when* to use a stored procedure over a client-side function.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Regina Obe

> As an aside from my last question about my LYDB effort:

> https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb

> I would like to find a book or other resource about SQL server-side 
> programming (stored procedures etc) best practices in general and for 
> Postgres in particular.

> I found this book quite helpful (I have the first edition, but I assume that 
> the second is even better):

> https://www.packtpub.com/big-data-and-business-intelligence/postgresql-server-programming-second-edition/?utm_source=PoD&utm_medium=referral&utm_campaign=1783980583

> Regards
> Charles


> Seems like a thing that should exist. I can find a variety of blog posts, 
> mostly about Oracle and MS SQL Server, but nothing that appears particularly 
> canonical or “classic”.

> Is there such a resource?

Shameless plug

Have you checked out our book?  The 2nd edition covered PostgreSQL 9.2 - 9.4.
http://shop.oreilly.com/product/0636920052715.do

We are working on the 3rd edition, which is a bit fatter than the 2nd edition
(probably about 50 pages fatter when we are done).

http://shop.oreilly.com/product/0636920052715.do

The 3rd focuses on PostgreSQL 9.5-9.6 (and is in prerelease sale at the moment).
By the time of release, we'll probably have some PostgreSQL 10 content in there 
as well.

It covers fancy SQL constructs and data types (both ANSI ones and ones unique 
to PostgreSQL), general administration, and writing stored functions with SQL, 
PLPGSQL and PL/V8.

In the 3rd we are adding an additional PL/V8 example of how to build a window
function in PL/V8 (aka PL/JavaScript).

Thanks,
Regina




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]

2016-12-29 Thread ajmcello
Reducing work_mem shaved about 12 minutes off the query time. Thanks for the
suggestion. I lowered it to 10MB instead of 100MB.



> On Dec 29, 2016, at 8:07 PM, Charles Clavadetscher 
>  wrote:
> 
> Forwarding to list.
> 
> -Original Message-
> From: ajmcello [mailto:ajmcell...@gmail.com] 
> Sent: Freitag, 30. Dezember 2016 07:05
> To: Charles Clavadetscher 
> Subject: Re: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]
> 
> There are no connections except one cli when running the query. After that 
> finishes then I get connection refused  or cannot connect
> to server due to load increasing because of server connections. But I'm more 
> interested in tuning the server for better query
> response time. Is there anything in the configuration that would help?
> 
> Sent from my iPhone
> 
>> On Dec 29, 2016, at 7:35 PM, Charles Clavadetscher 
>>  wrote:
>> 
>> Hello
>> 
>>> -Original Message-
>>> From: pgsql-general-ow...@postgresql.org 
>>> [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of ajmcello
>>> Sent: Freitag, 30. Dezember 2016 05:54
>>> To: POSTGRES 
>>> Subject: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]
>>> 
>>> 
>>> I am trying to optimize and tune my server for fastest simple queries with 
>>> highest connection to server possible.
>>> Basically, a SELECT item from table takes 30 minutes on a machine with SSD 
>>> drives. The table has 900K entries and 12
>>> columns.  Using that SELECT query, I then have the ability to make about 
>>> 500 simultaneous connections to the server
>>> before errors start to occur. So, I can live with 500, but the slow query 
>>> gives me grief.
>> 
>> From previous posts of other users, I assume that in order to get help you 
>> will need to provide some more information. Here the
> questions that come to my mind.
>> 
>> What errors do you get from the server when you reach the 500 connections?
>> 
>> How long does it take to run the query without heavy load, e.g. just one 
>> user connected?
>> 
>> \timing on
>> query
>> 
>> How does the execution plan look like? There you may see if the index is 
>> used at all.
>> 
>> EXPLAIN ANALYZE query;
>> 
>>> I have a GIST index for the table. I tried btree, but according to 
>>> performance tests, GIST was faster. So I went
>>> with GIST.
>>> 
>>> The system drive and pgsql drive are separate. I can separate them further 
>>> if need to be. Total ram is 56GB. I added
>>> 32GB of swap.
>>> 
>>> Is there anything I can do to speed up the SELECT statement?
>> 
>> Could you provide the statement and the table definition?
>> 
>>> Here is what I have:
>>> 
>>> [sysctl.conf]
>>> net.ipv4.conf.default.rp_filter=1
>>> net.ipv4.conf.all.rp_filter=1
>>> net.ipv4.tcp_syncookies=1
>>> net.ipv4.ip_forward=1
>>> net.ipv6.conf.all.forwarding=1
>>> net.ipv4.conf.all.accept_redirects=0
>>> net.ipv6.conf.all.accept_redirects=0
>>> net.ipv4.conf.all.accept_source_route=0
>>> net.ipv6.conf.all.accept_source_route=0
>>> net.ipv4.conf.all.log_martians=1
>>> kernel.sysrq=0
>>> kernel.shmmax=214748399
>>> kernel.shmall=209715999
>>> #32GBkernel.shmmax=17179869184
>>> #32GBkernel.shmall=4194304
>>> kernel.shmmni=9
>>> kernel.shmmin=1
>>> kernel.shmseg=10
>>> # semmsl, semmns, semopm, semmni
>>> kernel.sem=250 32000 100 128
>>> fs.file-max=65536
>>> kern.maxfiles=5
>>> kern.maxfilesperproc=5
>>> net.ipv4.ip_local_port_range=1024 65535
>>> net.ipv4.tcp_tw_recycle=1
>>> net.ipv4.tcp_fin_timeout=10
>>> net.ipv4.tcp_tw_reuse=1
>>> net.core.rmem_max=16777216
>>> net.core.wmem_max=16777216
>>> net.ipv4.tcp_max_syn_backlog=4096
>>> net.ipv4.tcp_syncookies=1
>>> kernel.sched_migration_cost_ns=500
>>> kernel.sched_migration_cost_ns=500
>>> kernel.sched_autogroup_enabled=0
>>> vm.swappiness=10
>>> 
>>> 
>> 
>> Here are some helpful informations on the settings below:
>> https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
>> 
>> I am not really the big expert, but 100'000 max_connections and work_mem of 
>> 100MB seems to me to be a problem:
>> 
>> From the link mentioned right above:
>> 
>> "This size (work_mem) is applied to each and every sort done by each user, 
>> and complex queries can use multiple working memory
> sort buffers. Set it to 50MB, and have 30 users submitting queries, and you 
> are soon using 1.5GB of real memory."
>> 
>> This is:
>> SELECT * FROM pg_size_pretty((50.0*(2^20)*30.0)::BIGINT);
>> -[ RECORD 1 ]--+
>> pg_size_pretty | 1500 MB
>> 
>> Applied to your settings:
>> 
>> SELECT * FROM pg_size_pretty((100.0*(2^20)*100000.0)::BIGINT);
>> -[ RECORD 1 ]--+
>> pg_size_pretty | 9766 GB
>> 
>> This could explain the errors you get from the server. You may be trying to 
>> use much more memory than you have.
>> 
>> Regards
>> Charles
>> 
>>> [postgresql.conf]
>>> max_connections = 100000
>>> max_files_per_process 

FW: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]

2016-12-29 Thread Charles Clavadetscher
Forwarding to list.

-Original Message-
From: ajmcello [mailto:ajmcell...@gmail.com] 
Sent: Freitag, 30. Dezember 2016 07:05
To: Charles Clavadetscher 
Subject: Re: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]

There are no connections except one CLI session when running the query. After
that finishes, I get "connection refused" or "cannot connect to server" as the
load increases because of server connections. But I'm more interested in
tuning the server for better query response time. Is there anything in the
configuration that would help?

Sent from my iPhone

> On Dec 29, 2016, at 7:35 PM, Charles Clavadetscher 
>  wrote:
> 
> Hello
> 
>> -Original Message-
>> From: pgsql-general-ow...@postgresql.org 
>> [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of ajmcello
>> Sent: Freitag, 30. Dezember 2016 05:54
>> To: POSTGRES 
>> Subject: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]
>> 
>> 
>> I am trying to optimize and tune my server for fastest simple queries with 
>> highest connection to server possible.
>> Basically, a SELECT item from table takes 30 minutes on a machine with SSD 
>> drives. The table has 900K entries and 12
>> columns.  Using that SELECT query, I then have the ability to make about 500 
>> simultaneous connections to the server
>> before errors start to occur. So, I can live with 500, but the slow query 
>> gives me grief.
> 
> From previous posts of other users, I assume that in order to get help you 
> will need to provide some more information. Here the
questions that come to my mind.
> 
> What errors do you get from the server when you reach the 500 connections?
> 
> How long does it take to run the query without heavy load, e.g. just one user 
> connected?
> 
> \timing on
> query
> 
> How does the execution plan look like? There you may see if the index is used 
> at all.
> 
> EXPLAIN ANALYZE query;
> 
>> I have a GIST index for the table. I tried btree, but according to 
>> performance tests, GIST was faster. So I went
>> with GIST.
>> 
>> The system drive and pgsql drive are separate. I can separate them further 
>> if need to be. Total ram is 56GB. I added
>> 32GB of swap.
>> 
>> Is there anything I can do to speed up the SELECT statement?
> 
> Could you provide the statement and the table definition?
> 
>> Here is what I have:
>> 
>> [sysctl.conf]
>> net.ipv4.conf.default.rp_filter=1
>> net.ipv4.conf.all.rp_filter=1
>> net.ipv4.tcp_syncookies=1
>> net.ipv4.ip_forward=1
>> net.ipv6.conf.all.forwarding=1
>> net.ipv4.conf.all.accept_redirects=0
>> net.ipv6.conf.all.accept_redirects=0
>> net.ipv4.conf.all.accept_source_route=0
>> net.ipv6.conf.all.accept_source_route=0
>> net.ipv4.conf.all.log_martians=1
>> kernel.sysrq=0
>> kernel.shmmax=214748399
>> kernel.shmall=209715999
>> #32GBkernel.shmmax=17179869184
>> #32GBkernel.shmall=4194304
>> kernel.shmmni=9
>> kernel.shmmin=1
>> kernel.shmseg=10
>> # semmsl, semmns, semopm, semmni
>> kernel.sem=250 32000 100 128
>> fs.file-max=65536
>> kern.maxfiles=5
>> kern.maxfilesperproc=5
>> net.ipv4.ip_local_port_range=1024 65535
>> net.ipv4.tcp_tw_recycle=1
>> net.ipv4.tcp_fin_timeout=10
>> net.ipv4.tcp_tw_reuse=1
>> net.core.rmem_max=16777216
>> net.core.wmem_max=16777216
>> net.ipv4.tcp_max_syn_backlog=4096
>> net.ipv4.tcp_syncookies=1
>> kernel.sched_migration_cost_ns=500
>> kernel.sched_migration_cost_ns=500
>> kernel.sched_autogroup_enabled=0
>> vm.swappiness=10
>> 
>> 
> 
> Here are some helpful informations on the settings below:
> https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
> 
> I am not really the big expert, but 100'000 max_connections and work_mem of 
> 100MB seems to me to be a problem:
> 
> From the link mentioned right above:
> 
> "This size (work_mem) is applied to each and every sort done by each user, 
> and complex queries can use multiple working memory
sort buffers. Set it to 50MB, and have 30 users submitting queries, and you are 
soon using 1.5GB of real memory."
> 
> This is:
> SELECT * FROM pg_size_pretty((50.0*(2^20)*30.0)::BIGINT);
> -[ RECORD 1 ]--+
> pg_size_pretty | 1500 MB
> 
> Applied to your settings:
> 
> SELECT * FROM pg_size_pretty((100.0*(2^20)*100000.0)::BIGINT);
> -[ RECORD 1 ]--+
> pg_size_pretty | 9766 GB
> 
> This could explain the errors you get from the server. You may be trying to 
> use much more memory than you have.
> 
> Regards
> Charles
> 
>> [postgresql.conf]
>> max_connections = 100000
>> max_files_per_process = 100
>> shared_buffers = 24GB
>> max_locks_per_transaction  = 1000
>> effective_cache_size = 50GB
>> work_mem = 100MB
>> maintenance_work_mem = 2GB
>> log_min_duration_statement = 1
>> checkpoint_completion_target = 0.9
>> wal_buffers = 32MB
>> default_statistics_target = 100
>> listen_addresses = '*'
>> port = 5432
>> ssl = off
>> wal_sync_method = fdatasync
>> 

Re: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]

2016-12-29 Thread Charles Clavadetscher
Hello

> -Original Message-
> From: pgsql-general-ow...@postgresql.org 
> [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of ajmcello
> Sent: Freitag, 30. Dezember 2016 05:54
> To: POSTGRES 
> Subject: [GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]
> 
> 
> I am trying to optimize and tune my server for fastest simple queries with 
> highest connection to server possible.
> Basically, a SELECT item from table takes 30 minutes on a machine with SSD 
> drives. The table has 900K entries and 12
> columns.  Using that SELECT query, I then have the ability to make about 500 
> simultaneous connections to the server
> before errors start to occur. So, I can live with 500, but the slow query 
> gives me grief.

From previous posts of other users, I assume that in order to get help you
will need to provide some more information. Here are the questions that come
to my mind.

What errors do you get from the server when you reach the 500 connections?

How long does it take to run the query without heavy load, e.g. just one user 
connected?

\timing on
query

What does the execution plan look like? There you may see if the index is used
at all.

EXPLAIN ANALYZE query;

> I have a GIST index for the table. I tried btree, but according to 
> performance tests, GIST was faster. So I went
> with GIST.
> 
> The system drive and pgsql drive are separate. I can separate them further if 
> need to be. Total ram is 56GB. I added
> 32GB of swap.
> 
> Is there anything I can do to speed up the SELECT statement?

Could you provide the statement and the table definition?

> Here is what I have:
> 
> [sysctl.conf]
> net.ipv4.conf.default.rp_filter=1
> net.ipv4.conf.all.rp_filter=1
> net.ipv4.tcp_syncookies=1
> net.ipv4.ip_forward=1
> net.ipv6.conf.all.forwarding=1
> net.ipv4.conf.all.accept_redirects=0
> net.ipv6.conf.all.accept_redirects=0
> net.ipv4.conf.all.accept_source_route=0
> net.ipv6.conf.all.accept_source_route=0
> net.ipv4.conf.all.log_martians=1
> kernel.sysrq=0
> kernel.shmmax=214748399
> kernel.shmall=209715999
> #32GBkernel.shmmax=17179869184
> #32GBkernel.shmall=4194304
> kernel.shmmni=9
> kernel.shmmin=1
> kernel.shmseg=10
> # semmsl, semmns, semopm, semmni
> kernel.sem=250 32000 100 128
> fs.file-max=65536
> kern.maxfiles=5
> kern.maxfilesperproc=5
> net.ipv4.ip_local_port_range=1024 65535
> net.ipv4.tcp_tw_recycle=1
> net.ipv4.tcp_fin_timeout=10
> net.ipv4.tcp_tw_reuse=1
> net.core.rmem_max=16777216
> net.core.wmem_max=16777216
> net.ipv4.tcp_max_syn_backlog=4096
> net.ipv4.tcp_syncookies=1
> kernel.sched_migration_cost_ns=500
> kernel.sched_migration_cost_ns=500
> kernel.sched_autogroup_enabled=0
> vm.swappiness=10
> 
> 

Here is some helpful information on the settings below:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

I am not really a big expert, but 100'000 max_connections and a work_mem of
100MB seem to me to be a problem:

From the link mentioned right above:

"This size (work_mem) is applied to each and every sort done by each user, and 
complex queries can use multiple working memory sort buffers. Set it to 50MB, 
and have 30 users submitting queries, and you are soon using 1.5GB of real 
memory."

This is:
SELECT * FROM pg_size_pretty((50.0*(2^20)*30.0)::BIGINT);
-[ RECORD 1 ]--+
pg_size_pretty | 1500 MB

Applied to your settings:

SELECT * FROM pg_size_pretty((100.0*(2^20)*100000.0)::BIGINT);
-[ RECORD 1 ]--+
pg_size_pretty | 9766 GB

This could explain the errors you get from the server. You may be trying to use 
much more memory than you have.
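
A quick way to test that suspicion without editing postgresql.conf (a sketch;
the 10MB value is just an example, and "query" stands for your statement as
above):

SET work_mem = '10MB';     -- session-local override, no restart needed
EXPLAIN ANALYZE query;     -- re-check the plan and timing

If it helps, change work_mem in postgresql.conf and run SELECT pg_reload_conf();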

Regards
Charles

> [postgresql.conf]
> max_connections = 100000
> max_files_per_process = 100
> shared_buffers = 24GB
> max_locks_per_transaction  = 1000
> effective_cache_size = 50GB
> work_mem = 100MB
> maintenance_work_mem = 2GB
> log_min_duration_statement = 1
> checkpoint_completion_target = 0.9
> wal_buffers = 32MB
> default_statistics_target = 100
> listen_addresses = '*'
> port = 5432
> ssl = off
> wal_sync_method = fdatasync
> synchronous_commit = on
> fsync = off
> wal_level = minimal
> #client_min_messages = fatal
> #log_min_messages = fatal
> #log_min_error_statement = fatal
> datestyle = 'iso, mdy'
> debug_pretty_print = off
> debug_print_parse = off
> debug_print_plan = off
> debug_print_rewritten = off
> default_text_search_config = 'pg_catalog.english'
> enable_bitmapscan = on
> enable_hashagg = on
> enable_hashjoin = on
> enable_indexonlyscan = on
> enable_indexscan = on
> enable_material = on
> enable_mergejoin = on
> enable_nestloop = on
> enable_seqscan = on
> enable_sort = on
> enable_tidscan = on
> from_collapse_limit = 8
> geqo = on
> geqo_threshold = 12
> log_checkpoints = off
> 
> log_connections = off
> log_disconnections = off
> log_duration = off
> log_executor_stats = off
> log_hostname = off
> log_parser_stats = off
> log_planner_stats = off
> log_replication_commands = off
> log_statement_stats 

[GENERAL] performance tuning postgresql 9.5.5.10 [enterprisedb]

2016-12-29 Thread ajmcello
I am trying to optimize and tune my server for the fastest simple queries with
the highest number of connections to the server possible. Basically, a SELECT
of an item from a table takes 30 minutes on a machine with SSD drives. The
table has 900K rows and 12 columns. Using that SELECT query, I can make about
500 simultaneous connections to the server before errors start to occur. So I
can live with 500, but the slow query gives me grief.

I have a GIST index for the table. I tried btree, but according to
performance tests, GIST was faster. So I went with GIST.

The system drive and pgsql drive are separate. I can separate them further if
need be. Total RAM is 56GB. I added 32GB of swap.

Is there anything I can do to speed up the SELECT statement?

Here is what I have:

[sysctl.conf]
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.tcp_syncookies=1
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
net.ipv4.conf.all.accept_redirects=0
net.ipv6.conf.all.accept_redirects=0
net.ipv4.conf.all.accept_source_route=0
net.ipv6.conf.all.accept_source_route=0
net.ipv4.conf.all.log_martians=1
kernel.sysrq=0
kernel.shmmax=214748399
kernel.shmall=209715999
#32GBkernel.shmmax=17179869184
#32GBkernel.shmall=4194304
kernel.shmmni=9
kernel.shmmin=1
kernel.shmseg=10
# semmsl, semmns, semopm, semmni
kernel.sem=250 32000 100 128
fs.file-max=65536
kern.maxfiles=5
kern.maxfilesperproc=5
net.ipv4.ip_local_port_range=1024 65535
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_fin_timeout=10
net.ipv4.tcp_tw_reuse=1
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_max_syn_backlog=4096
net.ipv4.tcp_syncookies=1
kernel.sched_migration_cost_ns=500
kernel.sched_migration_cost_ns=500
kernel.sched_autogroup_enabled=0
vm.swappiness=10


[postgresql.conf]
max_connections = 100000
max_files_per_process = 100
shared_buffers = 24GB
max_locks_per_transaction  = 1000
effective_cache_size = 50GB
work_mem = 100MB
maintenance_work_mem = 2GB
log_min_duration_statement = 1
checkpoint_completion_target = 0.9
wal_buffers = 32MB
default_statistics_target = 100
listen_addresses = '*'
port = 5432
ssl = off
wal_sync_method = fdatasync
synchronous_commit = on
fsync = off
wal_level = minimal
#client_min_messages = fatal
#log_min_messages = fatal
#log_min_error_statement = fatal
datestyle = 'iso, mdy'
debug_pretty_print = off
debug_print_parse = off
debug_print_plan = off
debug_print_rewritten = off
default_text_search_config = 'pg_catalog.english'
enable_bitmapscan = on
enable_hashagg = on
enable_hashjoin = on
enable_indexonlyscan = on
enable_indexscan = on
enable_material = on
enable_mergejoin = on
enable_nestloop = on
enable_seqscan = on
enable_sort = on
enable_tidscan = on
from_collapse_limit = 8
geqo = on
geqo_threshold = 12
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_executor_stats = off
log_hostname = off
log_parser_stats = off
log_planner_stats = off
log_replication_commands = off
log_statement_stats = off
log_timezone = 'UTC'
max_wal_size = 1GB
min_wal_size = 80MB
shared_preload_libraries = '$libdir/dbms_pipe,$libdir/edb_gen'
stats_temp_directory = 'pg_stat_tmp'
timezone = 'US/Pacific'
track_activities = on
track_counts = on
track_io_timing = off


Thanks in advance.


Re: [GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Charles Clavadetscher
Hi

 

From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Guyren Howe
Sent: Donnerstag, 29. Dezember 2016 23:32
To: pgsql-general 
Subject: [GENERAL] Book or other resource on Postgres-local code?

 

As an aside from my last question about my LYDB effort:

 

https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb

 

I would like to find a book or other resource about SQL server-side programming 
(stored procedures etc) best practices in general and for Postgres in 
particular.

 

I found this book quite helpful (I have the first edition, but I assume that 
the second is even better):

 

https://www.packtpub.com/big-data-and-business-intelligence/postgresql-server-programming-second-edition/?utm_source=PoD&utm_medium=referral&utm_campaign=1783980583

 

Regards

Charles

 

 

Seems like a thing that should exist. I can find a variety of blog posts, 
mostly about Oracle and MS SQL Server, but nothing that appears particularly 
canonical or “classic”.

 

Is there such a resource?



Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
I have DataGrip and it's OK, but it doesn't really do everything I want.

I don't understand why it doesn't fetch all objects from the database and
write them to disk in a directory, so I can put it all under git and work on
them, syncing the files back as they change. For example, today I just renamed
a function. It didn't refactor properly by identifying stored procs that
reference it. If I was using another JetBrains IDE it would have built an
index of the project files and done a proper refactor.

This would also allow you to make wholesale disk changes and then sync them
up properly to get around postgres dependency issues.

On Fri, Dec 30, 2016 at 1:40 PM, Michael Sheaver  wrote:

> If you want an IDE, Jetbrains, the makers of great IDEs like IntelliJ,
> PyCharm. and AppCode, among others, have recently come out with what is
> arguably the BEST IDE for DBAs, DataGrip. It runs on most major platforms,
> and is so good that I have bitten the bullet and paid the yearly
> subscription for it.
>
> Leave the Postgres core alone focus on what they do best, and that is
> making the very BEST database environment that can be had at any price,
> period. Is Postgres perfect? No, not at all. But no other group is so
> focused on performance, scalability and security as these folks are. And
> the rate of development, enhancement and continual improvement is, quite
> honestly, astounding.
>
> So here is my hat tip to the Postgres team for an awesome job they are
> doing!
>
> On Dec 29, 2016, at 7:19 PM, Tim Uckun  wrote:
>
> I am not saying the postgres core people should work on an IDE, just that
> an IDE like thing would be nice.
>
> On Fri, Dec 30, 2016 at 12:51 PM, Rob Sargent 
> wrote:
>
>> I would hope Postgres core folk take no more than a nanosecond to reject
>> the idea that they work on an IDE. Focus on reading and writing faster and
>> faster ACID all the while.
>>
>> On Dec 29, 2016, at 5:32 PM, Tim Uckun  wrote:
>>
>> Honestly I don't even like JS. Having said that I am not too crazy about
>> PL-PGSQL either. I am willing to put up with either given that they are
>> supported widely in default installs of postgres in AWS, Linux and MacOSX,
>>
>> As I said before, I think posgres gives a unique and underutilized
>> language platform. You can code in different languages, it has a good
>> variety of built in types, and of course you get persistance and caching
>> built in!  Using DBLINK you might even be able to separate out your code
>> from the bulk of your data in another database. Postgres all the way down!
>>
>> It's fun to play around with.  There is a lot of missing pieces though. A
>> good IDE like thing would be good, version control would be nice, deeper
>> namespacing (hierarchical schemas?), easier testing etc would go a long
>> way.
>>
>> Thanks for all the input guys!
>>
>> On Fri, Dec 30, 2016 at 12:14 AM, Ivan Sergio Borgonovo <
>> m...@webthatworks.it> wrote:
>>
>>> On 12/29/2016 10:35 AM, Pavel Stehule wrote:
>>>
>>> 2016-12-29 10:03 GMT+01:00 Tim Uckun >:

 I think it's awesome that postgres allows you to code in different
 languages like this. It really is a unique development environment
 and one that is overlooked as a development platform.  It would be
 nice if more languages were delivered in the default package
 especially lua, V8 and mruby.


 It is about dependencies and maintenance. There are not too much people
 who has good experience with C embedding Lua, V8 and others. Any people
 who can do some work are welcome.

 The living outside main package has disadvantages - only enthusiast
 knows about it, but some advantages too - you are not fixed on
 PostgreSQL development cycle, and development can be faster.

>>>
>>> I'll add my 2 cents.
>>>
>>> Postgresql and in general SQL are about integrity and coherency.
>>> Checking coherency is much easier with strict data type.
>>> PL/PGSQL gives you that, JS is far far away from that.
>>>
>>> Postgresql is a very flexible database and you can stretch it to do
>>> "MEAN like"[1] stuff but that's going to increase your "impedance mismatch".
>>>
>>> If you think there is some space for JS in your application stack that's
>>> nearer to the client rather than to the DB.
>>> Or possibly you need to do "MEAN like" stuff but you don't want to
>>> install another "database".
>>>
>>> As other said using stored procedures is a two edged sword.
>>> It can decouple DB schema from the application or it can increase the
>>> coupling.
>>> Choosing JS for performance in the stored procedure realm is going to
>>> encourage coupling and make scalability harder and it is going to become a
>>> mess when you'll need to refactor.
>>>
>>> [1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)
>>>
>>> --
>>> Ivan Sergio 

[GENERAL] LYDB: Feasible to use PG roles instead of application-level security?

2016-12-29 Thread Guyren Howe
Further to my attempts to enlighten application developers about what they 
might better do in the database:

https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb 


it occurs to me to wonder whether it is practical to use PG’s own roles and 
security model in lieu of using an application-level one.

It seems that the role system in PG is sufficient for most general purposes. 
One could presumably also have a table with role names and associated 
metainformation (email address etc) as needed.

If I have a system with many thousands of users, is it practical to manage 
these users’ authentication and authorization using *just* Postgres?

It occurs to me that some client frameworks might have issues with their
connection pools if those connections keep switching users, assuming they even
can, but let's set that aside for now. Or perhaps every connection could
immediately do a SET ROLE before executing its queries?

This seems an attractive proposition from a security standpoint: if I use
row-level security pervasively, I can have a security system that's nestled
nice and close to the data and presumably tricky to work around for a hacker
with direct access only to the client application.
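
For concreteness, a minimal sketch of the pattern in question (all role, table
and policy names are hypothetical; assumes PostgreSQL 9.5+ for row-level
security): one login role for the pool, one NOLOGIN role per user, and a
policy keyed on current_user.

CREATE ROLE pool_login LOGIN;      -- the role the connection pool connects as
CREATE ROLE alice NOLOGIN;         -- one role per application user
GRANT alice TO pool_login;         -- allows the pooled session to SET ROLE alice

CREATE TABLE accounts (
    owner   name    NOT NULL DEFAULT current_user,
    balance numeric
);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
GRANT SELECT, INSERT, UPDATE ON accounts TO alice;

CREATE POLICY per_owner ON accounts
    USING (owner = current_user)
    WITH CHECK (owner = current_user);

-- per request, on a pooled connection:
SET ROLE alice;
SELECT * FROM accounts;            -- RLS restricts this to alice's rows
RESET ROLE;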

Is this practical? Has anyone here done it? What might the caveats be?

TIA

Re: [GENERAL] Special index for "like"-based query

2016-12-29 Thread David G. Johnston
On Thu, Dec 29, 2016 at 4:51 PM, Tomas Vondra wrote:

> On 12/30/2016 12:46 AM, David G. Johnston wrote:
>
>> On Thu, Dec 29, 2016 at 4:38 PM, Tomas Vondra wrote:
>>
>> On 12/30/2016 12:33 AM, David G. Johnston wrote:
>>
>> On Thu, Dec 29, 2016 at 4:21 PM, Job wrote:
>>
>> Hello,
>>
>> in Postgresql 9.6 we have a query running on a very large
>> table
>> based, in some cases, on a like statement:
>>
>> ... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'
>>
>> Which type of index can i create to speed to the search when
>> the
>> "like" case happens?
>>
>>
>> ​GIST​
>>
>> https://www.postgresql.org/docs/9.6/static/pgtrgm.html
>> 
>>
>> ​https://www.postgresql.org/docs/9.6/static/btree-gist.html
>> 
>> ​
>>
>>
>> For prefix queries, it's also possible to use simple btree index
>> with varchar_pattern_ops.
>>
>> https://www.postgresql.org/docs/9.6/static/indexes-opclass.html
>> 
>>
>>
>> Even knowing that this feature exists I don't know that I could have
>> found it within a reasonable amount of time in its present location.  A
>> few cross-references from elsewhere (probably at least the functions
>> part of the documentation) would make learning about the capability a
>> lot easier.
>>
>>
> Well, it's referenced right from the "Indexes" part of the documentation
> (right at the beginning of "Index Types"):
>
> https://www.postgresql.org/docs/9.6/static/indexes.html
>
>
While I may have an understanding of what operator classes and families
are when I am in my SQL thinking mode those terms don't really come to
mind.  Maybe part of the problem is that SQL doesn't have indexes and so my
formal education never covered them.  I learned how to use "CREATE INDEX"
to meet most common needs but the fact that I'm getting a b-tree family
index is well hidden.

While I'm all for learning the theory a more prescriptive approach (do this
to get an index that will allow prefix LIKEs to use it - see this section
for detail) to the topic would be welcome.  Tell the user how to use an
index when they are learning about the feature that they care about - LIKE
- not require them to learn all about indexes and later realize/remember
that one particular incantation will solve the LIKE problem.

David J.


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Michael Sheaver
If you want an IDE: Jetbrains, the makers of great IDEs like IntelliJ, PyCharm,
and AppCode, among others, have recently come out with what is arguably the
BEST IDE for DBAs, DataGrip. It runs on most major platforms, and is so good
that I have bitten the bullet and paid the yearly subscription for it.

Leave the Postgres core folks alone to focus on what they do best, and that is making
the very BEST database environment that can be had at any price, period. Is 
Postgres perfect? No, not at all. But no other group is so focused on 
performance, scalability and security as these folks are. And the rate of 
development, enhancement and continual improvement is, quite honestly, 
astounding. 

So here is my hat tip to the Postgres team for an awesome job they are doing!

> On Dec 29, 2016, at 7:19 PM, Tim Uckun  wrote:
> 
> I am not saying the postgres core people should work on an IDE, just that an 
> IDE like thing would be nice.
> 
> On Fri, Dec 30, 2016 at 12:51 PM, Rob Sargent wrote:
> I would hope Postgres core folk take no more than a nanosecond to reject the 
> idea that they work on an IDE. Focus on reading and writing faster and faster 
> ACID all the while. 
> 
> On Dec 29, 2016, at 5:32 PM, Tim Uckun wrote:
> 
>> Honestly I don't even like JS. Having said that I am not too crazy about 
>> PL-PGSQL either. I am willing to put up with either given that they are 
>> supported widely in default installs of postgres in AWS, Linux and MacOSX,
>> 
>> As I said before, I think posgres gives a unique and underutilized language 
>> platform. You can code in different languages, it has a good variety of 
>> built in types, and of course you get persistance and caching built in!  
>> Using DBLINK you might even be able to separate out your code from the bulk 
>> of your data in another database. Postgres all the way down!
>> 
>> It's fun to play around with.  There is a lot of missing pieces though. A 
>> good IDE like thing would be good, version control would be nice, deeper 
>> namespacing (hierarchical schemas?), easier testing etc would go a long way. 
>> 
>> Thanks for all the input guys! 
>> 
>> On Fri, Dec 30, 2016 at 12:14 AM, Ivan Sergio Borgonovo 
>> > wrote:
>> On 12/29/2016 10:35 AM, Pavel Stehule wrote:
>> 
>> 2016-12-29 10:03 GMT+01:00 Tim Uckun > 
>> >>:
>> 
>> I think it's awesome that postgres allows you to code in different
>> languages like this. It really is a unique development environment
>> and one that is overlooked as a development platform.  It would be
>> nice if more languages were delivered in the default package
>> especially lua, V8 and mruby.
>> 
>> 
>> It is about dependencies and maintenance. There are not too much people
>> who has good experience with C embedding Lua, V8 and others. Any people
>> who can do some work are welcome.
>> 
>> The living outside main package has disadvantages - only enthusiast
>> knows about it, but some advantages too - you are not fixed on
>> PostgreSQL development cycle, and development can be faster.
>> 
>> I'll add my 2 cents.
>> 
>> Postgresql and in general SQL are about integrity and coherency.
>> Checking coherency is much easier with strict data type.
>> PL/PGSQL gives you that, JS is far far away from that.
>> 
>> Postgresql is a very flexible database and you can stretch it to do "MEAN 
>> like"[1] stuff but that's going to increase your "impedance mismatch".
>> 
>> If you think there is some space for JS in your application stack that's 
>> nearer to the client rather than to the DB.
>> Or possibly you need to do "MEAN like" stuff but you don't want to install 
>> another "database".
>> 
>> As other said using stored procedures is a two edged sword.
>> It can decouple DB schema from the application or it can increase the 
>> coupling.
>> Choosing JS for performance in the stored procedure realm is going to 
>> encourage coupling and make scalability harder and it is going to become a 
>> mess when you'll need to refactor.
>> 
>> [1] https://en.wikipedia.org/wiki/MEAN_(software_bundle) 
>> 
>> 
>> -- 
>> Ivan Sergio Borgonovo
>> http://www.webthatworks.it  
>> http://www.borgonovo.net 
>> 
>> 
>> 
>> 
>> -- 
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org 
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general 
>> 
>> 
> 



Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
I am not saying the postgres core people should work on an IDE, just that
an IDE like thing would be nice.

On Fri, Dec 30, 2016 at 12:51 PM, Rob Sargent  wrote:

> I would hope Postgres core folk take no more than a nanosecond to reject
> the idea that they work on an IDE. Focus on reading and writing faster and
> faster ACID all the while.
>
> On Dec 29, 2016, at 5:32 PM, Tim Uckun  wrote:
>
> Honestly I don't even like JS. Having said that I am not too crazy about
> PL-PGSQL either. I am willing to put up with either given that they are
> supported widely in default installs of postgres in AWS, Linux and MacOSX,
>
> As I said before, I think posgres gives a unique and underutilized
> language platform. You can code in different languages, it has a good
> variety of built in types, and of course you get persistance and caching
> built in!  Using DBLINK you might even be able to separate out your code
> from the bulk of your data in another database. Postgres all the way down!
>
> It's fun to play around with.  There is a lot of missing pieces though. A
> good IDE like thing would be good, version control would be nice, deeper
> namespacing (hierarchical schemas?), easier testing etc would go a long
> way.
>
> Thanks for all the input guys!
>
> On Fri, Dec 30, 2016 at 12:14 AM, Ivan Sergio Borgonovo <
> m...@webthatworks.it> wrote:
>
>> On 12/29/2016 10:35 AM, Pavel Stehule wrote:
>>
>> 2016-12-29 10:03 GMT+01:00 Tim Uckun >> >:
>>>
>>> I think it's awesome that postgres allows you to code in different
>>> languages like this. It really is a unique development environment
>>> and one that is overlooked as a development platform.  It would be
>>> nice if more languages were delivered in the default package
>>> especially lua, V8 and mruby.
>>>
>>>
>>> It is about dependencies and maintenance. There are not too much people
>>> who has good experience with C embedding Lua, V8 and others. Any people
>>> who can do some work are welcome.
>>>
>>> The living outside main package has disadvantages - only enthusiast
>>> knows about it, but some advantages too - you are not fixed on
>>> PostgreSQL development cycle, and development can be faster.
>>>
>>
>> I'll add my 2 cents.
>>
>> Postgresql and in general SQL are about integrity and coherency.
>> Checking coherency is much easier with strict data type.
>> PL/PGSQL gives you that, JS is far far away from that.
>>
>> Postgresql is a very flexible database and you can stretch it to do "MEAN
>> like"[1] stuff but that's going to increase your "impedance mismatch".
>>
>> If you think there is some space for JS in your application stack that's
>> nearer to the client rather than to the DB.
>> Or possibly you need to do "MEAN like" stuff but you don't want to
>> install another "database".
>>
>> As other said using stored procedures is a two edged sword.
>> It can decouple DB schema from the application or it can increase the
>> coupling.
>> Choosing JS for performance in the stored procedure realm is going to
>> encourage coupling and make scalability harder and it is going to become a
>> mess when you'll need to refactor.
>>
>> [1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)
>>
>> --
>> Ivan Sergio Borgonovo
>> http://www.webthatworks.it http://www.borgonovo.net
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Rob Sargent
I would hope Postgres core folk take no more than a nanosecond to reject the 
idea that they work on an IDE. Focus on reading and writing faster and faster 
ACID all the while. 

> On Dec 29, 2016, at 5:32 PM, Tim Uckun  wrote:
> 
> Honestly I don't even like JS. Having said that I am not too crazy about 
> PL-PGSQL either. I am willing to put up with either given that they are 
> supported widely in default installs of postgres in AWS, Linux and MacOSX,
> 
> As I said before, I think posgres gives a unique and underutilized language 
> platform. You can code in different languages, it has a good variety of built 
> in types, and of course you get persistance and caching built in!  Using 
> DBLINK you might even be able to separate out your code from the bulk of your 
> data in another database. Postgres all the way down!
> 
> It's fun to play around with.  There is a lot of missing pieces though. A 
> good IDE like thing would be good, version control would be nice, deeper 
> namespacing (hierarchical schemas?), easier testing etc would go a long way. 
> 
> Thanks for all the input guys! 
> 
>> On Fri, Dec 30, 2016 at 12:14 AM, Ivan Sergio Borgonovo 
>>  wrote:
>> On 12/29/2016 10:35 AM, Pavel Stehule wrote:
>> 
>>> 2016-12-29 10:03 GMT+01:00 Tim Uckun >> >:
>>> 
>>> I think it's awesome that postgres allows you to code in different
>>> languages like this. It really is a unique development environment
>>> and one that is overlooked as a development platform.  It would be
>>> nice if more languages were delivered in the default package
>>> especially lua, V8 and mruby.
>>> 
>>> 
>>> It is about dependencies and maintenance. There are not too much people
>>> who has good experience with C embedding Lua, V8 and others. Any people
>>> who can do some work are welcome.
>>> 
>>> The living outside main package has disadvantages - only enthusiast
>>> knows about it, but some advantages too - you are not fixed on
>>> PostgreSQL development cycle, and development can be faster.
>> 
>> I'll add my 2 cents.
>> 
>> Postgresql and in general SQL are about integrity and coherency.
>> Checking coherency is much easier with strict data type.
>> PL/PGSQL gives you that, JS is far far away from that.
>> 
>> Postgresql is a very flexible database and you can stretch it to do "MEAN 
>> like"[1] stuff but that's going to increase your "impedance mismatch".
>> 
>> If you think there is some space for JS in your application stack that's 
>> nearer to the client rather than to the DB.
>> Or possibly you need to do "MEAN like" stuff but you don't want to install 
>> another "database".
>> 
>> As other said using stored procedures is a two edged sword.
>> It can decouple DB schema from the application or it can increase the 
>> coupling.
>> Choosing JS for performance in the stored procedure realm is going to 
>> encourage coupling and make scalability harder and it is going to become a 
>> mess when you'll need to refactor.
>> 
>> [1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)
>> 
>> -- 
>> Ivan Sergio Borgonovo
>> http://www.webthatworks.it http://www.borgonovo.net
>> 
>> 
>> 
>> 
>> -- 
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
> 


Re: [GENERAL] Special index for "like"-based query

2016-12-29 Thread Tomas Vondra

On 12/30/2016 12:46 AM, David G. Johnston wrote:

On Thu, Dec 29, 2016 at 4:38 PM, Tomas Vondra wrote:

On 12/30/2016 12:33 AM, David G. Johnston wrote:

On Thu, Dec 29, 2016 at 4:21 PM, Job wrote:

Hello,

in Postgresql 9.6 we have a query running on a very large table
based, in some cases, on a like statement:

... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'

Which type of index can i create to speed to the search when the
"like" case happens?


GIST

https://www.postgresql.org/docs/9.6/static/pgtrgm.html

https://www.postgresql.org/docs/9.6/static/btree-gist.html


For prefix queries, it's also possible to use simple btree index
with varchar_pattern_ops.

https://www.postgresql.org/docs/9.6/static/indexes-opclass.html



Even knowing that this feature exists I don't know that I could have
found it within a reasonable amount of time in its present location.  A
few cross-references from elsewhere (probably at least the functions
part of the documentation) would make learning about the capability a
lot easier.



Well, it's referenced right from the "Indexes" part of the documentation 
(right at the beginning of "Index Types"):


https://www.postgresql.org/docs/9.6/static/indexes.html

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Special index for "like"-based query

2016-12-29 Thread David G. Johnston
On Thu, Dec 29, 2016 at 4:38 PM, Tomas Vondra wrote:

> On 12/30/2016 12:33 AM, David G. Johnston wrote:
>
>> On Thu, Dec 29, 2016 at 4:21 PM, Job wrote:
>>
>> Hello,
>>
>> in Postgresql 9.6 we have a query running on a very large table
>> based, in some cases, on a like statement:
>>
>> ... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'
>>
>> Which type of index can i create to speed to the search when the
>> "like" case happens?
>>
>>
>> GIST
>>
>> https://www.postgresql.org/docs/9.6/static/pgtrgm.html
>>
>> https://www.postgresql.org/docs/9.6/static/btree-gist.html
>>
>
> For prefix queries, it's also possible to use simple btree index with
> varchar_pattern_ops.
>
> https://www.postgresql.org/docs/9.6/static/indexes-opclass.html
>
>
Even knowing that this feature exists I don't know that I could have found
it within a reasonable amount of time in its present location.  A few
cross-references from elsewhere (probably at least the functions part of
the documentation) would make learning about the capability a lot easier.

David J.


Re: [GENERAL] Special index for "like"-based query

2016-12-29 Thread Tomas Vondra

On 12/30/2016 12:33 AM, David G. Johnston wrote:

On Thu, Dec 29, 2016 at 4:21 PM, Job wrote:

Hello,

in Postgresql 9.6 we have a query running on a very large table
based, in some cases, on a like statement:

... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'

Which type of index can i create to speed to the search when the
"like" case happens?


GIST

https://www.postgresql.org/docs/9.6/static/pgtrgm.html

https://www.postgresql.org/docs/9.6/static/btree-gist.html


For prefix queries, it's also possible to use simple btree index with 
varchar_pattern_ops.


https://www.postgresql.org/docs/9.6/static/indexes-opclass.html
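
A minimal sketch of both suggestions against the query above (table and index
names are mine, hypothetical):

-- btree with pattern ops: serves anchored patterns such as LIKE 'CODE_%'
CREATE INDEX big_table_field_pattern_idx
    ON big_table (field varchar_pattern_ops);

-- pg_trgm: also serves unanchored patterns such as LIKE '%CODE%'
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX big_table_field_trgm_idx
    ON big_table USING gin (field gin_trgm_ops);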


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Special index for "like"-based query

2016-12-29 Thread David G. Johnston
On Thu, Dec 29, 2016 at 4:21 PM, Job  wrote:

> Hello,
>
> in Postgresql 9.6 we have a query running on a very large table based, in
> some cases, on a like statement:
>
> ... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'
>
> Which type of index can i create to speed to the search when the "like"
> case happens?
>

GIST

https://www.postgresql.org/docs/9.6/static/pgtrgm.html

https://www.postgresql.org/docs/9.6/static/btree-gist.html
David J.


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
Honestly I don't even like JS. Having said that I am not too crazy about
PL-PGSQL either. I am willing to put up with either given that they are
supported widely in default installs of postgres in AWS, Linux and MacOSX,

As I said before, I think Postgres gives you a unique and underutilized
language platform. You can code in different languages, it has a good variety
of built-in types, and of course you get persistence and caching built in!
Using DBLINK you might even be able to separate out your code from the bulk of
your data in another database. Postgres all the way down!
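
A minimal sketch of that dblink idea (the extension ships with contrib; the
connection string and remote table are hypothetical):

CREATE EXTENSION IF NOT EXISTS dblink;

SELECT id, name
FROM dblink('dbname=code_db', 'SELECT id, name FROM things')
     AS t(id int, name text);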

It's fun to play around with. There are a lot of missing pieces though. A good
IDE-like thing would be good, version control would be nice, deeper
namespacing (hierarchical schemas?), easier testing etc. would go a long way.

Thanks for all the input guys!

On Fri, Dec 30, 2016 at 12:14 AM, Ivan Sergio Borgonovo <
m...@webthatworks.it> wrote:

> On 12/29/2016 10:35 AM, Pavel Stehule wrote:
>
> 2016-12-29 10:03 GMT+01:00 Tim Uckun:
>> I think it's awesome that postgres allows you to code in different
>> languages like this. It really is a unique development environment
>> and one that is overlooked as a development platform.  It would be
>> nice if more languages were delivered in the default package
>> especially lua, V8 and mruby.
>>
>>
>> It is about dependencies and maintenance. There are not too much people
>> who has good experience with C embedding Lua, V8 and others. Any people
>> who can do some work are welcome.
>>
>> The living outside main package has disadvantages - only enthusiast
>> knows about it, but some advantages too - you are not fixed on
>> PostgreSQL development cycle, and development can be faster.
>>
>
> I'll add my 2 cents.
>
> Postgresql and in general SQL are about integrity and coherency.
> Checking coherency is much easier with strict data type.
> PL/PGSQL gives you that, JS is far far away from that.
>
> Postgresql is a very flexible database and you can stretch it to do "MEAN
> like"[1] stuff but that's going to increase your "impedance mismatch".
>
> If you think there is some space for JS in your application stack that's
> nearer to the client rather than to the DB.
> Or possibly you need to do "MEAN like" stuff but you don't want to install
> another "database".
>
> As other said using stored procedures is a two edged sword.
> It can decouple DB schema from the application or it can increase the
> coupling.
> Choosing JS for performance in the stored procedure realm is going to
> encourage coupling and make scalability harder and it is going to become a
> mess when you'll need to refactor.
>
> [1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it http://www.borgonovo.net
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] Special index for "like"-based query

2016-12-29 Thread Job
Hello,

in Postgresql 9.6 we have a query running on a very large table based, in some 
cases, on a like statement:

... WHERE FIELD='CODE' OR FIELD LIKE 'CODE_%'

Which type of index can I create to speed up the search when the "like" case
happens?

Thank you!
/F


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Gavin Flower

On 29/12/16 09:12, Francisco Olarte wrote:

On Wed, Dec 28, 2016 at 5:53 PM, Jan de Visser  wrote:

.but the term "impedance mismatch"
is at least 25 year old;

Much older, I was told it in class at least 32 years ago.


as far as I know it was coined

_Borrowed_ from electrical engineering / communication techs.

It is used to highlight how signals 'bounce' at the points of a
transmision path where impedances do not match. It extrapolates the
fact that if you have a battery with an internal resistance R the way
to extract the maximum energy on a load is for it to match the
impedance, be R too. Higher load impedance and the fraction of energy
in the load goes up, the total down. Lower load impedance and the
fraction in the load goes down, the total up. In either case absolute
power in the load goes down. Match the impedance and the energy in the
load is the maximum ( and equal to the internal loss in the battery ).

[...]

From my distant memory of studying AC stuff at University many moons ago...

You want the two impedances to be complex conjugates of each other (this 
means the MAGNITUDES will be equal) - which means the phase change 
should be equal & opposite, and the resistance to match.
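
A short check of the maximum-power claim, in notation not used above: with
source voltage V, internal resistance R and load R_L,

\[
P_L = \frac{V^2 R_L}{(R + R_L)^2}, \qquad
\frac{dP_L}{dR_L} = \frac{V^2 (R - R_L)}{(R + R_L)^3} = 0
\;\Rightarrow\; R_L = R, \qquad
P_L^{\max} = \frac{V^2}{4R},
\]

which equals the power dissipated inside the source at the optimum. For AC
sources the same argument gives the conjugate match Z_L = Z_s^* described
above.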



Cheers,
Gavin



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Rich Shepard

On Thu, 29 Dec 2016, Guyren Howe wrote:


I would like to find a book or other resource about SQL server-side
programming (stored procedures etc) best practices in general and for
Postgres in particular.


  Start here:


Rich


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Book or other resource on Postgres-local code?

2016-12-29 Thread Guyren Howe
As an aside from my last question about my LYDB effort:

https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb 


I would like to find a book or other resource about SQL server-side programming 
(stored procedures etc) best practices in general and for Postgres in 
particular.

Seems like a thing that should exist. I can find a variety of blog posts, 
mostly about Oracle and MS SQL Server, but nothing that appears particularly 
canonical or “classic”.

Is there such a resource?

[GENERAL] Write-optimized data structures

2016-12-29 Thread selforganized
Hi,

Does Postgresql have any write-optimized data structure like an LSM-tree? If
not, is there any plan to implement one?

I'm building a write-heavy OLTP application. I'm looking at write-optimized
databases like MyRocks, TokuDB, and Cassandra, but I'd prefer to stay within
Postgresql. Would PG make sense for such usage?

Thanks
Andy


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread rajmhn
Thanks Francis.

Meanwhile, I have just gone through the usage of Perl in a Postgres function.
In turn, this Postgres function can be called in pg_bulkload. I think this
task can be done. Let me give it a try.
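
A minimal sketch of that idea (the function name and the transformation are
hypothetical; whether this exact signature matches what pg_bulkload's FILTER
expects should be checked against its docs):

CREATE EXTENSION IF NOT EXISTS plperl;

-- trim and upper-case one incoming column value
CREATE OR REPLACE FUNCTION clean_code(text) RETURNS text AS $$
    my ($v) = @_;
    $v =~ s/^\s+|\s+$//g;    # strip leading/trailing whitespace
    return uc($v);           # normalize to upper case
$$ LANGUAGE plperl IMMUTABLE;

SELECT clean_code('  ab12 ');   -- returns 'AB12'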



--
View this message in context: 
http://postgresql.nabble.com/postgres-pg-bulkload-c-filter-function-in-c-programming-tp5936651p5936707.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread rajmhn
Yes. 

I was able to apply those mentioned transformations in pgloader and in a
pg_bulkload SQL filter. I have yet to take performance stats. Meanwhile, I'm
trying to figure out the best possible option. We are counting most on
performance, error handling and audit handling.

Thanks 



 




--
View this message in context: 
http://postgresql.nabble.com/postgres-pg-bulkload-c-filter-function-in-c-programming-tp5936651p5936705.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread Alvaro Herrera
rajmhn wrote:

> But how can this be accomplished when the source has hundreds of columns?
> Transformations need to be applied only to a few columns, as mentioned above.

Did you try pgloader?  See http://pgloader.io/

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread Francisco Olarte
On Thu, Dec 29, 2016 at 8:41 PM, rajmhn  wrote:
> Thanks Francis. That seems to be a good solution.

Yep, but not for your problem as ...

>
> Thought to use pg_bulkload, a third party library instead of copy, where
> reject handling can be done in efficient way.

Mine was just an idea to do the part of the load you described,
assuming pg_bulkload usage was optional. Since it is not, it will not
work. MAYBE you can use the technique to preprocess the files for
pg_bulkload ( if possible this is nice, as the good thing about
preprocessing them is you repeat until you get them right, with no
DB-touchy ).

> Transformation (FILTER)
> functions can be implemented in any language in pg_bulkload before the data
> is loaded into the table. SQL, C, PLs are OK, but you should write functions
> as fast as possible because they are called many times.



> In this case, the function would be written in Perl and called inside the
> PostgreSQL function. Do you think that will work out? But pg_bulkload
> prefers C functions over SQL functions for performance.

I'm not familiar with pg_bulkload usage. I've read about it, but all my
loading problems have been solved better by using copy ( especially
factoring in total time: I already know how to use copy, and a couple dozen
languages in which to write filters to pre-clean data for copy; in the
time it would take me to learn enough of pg_bulkload I could filter and
load a lot of data ).

Regarding C vs Perl, it seems pg_bulkload does server-side processing.
In the server the function-calling overhead is HUGE, especially when
transitioning between different languages. IMO the time spent doing
the data processing in Perl would be negligible compared with the time to
pass the data around to Perl. C will be faster because the calling
barrier is smaller inside the server.

Just for data processing of things like yours, I've normally found
filters like the one I described can easily saturate an SSD array, and
the difference in processing time is dwarfed by the difference in
time spent developing the filter. In fact, on any modern OS with
write-through and readahead disk management, the normal difference
between filtering in Perl or C is: Perl may use 10% of one core, C 1%;
the Perl filter is developed in 15 minutes, the C one in an hour; and the
Perl filter takes some extra milliseconds to start. AND, if you are not
familiar with processing data in C, you can easily code a slower solution
than in Perl ( as Perl was designed for this ).


> I will try this option as you suggested.

Just remember my option is not using pg_bulkload with Perl stored
procedures. I cannot recommend anything if you use pg_bulkload.

I suggested using copy and Perl to pre-clean the data. It just seemed
to me from the description of your problem that you were using too
complex a tool. Now that you are introducing new terms, like reject
handling, I'll step out until I can make a suggestion ( don't bother to
define it for me; it seems to be a bulkload-related term and I'm not able
to study that tool ).


FrancisCO Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread rajmhn
Thanks Francis. That seems to be a good solution.

I thought to use pg_bulkload, a third-party library, instead of copy, since
reject handling can be done there in an efficient way. Transformation
(FILTER) functions can be implemented in any language in pg_bulkload before
the data is loaded into the table. SQL, C, PLs are OK, but you should write
functions as fast as possible because they are called many times.

In this case, the function would be written in Perl and called inside the
PostgreSQL function. Do you think that will work out? But pg_bulkload
prefers C functions over SQL functions for performance.

I will try this option as you suggested.

Thanks



--
View this message in context: 
http://postgresql.nabble.com/postgres-pg-bulkload-c-filter-function-in-c-programming-tp5936651p5936695.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread Francisco Olarte
Hi:

On Thu, Dec 29, 2016 at 3:01 PM, Jan de Visser  wrote:
> On Thursday, December 29, 2016 5:10:08 AM EST rajmhn wrote:
...
>> I'm new to C. I've gone through this documentation, but it's not clear how to start.
...
> It seems to me it would be much easier to load the data into a temporary
> table, and from there transform into the form you need it to be. If you're not
> experienced in C (and probably even if you are) you can import *a lot* of data
> in the time it would take you to develop that custom filter.
> Besides, your requirements don't appear to be that difficult.

For his type of requirements I would recommend going the Perl ( any
similar language will do, but after using it for 25 years I find it is
the king for that ) + pipe route. I mean, write a script which spits out
"copy xxx from stdin\n" plus the transformed rows, and pipe it through
psql.

Total time is difficult to beat, as debugging is very easy: write the
filtering function using <> - chomp - split - s/// - join - print, and
debug it by feeding it some lines with head.

Untested code, from what I remember of the specs; it could be something like:

use strict;
use warnings;

print "Copy my_table from stdin;\n";
my $e = "The magic col-e default value";
while (<>) {
  chomp; # get rid of OPTIONAL line terminator
  # Get the columns and do the c-d column swap
  my ($a, $b, $d, $c) = split /\|/, $_;
  # Add dashes to the date in column b:
  $b =~ s/^(\d\d\d\d)(\d\d)(\d\d)$/$1-$2-$3/;
  # Zap non-numeric a to 0:
  ($a =~ /^\d+$/) or ($a = 0);
  # And send it with the default separators ( escaping left as an
  # exercise to the reader; ask if needed, I've written and tested it
  # several times ).
  print join("\t", $a, $b, $c, $d, $e), "\n";
}

Test it with "head the-datafile | perl the_script.pl |
my_favourite_pager" until it is correct; the beauty of this approach is
that you do not touch the DB while debugging, and you feed it to psql
when done. In my experience the Perl script overhead is unnoticeable on
any 2k+ machine ( and Perl was specifically designed to be good at this
kind of thing ).

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuumdb --analyze-only scans all pages?

2016-12-29 Thread Gerhard Wiesinger

On 29.12.2016 16:10, Tom Lane wrote:

Adrian Klaver  writes:

On 12/28/2016 11:54 PM, Gerhard Wiesinger wrote:

vacuumdb --analyze-only --all --verbose
INFO:  analyzing "public.log"
INFO:  "log": scanned 3 of 30851 pages, containing 3599899 live rows
and 0 dead rows; 3 rows in sample, 3702016 estimated total rows
INFO:  analyzing "public.log_details"
INFO:  "log_details": scanned 2133350 of 2133350 pages, containing
334935843 live rows and 0 dead rows; 300 rows in sample, 334935843
estimated total rows
INFO:  analyzing "public.log_details_str"
INFO:  "log_details_str": scanned 3 of 521126 pages, containing
3601451 live rows and 0 dead rows; 3 rows in sample, 62560215
estimated total rows

Any ideas why?

I would say because the '3000000 rows in sample' were spread out over
all 2133350 pages.

Worth pointing out here is that you must have a custom statistics target
set on log_details to make it want a sample so much larger than the
default.  If you feel ANALYZE is taking too long, you should reconsider
whether you need such a large target.


Thanx Tom and Adrian

Yes, there is a custom statistics target of 10000 set; I guess it was set
some time ago, for some reason, to overcome a performance problem after
the upgrade from 8.3 to 8.4.

Thanx Tom for pointing that out.

A good query to find these:
SELECT
  n.nspname AS schemaname,
  CASE
   WHEN cl.relkind = 'r' THEN CAST('TABLE' AS VARCHAR(20))
   WHEN cl.relkind = 'i' THEN CAST('INDEX' AS VARCHAR(20))
   WHEN cl.relkind = 'S' THEN CAST('SEQUENCE' AS VARCHAR(20))
   WHEN cl.relkind = 's' THEN CAST('SPECIAL' AS VARCHAR(20))
   WHEN cl.relkind = 'v' THEN CAST('VIEW' AS VARCHAR(20))
   WHEN cl.relkind = 't' THEN CAST('TOAST TABLE' AS VARCHAR(18))
   ELSE null
  END AS object_type,
  cl.relname,
  attname,
  attstattarget
FROM
  pg_attribute a
LEFT OUTER JOIN pg_class cl ON a.attrelid = cl.oid
LEFT OUTER JOIN pg_namespace n ON n.oid = cl.relnamespace
WHERE
  (cl.relkind = 'r' OR cl.relkind = 'i' OR cl.relkind = 't')
  AND attnum > 0 -- only regular columns
  AND n.nspname = 'public'  -- public schema only
  AND NOT(relname ILIKE 'pgstatspack_%')
  AND cl.relkind = 'r' -- TABLE
  AND attstattarget <> -1 -- non default values only
ORDER BY
  n.nspname,
  cl.relname,
  attnum
;

BTW: It looks like the statistics target is multiplied by 300 to get the
number of sample rows; is that true (I didn't find any documentation
about that)?

https://www.postgresql.org/docs/current/static/planner-stats.html
-- ALTER TABLE public.log_details ALTER COLUMN fk_id SET STATISTICS 10000;
-- ALTER TABLE public.log_details ALTER COLUMN fk_keyid SET STATISTICS 10000;

-- Default is 100, means 300*100 = 30000 rows (30k)
-- Max is 10000, means 300*10000 = 3000000 rows (3 Mio)
ALTER TABLE public.log_details ALTER COLUMN fk_id SET STATISTICS -1;
ALTER TABLE public.log_details ALTER COLUMN fk_keyid SET STATISTICS -1;
https://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET

Thnx.

Ciao,
Gerhard



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] LYDB: What advice about stored procedures and other server side code?

2016-12-29 Thread Adrian Klaver

On 12/29/2016 02:12 AM, Christoph Moench-Tegeder wrote:

## Karsten Hilbert (karsten.hilb...@gmx.net):


Many applications are not designed to have a "stable" database API.

It seems OP is arguing they should.


Well, if the environment allows for that, fine. If not, well, duh.


Been following along and while there are several views there seems to be 
one constant: "there should be a method to the madness". For a 
particular situation a best try should be made to analyze what the 
current and future needs are and plan a course of action that meets 
those needs the best. This could be database-centric or app-centric. In 
a perfect world this would be a non-biased decision, though I suspect it 
tends to follow the inclinations/backgrounds of those involved in the 
planning. In any case a defined course of action that is set out and 
followed would seem to be better than arbitrarily bolting on 
new/different methods on a whim.





Regards,
Christoph




--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuumdb --analyze-only scans all pages?

2016-12-29 Thread Tom Lane
Adrian Klaver  writes:
> On 12/28/2016 11:54 PM, Gerhard Wiesinger wrote:
>> vacuumdb --analyze-only --all --verbose
>> INFO:  analyzing "public.log"
>> INFO:  "log": scanned 3 of 30851 pages, containing 3599899 live rows
>> and 0 dead rows; 3 rows in sample, 3702016 estimated total rows
>> INFO:  analyzing "public.log_details"
>> INFO:  "log_details": scanned 2133350 of 2133350 pages, containing
>> 334935843 live rows and 0 dead rows; 300 rows in sample, 334935843
>> estimated total rows
>> INFO:  analyzing "public.log_details_str"
>> INFO:  "log_details_str": scanned 3 of 521126 pages, containing
>> 3601451 live rows and 0 dead rows; 3 rows in sample, 62560215
>> estimated total rows
>> 
>> Any ideas why?

> I would say because the '3000000 rows in sample' were spread out over 
> all 2133350 pages.

Worth pointing out here is that you must have a custom statistics target
set on log_details to make it want a sample so much larger than the
default.  If you feel ANALYZE is taking too long, you should reconsider
whether you need such a large target.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] vacuumdb --analyze-only scans all pages?

2016-12-29 Thread Adrian Klaver

On 12/28/2016 11:54 PM, Gerhard Wiesinger wrote:

Hello,

PostgreSQL 9.6.1: after a pg_dump/restore procedure it scans all pages
(at least for some of the tables; the analyze-only switch is specified).

I would expect that only the sample rows are scanned.

"log_details": scanned 2133350 of 2133350 pages

vacuumdb --analyze-only --all --verbose
INFO:  analyzing "public.log"
INFO:  "log": scanned 3 of 30851 pages, containing 3599899 live rows
and 0 dead rows; 3 rows in sample, 3702016 estimated total rows
INFO:  analyzing "public.log_details"
INFO:  "log_details": scanned 2133350 of 2133350 pages, containing
334935843 live rows and 0 dead rows; 300 rows in sample, 334935843
estimated total rows
INFO:  analyzing "public.log_details_str"
INFO:  "log_details_str": scanned 3 of 521126 pages, containing
3601451 live rows and 0 dead rows; 3 rows in sample, 62560215
estimated total rows

Any ideas why?


I would say because the '3000000 rows in sample' were spread out over 
all 2133350 pages.




Thnx.

Ciao,

Gerhard






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Er Data Modeller for PostgreSQL

2016-12-29 Thread Scott Mead
On Thu, Dec 29, 2016 at 8:59 AM, Rich Shepard 
wrote:

> On Thu, 29 Dec 2016, Nicolas Paris wrote:
>
> Hi I'd like to tell about Sql Power Architect
>>
>
> Nicholas,
>
>   SPA was going to be my next re-examination after dbeaver. Since the
> latter
> easily accomplished what I needed I stopped there.
>

I've actually used SQL Power Architect before as well.  I like Schema Spy
because it's a quick, read-only way to give me a fast report.  SQL Power
Architect is nice because it's graphical and you can interact with the
DBA.  I've also used DBVisualizer for this (I actually love DBVisualizer),
but that tool is geared toward interacting with the database and has
visualization as an extra.  I'll have to try dbeaver; I've heard of it
but haven't used it.

  Schema Spy is great, but I do always have to come back and find the
exact command needed to kick it off; its CLI is a bit over-complicated.

--Scott



>
> Thanks for the reminder,
>
>
> Rich
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
--
Scott Mead
Sr. Architect
*OpenSCG *
http://openscg.com


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread rajmhn
Thanks Jan

I'm converting the code from Oracle to Postgres. Both versions of the code
will be available for different users.

In Oracle, we do this kind of transformation in SQL*Loader, and we need to
follow the same kind of approach in Postgres. The SQL filter approach was
very easy in terms of coding, but from the documentation I found, the C
filter is very much faster than SQL.

I'm very new to C. I tried a few samples, as mentioned in this link.

https://www.postgresql.org/docs/current/static/xfunc-c.html

It worked. 

But how can this be accomplished when the source has hundreds of columns?
Transformations need to be applied only to a few columns, as mentioned above.

Totally stuck, and nobody is here to help. If you can guide me with some
syntax for the sample data provided above, I will be grateful. I can manage
it from there.

Thanks 



--
View this message in context: 
http://postgresql.nabble.com/postgres-pg-bulkload-c-filter-function-in-c-programming-tp5936651p5936658.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread Jan de Visser
On Thursday, December 29, 2016 5:10:08 AM EST rajmhn wrote:
> Gurus,
> 
> I am reading data from a file and loading it using pg_bulkload with a C
> filter. As per the documentation, a C filter is much faster than a SQL filter.
> 
> I'm new to C. I've gone through this documentation, but it's not clear how to start.
> https://www.postgresql.org/docs/current/static/xfunc-c.html.
> 
> Can someone kindly guide me in creating the C code, so that it can be called
> in a Postgres function?
> 

It seems to me it would be much easier to load the data into a temporary 
table, and from there transform into the form you need it to be. If you're not 
experienced in C (and probably even if you are) you can import *a lot* of data 
in the time it would take you to develop that custom filter. 

Besides, your requirements don't appear to be that difficult.
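
For illustration, a minimal sketch of that temporary-table route against
the sample layout from the original post (the names staging, target_table
and the default strings are invented here; adjust to taste):

-- Stage the raw file with every field as nullable text.
CREATE TEMP TABLE staging
  (f1 text, f2 text, f3 text, f4 text, f5 text, f6 text);

-- Run from psql; the '|' delimiter matches the sample data.
\copy staging FROM 'input.dat' WITH (FORMAT csv, DELIMITER '|')

-- Transform while inserting, applying the rules from the original post.
INSERT INTO target_table (cola, colb, colc, cold, cole)
SELECT
  CASE WHEN f1 ~ '^[0-9]+$'
       THEN f1::numeric(5,0) ELSE 0 END,     -- non-numeric COLA becomes 0
  to_date(f2, 'YYYYMMDD'),                   -- text date -> date
  coalesce(nullif(f4, ''), 'colc-default'),  -- COLC defaulted when empty
  trim(both ' ' from f3),                    -- COLD trimmed (note c/d swap)
  'cole-default'                             -- COLE is not in the file
FROM staging;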



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Er Data Modeller for PostgreSQL

2016-12-29 Thread Rich Shepard

On Thu, 29 Dec 2016, Nicolas Paris wrote:


Hi I'd like to tell about Sql Power Architect


Nicholas,

  SPA was going to be my next re-examination after dbeaver. Since the latter
easily accomplished what I needed I stopped there.

Thanks for the reminder,

Rich


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] postgres pg_bulkload c filter function in c programming

2016-12-29 Thread rajmhn
Gurus,

I am reading data from a file and loading it using pg_bulkload with a C
filter. As per the documentation, a C filter is much faster than a SQL filter.

I'm new to C. I've gone through this documentation, but it's not clear how
to start:
https://www.postgresql.org/docs/current/static/xfunc-c.html

Can someone kindly guide me in creating the C code, so that it can be called
in a Postgres function?

Here is the sample data. The file has no header.

Sample Data:
ABC|20170101|DEF ||GHIJ|KLM

Target Table Definition:
COLA numeric(5,0)
COLB date
COLC text
COLD text
COLE text

First column should be mapped to COLA
Second column should be mapped to COLB
Third column should be mapped to COLD
Fourth column should be mapped to COLC
Fifth target column (COLE) should be mapped to some default value (the
column is not present in the source)

Transformation:
a) First column should be mapped to COLA. It is numeric in the target table.
If any alpha characters are present, default this column to '0'.
Otherwise, the source value should be moved to the table.
b) Second column should be mapped to COLB, via a TO_DATE-style conversion
from text. The file will have the date in YYYYMMDD format. It should be
converted to a date.
c) Third column should be mapped to COLD. Both leading and trailing spaces
need to be trimmed.
d) Fourth column should be mapped to COLC. If it is NULL, some value should
be defaulted.
e) Only a few columns from the source file should be loaded. In this case,
only the first four columns should be loaded.
f) Different ordering between source file and target columns. In this case,
the third column should be mapped to COLD and
the fourth column should be mapped to COLC.
g) COLE should be loaded with a default value. This column is not present
in the source file.

These transformations can be handled with a query after loading all the data
as nullable varchar, but we need to handle them before loading, just as
we do in Oracle.

Do you think these functionalities can be accomplished using C programming?
Some standard syntax for writing them would be greatly
helpful.

Thanks
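
For reference, the same transformations written as a SQL function of the
kind pg_bulkload's FILTER option can call - a sketch only: load_filter and
the default strings are invented names, and the exact calling convention
(raw text fields in, a row of the target table type out) should be
double-checked against the pg_bulkload documentation:

CREATE FUNCTION load_filter(f1 text, f2 text, f3 text, f4 text)
RETURNS target_table AS $$
  SELECT
    CASE WHEN f1 ~ '^[0-9]+$'
         THEN f1::numeric(5,0) ELSE 0 END,    -- a) non-numeric -> 0
    to_date(f2, 'YYYYMMDD'),                  -- b) YYYYMMDD text -> date
    coalesce(nullif(f4, ''), 'colc-default'), -- d) default when empty
    trim(both ' ' from f3),                   -- c) + f) trim, c/d swap
    'cole-default'::text                      -- g) not in the source file
$$ LANGUAGE sql;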




--
View this message in context: 
http://postgresql.nabble.com/postgres-pg-bulkload-c-filter-function-in-c-programming-tp5936651.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Ivan Sergio Borgonovo

On 12/29/2016 10:35 AM, Pavel Stehule wrote:


2016-12-29 10:03 GMT+01:00 Tim Uckun:

I think it's awesome that postgres allows you to code in different
languages like this. It really is a unique development environment
and one that is overlooked as a development platform.  It would be
nice if more languages were delivered in the default package
especially lua, V8 and mruby.


It is about dependencies and maintenance. There are not too many people
who have good experience with embedding Lua, V8 and others in C. Anyone
who can do some work is welcome.

Living outside the main package has disadvantages - only enthusiasts
know about it - but some advantages too: you are not tied to the
PostgreSQL development cycle, and development can be faster.


I'll add my 2 cents.

PostgreSQL, and SQL in general, are about integrity and coherency.
Checking coherency is much easier with strict data types.
PL/pgSQL gives you that; JS is far, far away from that.

PostgreSQL is a very flexible database and you can stretch it to do 
"MEAN like"[1] stuff, but that's going to increase your "impedance mismatch".


If you think there is some space for JS in your application stack, it's
nearer to the client than to the DB.
Or possibly you need to do "MEAN like" stuff but you don't want to 
install another "database".


As others have said, using stored procedures is a two-edged sword.
It can decouple the DB schema from the application, or it can increase the 
coupling.
Choosing JS for performance in the stored-procedure realm is going to 
encourage coupling and make scalability harder, and it is going to become 
a mess when you need to refactor.


[1] https://en.wikipedia.org/wiki/MEAN_(software_bundle)

--
Ivan Sergio Borgonovo
http://www.webthatworks.it http://www.borgonovo.net



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Er Data Modeller for PostgreSQL

2016-12-29 Thread Nicolas Paris
2016-12-29 1:03 GMT+01:00 Rich Shepard :

> On Wed, 28 Dec 2016, Adrian Klaver wrote:
>
> An example from my machine that works:
>> aklaver@tito:~/bin> java -jar schemaSpy_5.0.0.jar -t pgsql -s public -u
>> postgres  -db production -host localhost  -dp 
>> /home/aklaver/bin/postgresql-9.4.1212.jre6.jar
>> -o s_spy
>>
>
> Adrian,
>
>   That's interesting. I specified my username, not postgres, since that's
> how I access the databases from the psql CLI. But, since I've resolved the
> issue to my satisfaction and deleted the schemaSpy subdirectory I've no
> incentive to try again.
>
>   Time now to fill them thar tables with fake data so I can test as I
> develop.
>
> Happy new year to you,
>
> Rich
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Hi, I'd like to tell you about SQL Power Architect
(http://www.sqlpower.ca/page/architect), an open-source tool. It is well
explained on the website, but note interesting features such as:
- comparing 2 database structures, and generating statements to harmonize
differences, in SQL or in Liquibase ( http://www.liquibase.org/ )
- infers the implicit FKs from names
- multiplatform / multi-database (based on JDBC drivers)
- generates giant PDFs for giant models

Thanks for the link to the CLI SchemaSpy; a pity the export is in HTML and
not reusable stuff like CSV.


Re: [GENERAL] LYDB: What advice about stored procedures and other server side code?

2016-12-29 Thread Christoph Moench-Tegeder
## Karsten Hilbert (karsten.hilb...@gmx.net):

> > Many applications are not designed to have a "stable" database API.
> It seems OP is arguing they should.

Well, if the environment allows for that, fine. If not, well, duh.

Regards,
Christoph

-- 
Spare Space


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Pavel Stehule
2016-12-29 10:03 GMT+01:00 Tim Uckun :

> I think it's awesome that postgres allows you to code in different
> languages like this. It really is a unique development environment and one
> that is overlooked as a development platform.  It would be nice if more
> languages were delivered in the default package especially lua, V8 and
> mruby.
>

It is about dependencies and maintenance. There are not too many people who
have good experience with embedding Lua, V8 and others in C. Anyone who can
do some work is welcome.

Living outside the main package has disadvantages - only enthusiasts know
about it - but some advantages too: you are not tied to the PostgreSQL
development cycle, and development can be faster.

Regards

Pavel


>
>
> On Thu, Dec 29, 2016 at 9:31 PM, Chris Travers 
> wrote:
>
>> My recommendation.  See them as tools in a toolkit, not a question of
>> what is best.
>>
>> For places where you have SQL statements as primary do SQL or PLPGSQL
>> functions.
>>
>> For places where you are manipulating values (parsing strings for
>> example) use something else (I usually use pl/perl for string manipulation
>> but ymmv).
>>
>> PLPGSQL works best where you have a large query and some procedurally
>> supporting logic.  It becomes a lot less usable, performant, and
>> maintainable the further you get away from that.
>>
>> So there is no best, just different tools in a toolkit.
>>
>
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Pavel Stehule
2016-12-29 10:04 GMT+01:00 Tim Uckun :

> Mostly generating SQL statements to execute. Like for example deciding
> which partition to insert into.
>

Then you won't find any real performance difference - a query is
about 10-100x slower than an expression - so plpgsql should be good.

Moreover, you can use the "format" function, which is implemented in C.
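
As an illustration of that pattern - a minimal sketch of a
partition-routing function built on format() and EXECUTE (the
measurements naming scheme and columns are invented for the example):

CREATE FUNCTION insert_measurement(p_ts timestamptz, p_val numeric)
RETURNS void AS $$
BEGIN
  -- Build the partition name from the timestamp; %I quotes it as an
  -- identifier, and EXECUTE ... USING passes the values as parameters.
  EXECUTE format('INSERT INTO %I (ts, val) VALUES ($1, $2)',
                 'measurements_y' || to_char(p_ts, 'YYYY') ||
                 'm' || to_char(p_ts, 'MM'))
  USING p_ts, p_val;
END;
$$ LANGUAGE plpgsql;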

Regards

Pavel




>
> On Thu, Dec 29, 2016 at 10:00 PM, Pavel Stehule 
> wrote:
>
>>
>>
>> 2016-12-29 9:23 GMT+01:00 Tim Uckun :
>>
>>> I am not doubting the efficacy of stored procs, just wondering which
>>> language is better. From the sound of it string manipulation is slow in
>>> PL-PGSQL, but looking at my procs there does seem to be a lot of string
>>> manipulation going on, so maybe I had better do some tests.
>>>
>>
>> It is interesting - what string operations are you doing in stored
>> procedures?
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>>
>>> On Thu, Dec 29, 2016 at 3:02 AM, Mike Sofen  wrote:
>>>
 *From:* Tim Uckun
 I have seen various links on the internet which indicate that PLV8 is
 significantly faster than PL-PGSQL, sometimes an order of magnitude faster.



 Is there any benefit to choosing PL-PGSQL?

 

 I can’t speak to PLV8.  However, I can speak to plpgsql, and
 specifically stored functions (procs).  I use it exclusively to create a
 database API for real-time web applications to hit.  My API calls (procs)
 are hitting large tables, sometimes doing complex logic within the sproc.
 It allows me to provide a simple, standardized interface to the web devs,
 allowing them to focus on the app code work.



 Performance is superb and continues to surprise me (I came from the SQL
 Server world).  As others have mentioned, the natural lashup of plpgsql to
 postgres (I liked Alban’s term, “impedance”), is a key aspect.  Also:



- stored procs provide another security layer against sql
 injection attacks.

- Caching SEEMS to be more efficient/effective with stored
 procs (that could be wishful thinking too).

- Stored procs allow skilled sql practitioners to provide far
 more sophisticated sql solutions than the typical python developer is
 capable of…my experience is that most web devs don’t really understand
 databases (or even care about them – they are a necessary evil), so
 providing a pure encapsulated sql solution (via stored procs) removes that
 mental impedance mismatch.

- Performance?  Simple “get” procs that return data for a
 specific indexed query against larger tables (50m+ rows) in a few
 milliseconds…I can live with that kind of performance.

- I’m also doing some heavy lifting in the sql, calculating
 histograms and boxplots for data visualizations.  This is an unusual
 scenario, but the other option is sending a massive chunk of data to
 another server for processing – just the transit time would kill the deal.
 I am mindful that at a certain point, there won’t be enough memory and i/o
 to go around, but the web app is a low user count/high user task complexity
 app, so I’ve tailored the model to match.



 Mike Sofen  (Synthetic Genomics)

>>>
>>>
>>
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
Mostly generating SQL statements to execute. Like for example deciding
which partition to insert into.

On Thu, Dec 29, 2016 at 10:00 PM, Pavel Stehule 
wrote:

>
>
> 2016-12-29 9:23 GMT+01:00 Tim Uckun :
>
>> I am not doubting the efficacy of stored procs, just wondering which
>> language is better. From the sound of it string manipulation is slow in
>> PL-PGSQL, but looking at my procs there does seem to be a lot of string
>> manipulation going on, so maybe I had better do some tests.
>>
>
> It is interesting - what string operations are you doing in stored
> procedures?
>
> Regards
>
> Pavel
>
>
>>
>>
>> On Thu, Dec 29, 2016 at 3:02 AM, Mike Sofen  wrote:
>>
>>> *From:* Tim Uckun
>>> I have seen various links on the internet which indicate that PLV8 is
>>> significantly faster than PL-PGSQL, sometimes an order of magnitude faster.
>>>
>>>
>>>
>>> Is there any benefit to choosing PL-PGSQL?
>>>
>>> 
>>>
>>> I can’t speak to PLV8.  However, I can speak to plpgsql, and
>>> specifically stored functions (procs).  I use it exclusively to create a
>>> database API for real-time web applications to hit.  My API calls (procs)
>>> are hitting large tables, sometimes doing complex logic within the sproc.
>>> It allows me to provide a simple, standardized interface to the web devs,
>>> allowing them to focus on the app code work.
>>>
>>>
>>>
>>> Performance is superb and continues to surprise me (I came from the SQL
>>> Server world).  As others have mentioned, the natural lashup of plpgsql to
>>> postgres (I liked Alban’s term, “impedance”), is a key aspect.  Also:
>>>
>>>
>>>
>>> - stored procs provide another security layer against sql
>>> injection attacks.
>>>
>>> - Caching SEEMS to be more efficient/effective with stored procs
>>> (that could be wishful thinking too).
>>>
>>> - Stored procs allow skilled sql practitioners to provide far
>>> more sophisticated sql solutions than the typical python developer is
>>> capable of…my experience is that most web devs don’t really understand
>>> databases (or even care about them – they are a necessary evil), so
>>> providing a pure encapsulated sql solution (via stored procs) removes that
>>> mental impedance mismatch.
>>>
>>> - Performance?  Simple “get” procs that return data for a
>>> specific indexed query against larger tables (50m+ rows) in a few
>>> milliseconds…I can live with that kind of performance.
>>>
>>> - I’m also doing some heavy lifting in the sql, calculating
>>> histograms and boxplots for data visualizations.  This is an unusual
>>> scenario, but the other option is sending a massive chunk of data to
>>> another server for processing – just the transit time would kill the deal.
>>> I am mindful that at a certain point, there won’t be enough memory and i/o
>>> to go around, but the web app is a low user count/high user task complexity
>>> app, so I’ve tailored the model to match.
>>>
>>>
>>>
>>> Mike Sofen  (Synthetic Genomics)
>>>
>>
>>
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
I think it's awesome that postgres allows you to code in different
languages like this. It really is a unique development environment and one
that is overlooked as a development platform.  It would be nice if more
languages were delivered in the default package especially lua, V8 and
mruby.



On Thu, Dec 29, 2016 at 9:31 PM, Chris Travers 
wrote:

> My recommendation.  See them as tools in a toolkit, not a question of what
> is best.
>
> For places where you have SQL statements as primary do SQL or PLPGSQL
> functions.
>
> For places where you are manipulating values (parsing strings for example)
> use something else (I usually use pl/perl for string manipulation but ymmv).
>
> PLPGSQL works best where you have a large query and some procedurally
> supporting logic.  It becomes a lot less usable, performant, and
> maintainable the further you get away from that.
>
> So there is no best, just different tools in a toolkit.
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Pavel Stehule
2016-12-29 9:23 GMT+01:00 Tim Uckun :

> I am not doubting the efficacy of stored procs, just wondering which
> language is better. From the sound of it string manipulation is slow in
> PL-PGSQL, but looking at my procs there does seem to be a lot of string
> manipulation going on, so maybe I had better do some tests.
>

It is interesting - what string operations are you doing in stored
procedures?

Regards

Pavel


>
>
> On Thu, Dec 29, 2016 at 3:02 AM, Mike Sofen  wrote:
>
>> *From:* Tim Uckun
>> I have seen various links on the internet which indicate that PLV8 is
>> significantly faster than PL-PGSQL, sometimes an order of magnitude faster.
>>
>>
>>
>> Is there any benefit to choosing PL-PGSQL?
>>
>> 
>>
>> I can’t speak to PLV8.  However, I can speak to plpgsql, and specifically
>> stored functions (procs).  I use it exclusively to create a database API
>> for real-time web applications to hit.  My API calls (procs) are hitting
>> large tables, sometimes doing complex logic within the sproc.  It allows me
>> to provide a simple, standardized interface to the web devs, allowing them
>> to focus on the app code work.
>>
>>
>>
>> Performance is superb and continues to surprise me (I came from the SQL
>> Server world).  As others have mentioned, the natural lashup of plpgsql to
>> postgres (I liked Alban’s term, “impedance”), is a key aspect.  Also:
>>
>>
>>
>> - stored procs provide another security layer against sql
>> injection attacks.
>>
>> - Caching SEEMS to be more efficient/effective with stored procs
>> (that could be wishful thinking too).
>>
>> - Stored procs allow skilled sql practitioners to provide far
>> more sophisticated sql solutions than the typical python developer is
>> capable of…my experience is that most web devs don’t really understand
>> databases (or even care about them – they are a necessary evil), so
>> providing a pure encapsulated sql solution (via stored procs) removes that
>> mental impedance mismatch.
>>
>> - Performance?  Simple “get” procs that return data for a
>> specific indexed query against larger tables (50m+ rows) in a few
>> milliseconds…I can live with that kind of performance.
>>
>> - I’m also doing some heavy lifting in the sql, calculating
>> histograms and boxplots for data visualizations.  This is an unusual
>> scenario, but the other option is sending a massive chunk of data to
>> another server for processing – just the transit time would kill the deal.
>> I am mindful that at a certain point, there won’t be enough memory and i/o
>> to go around, but the web app is a low user count/high user task complexity
>> app, so I’ve tailored the model to match.
>>
>>
>>
>> Mike Sofen  (Synthetic Genomics)
>>
>
>


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Chris Travers
My recommendation.  See them as tools in a toolkit, not a question of what
is best.

For places where you have SQL statements as primary do SQL or PLPGSQL
functions.

For places where you are manipulating values (parsing strings for example)
use something else (I usually use pl/perl for string manipulation but ymmv).

PLPGSQL works best where you have a large query and some procedurally
supporting logic.  It becomes a lot less usable, performant, and
maintainable the further you get away from that.

So there is no best, just different tools in a toolkit.
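
Concretely - a made-up illustration of that split, where the query-shaped
task stays in SQL and the string munging goes to pl/perl (the users table
and both function names are hypothetical):

-- Query-shaped logic: a thin SQL wrapper around one statement.
CREATE FUNCTION active_users_since(p_when timestamptz)
RETURNS SETOF users AS $$
  SELECT * FROM users WHERE last_login >= p_when;
$$ LANGUAGE sql STABLE;

-- Value manipulation: hand the string work to pl/perl.
CREATE FUNCTION slugify(text) RETURNS text AS $$
  my ($t) = @_;
  $t = lc($t);
  $t =~ s/[^a-z0-9]+/-/g;   # collapse runs of non-alphanumerics
  $t =~ s/^-+|-+$//g;       # trim leading/trailing dashes
  return $t;
$$ LANGUAGE plperl IMMUTABLE;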


Re: [GENERAL] Performance PLV8 vs PLPGSQL

2016-12-29 Thread Tim Uckun
I am not doubting the efficacy of stored procs, just wondering which
language is better. From the sound of it string manipulation is slow in
PL-PGSQL, but looking at my procs there does seem to be a lot of string
manipulation going on, so maybe I had better do some tests.



On Thu, Dec 29, 2016 at 3:02 AM, Mike Sofen  wrote:

> *From:* Tim Uckun
> I have seen various links on the internet which indicate that PLV8 is
> significantly faster than PL-PGSQL, sometimes an order of magnitude faster.
>
>
>
> Is there any benefit to choosing PL-PGSQL?
>
> 
>
> I can’t speak to PLV8.  However, I can speak to plpgsql, and specifically
> stored functions (procs).  I use it exclusively to create a database API
> for real-time web applications to hit.  My API calls (procs) are hitting
> large tables, sometimes doing complex logic within the sproc.  It allows me
> to provide a simple, standardized interface to the web devs, allowing them
> to focus on the app code work.
>
>
>
> Performance is superb and continues to surprise me (I came from the SQL
> Server world).  As others have mentioned, the natural lashup of plpgsql to
> postgres (I liked Alban’s term, “impedance”), is a key aspect.  Also:
>
>
>
> - stored procs provide another security layer against sql
> injection attacks.
>
> - Caching SEEMS to be more efficient/effective with stored procs
> (that could be wishful thinking too).
>
> - Stored procs allow skilled sql practitioners to provide far more
> sophisticated sql solutions than the typical python developer is capable
> of…my experience is that most web devs don’t really understand databases
> (or even care about them – they are a necessary evil), so providing a pure
> encapsulated sql solution (via stored procs) removes that mental impedance
> mismatch.
>
> - Performance?  Simple “get” procs that return data for a specific
> indexed query against larger tables (50m+ rows) in a few milliseconds…I can
> live with that kind of performance.
>
> - I’m also doing some heavy lifting in the sql, calculating
> histograms and boxplots for data visualizations.  This is an unusual
> scenario, but the other option is sending a massive chunk of data to
> another server for processing – just the transit time would kill the deal.
> I am mindful that at a certain point, there won’t be enough memory and i/o
> to go around, but the web app is a low user count/high user task complexity
> app, so I’ve tailored the model to match.
>
>
>
> Mike Sofen  (Synthetic Genomics)
>