Re: [PERFORM] synchronization between PostgreSQL and Oracle

2017-10-12 Thread Rick Otten
On Thu, Oct 12, 2017 at 5:13 AM, ROS Didier  wrote:

> Hi
>
>I would like your advice  and recommendation about the
> following infrastructure problem :
>
> What is the best way to optimize synchronization between an instance
> PostgreSQL on Windows 7 workstation and an Oracle 11gR2 database on linux
> RHEL  ?
>
> Here are more detailed explanations
>
> In our company we have people who collect data in a 9.6 postgresql
> instance on their workstation that is disconnected from the internet.
>
> In the evening, they connect to the Internet and synchronize the collected
> data to a remote 11gr2 Oracle database.
>
> What is the best performant way to do this ( Oracle_FDW ?, flat files ?, …)
>
>
>
There are several ways to go about this, but for your use case I'd
recommend SymmetricDS -- http://symmetricds.org   (or for the commercial
version:  http://jumpmind.com)

SymmetricDS was originally designed to collect data from cash registers in
a vastly distributed set of small databases and aggregate those results
back into both regional and national data warehouses.  It also pushed data
the other way - when pricing was updated at corporate headquarters, the
data was pushed back into the cash registers.  It works with a wide variety
of database technologies, scales well, and has many synchronization
options.  It is also being used by some organizations these days to
synchronize small databases on iOS and Android devices with their parent
databases back at HQ.

I first used it to implement an Oracle to PostgreSQL data migration that
had to be done without downtime.  I've used it successfully for real-time
data pushes from MySQL and PG OLTP systems into an Oracle data mart.  I
also used to use it for PostgreSQL bidirectional replication before other
tools became easier to use.  Because of its great flexibility, SymmetricDS
has a ton of knobs, buttons, and configuration options, and it may take a
while to get it working optimally.  If you are short on time to implement
a solution, I'd suggest going with the commercial version.


[PERFORM] max partitions behind a view?

2017-09-18 Thread Rick Otten
I use materialized views to cache results from a foreign data wrapper to a
high latency, fairly large (cloud) Hadoop instance.  In order to boost
refresh times I split the FDW and materialized views up into partitions.

Note:  I can't use pg_partman or native partitioning because those don't
really work with this architecture - they are designed for "real" tables.
I can't really use Citus because it isn't FDW/matview-aware at this time
either.

I then join the various materialized views together with a regular view
made up of a bunch of 'union all' statements.

I have a set of functions which automatically create the new partitions and
then replace the top level view to add them in on the fly.  At this time I
probably have about 60 partitions.
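
For illustration, the shape of it is roughly this (all of the names here
are made up):

CREATE MATERIALIZED VIEW mv_part_2017_09 AS
  SELECT * FROM fdw_table_2017_09;
CREATE INDEX ON mv_part_2017_09 (some_key);

CREATE OR REPLACE VIEW all_data AS
  SELECT * FROM mv_part_2017_08
  UNION ALL
  SELECT * FROM mv_part_2017_09;   -- one SELECT per partition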

With that approach I can refresh individual chunks of data, or I can
refresh several chunks in parallel.  Generally this has been working pretty
well.  One side effect is that because this is not a real partition, the
planner does have to check each partition whenever I run a query to see if
it has the data I need.  With appropriate indexes this is OK; checking the
partitions that don't have the data is very quick.  It does make for some
long explain outputs though.

The challenge is that because of an exponential rate of data growth, I
might have to significantly increase the number of partitions I'm working
with - to several hundred at a minimum and potentially more than 1000...

This leads me to the question: how many 'union all' statements can I have
in one view?  Should I create a hierarchy of views to gradually roll the
data up instead of putting them all in one top-level view?


Re: [PERFORM] performance problem on big tables

2017-08-14 Thread Rick Otten
Moving that many gigs of data across your network could also take a long
time simply depending on your network configuration.  Before spending a
huge amount of energy tuning postgresql, I'd probably look at how long it
takes to simply copy 20 or 30 G of data between the two machines.



> El 14 ago 2017, a las 15:24, Mariel Cherkassky <
> mariel.cherkas...@gmail.com> escribió:
>
> I have performance issues with two big tables. Those tables are located on
> an oracle remote database. I'm running the query: insert into
> local_postgresql_table select * from oracle_remote_table.
>
> The first table has 45M records and its size is 23G. The import of the
> data from the oracle remote database is taking 1 hour and 38 minutes. After
> that I create 13 regular indexes on the table and it takes 10 minutes per
> table ->2 hours and 10 minutes in total.
>
> The second table has 29M records and its size is 26G. The import of the
> data from the oracle remote database is taking 2 hours and 30 minutes. The
> creation of the indexes takes 1 hours and 30 minutes (some are indexes on
> one column and the creation takes 5 min and some are indexes on multiples
> column and it takes 11 min.
>
>
>


Re: [PERFORM] Very poor read performance, query independent

2017-07-12 Thread Rick Otten
On Wed, Jul 12, 2017 at 9:38 AM, Charles Nadeau <charles.nad...@gmail.com>
wrote:

> Rick,
>
> Should the number of page should always be correlated to the VmPeak of the
> postmaster or could it be set to reflect shared_buffer or another setting?
> Thanks!
>
>
The documentation implies that you may need to adjust its size when you
change the shared_buffers setting.

I usually check it every now and then (I haven't built a formal monitor
yet) to see how many of the huge pages are free versus used.  If it looks
like they are all getting consumed, consider bumping the setting higher.
If there are lots free, you are probably fine.

cat /proc/meminfo | grep -i "^huge"

--

Also regarding my note on effective_io_concurrency, which I'm not sure you
tried tweaking yet.

With file system and hardware caching between you and your spindles, your
best setting for effective_io_concurrency may be much higher than the
actual number of spindles.   It is worth experimenting with.   If you can,
try several values.  You can use pgbench to put consistent workloads on
your database for measurement purposes.
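
For example, something along these lines, where "t" stands in for one of
your tables with an index on "val" (effective_io_concurrency mainly affects
prefetching during bitmap heap scans):

SET effective_io_concurrency = 4;
EXPLAIN (ANALYZE, BUFFERS)
  SELECT count(*) FROM t WHERE val BETWEEN 100 AND 200;

SET effective_io_concurrency = 256;
EXPLAIN (ANALYZE, BUFFERS)
  SELECT count(*) FROM t WHERE val BETWEEN 100 AND 200;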


Charles
>
> On Mon, Jul 10, 2017 at 5:25 PM, Rick Otten <rottenwindf...@gmail.com>
> wrote:
>
>> Although probably not the root cause, at the least I would set up
>> hugepages  ( https://www.postgresql.org/docs/9.6/static/kernel-resourc
>> es.html#LINUX-HUGE-PAGES ), and bump effective_io_concurrency up quite a
>> bit as well (256 ?).
>>
>>


Re: [PERFORM] Very poor read performance, query independent

2017-07-10 Thread Rick Otten
Although probably not the root cause, at the least I would set up hugepages
 (
https://www.postgresql.org/docs/9.6/static/kernel-resources.html#LINUX-HUGE-PAGES
), and bump effective_io_concurrency up quite a bit as well (256 ?).


On Mon, Jul 10, 2017 at 10:03 AM, Charles Nadeau 
wrote:

> I’m running PostgreSQL 9.6.3 on Ubuntu 16.10 (kernel 4.4.0-85-generic).
> Hardware is:
>
> *2x Intel Xeon E5550
>
> *72GB RAM
>
> *Hardware RAID10 (4 x 146GB SAS 10k) P410i controller with 1GB FBWC (80%
> read/20% write) for Postgresql data only:
>
> Logical Drive: 3
>
> Size: 273.4 GB
>
> Fault Tolerance: 1+0
>
> Heads: 255
>
> Sectors Per Track: 32
>
> Cylinders: 65535
>
> Strip Size: 128 KB
>
> Full Stripe Size: 256 KB
>
> Status: OK
>
> Caching: Enabled
>
> Unique Identifier: 600508B1001037383941424344450A00
>
> Disk Name: /dev/sdc
>
> Mount Points: /mnt/data 273.4 GB
>
> OS Status: LOCKED
>
> Logical Drive Label: A00A194750123456789ABCDE516F
>
> Mirror Group 0:
>
> physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 146 GB, OK)
>
> physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 146 GB, OK)
>
> Mirror Group 1:
>
> physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SAS, 146 GB, OK)
>
> physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS, 146 GB, OK)
>
> Drive Type: Data
>
> Formatted with ext4 with: sudo mkfs.ext4 -E stride=32,stripe_width=64 -v
> /dev/sdc1.
>
> Mounted in /etc/fstab with this line: 
> "UUID=99fef4ae-51dc-4365-9210-0b153b1cbbd0
> /mnt/data ext4 rw,nodiratime,user_xattr,noatime,nobarrier,errors=remount-ro
> 0 1"
>
> Postgresql is the only application running on this server.
>
>
> Postgresql is used as a mini data warehouse to generate reports and do
> statistical analysis. It is used by at most 2 users and fresh data is added
> every 10 days. The database has 16 tables: one is 224GB big and the rest
> are between 16kB and 470MB big.
>
>
> My configuration is:
>
>
> name | current_setting | source
>
> -+--+--
>
> application_name | psql | client
>
> autovacuum_vacuum_scale_factor | 0 | configuration file
>
> autovacuum_vacuum_threshold | 2000 | configuration file
>
> checkpoint_completion_target | 0.9 | configuration file
>
> checkpoint_timeout | 30min | configuration file
>
> client_encoding | UTF8 | client
>
> client_min_messages | log | configuration file
>
> cluster_name | 9.6/main | configuration file
>
> cpu_index_tuple_cost | 0.001 | configuration file
>
> cpu_operator_cost | 0.0005 | configuration file
>
> cpu_tuple_cost | 0.003 | configuration file
>
> DateStyle | ISO, YMD | configuration file
>
> default_statistics_target | 100 | configuration file
>
> default_text_search_config | pg_catalog.english | configuration file
>
> dynamic_shared_memory_type | posix | configuration file
>
> effective_cache_size | 22GB | configuration file
>
> effective_io_concurrency | 4 | configuration file
>
> external_pid_file | /var/run/postgresql/9.6-main.pid | configuration file
>
> lc_messages | C | configuration file
>
> lc_monetary | en_CA.UTF-8 | configuration file
>
> lc_numeric | en_CA.UTF-8 | configuration file
>
> lc_time | en_CA.UTF-8 | configuration file
>
> listen_addresses | * | configuration file
>
> lock_timeout | 100s | configuration file
>
> log_autovacuum_min_duration | 0 | configuration file
>
> log_checkpoints | on | configuration file
>
> log_connections | on | configuration file
>
> log_destination | csvlog | configuration file
>
> log_directory | /mnt/bigzilla/data/toburn/hp/postgresql/pg_log |
> configuration file
>
> log_disconnections | on | configuration file
>
> log_error_verbosity | default | configuration file
>
> log_file_mode | 0600 | configuration file
>
> log_filename | postgresql-%Y-%m-%d_%H%M%S.log | configuration file
>
> log_line_prefix | user=%u,db=%d,app=%aclient=%h | configuration file
>
> log_lock_waits | on | configuration file
>
> log_min_duration_statement | 0 | configuration file
>
> log_min_error_statement | debug1 | configuration file
>
> log_min_messages | debug1 | configuration file
>
> log_rotation_size | 1GB | configuration file
>
> log_temp_files | 0 | configuration file
>
> log_timezone | localtime | configuration file
>
> logging_collector | on | configuration file
>
> maintenance_work_mem | 3GB | configuration file
>
> max_connections | 10 | configuration file
>
> max_locks_per_transaction | 256 | configuration file
>
> max_parallel_workers_per_gather | 14 | configuration file
>
> max_stack_depth | 2MB | environment variable
>
> max_wal_size | 4GB | configuration file
>
> max_worker_processes | 14 | configuration file
>
> min_wal_size | 2GB | configuration file
>
> parallel_setup_cost | 1000 | configuration file
>
> parallel_tuple_cost | 0.012 | configuration file
>
> port | 5432 | configuration file
>
> random_page_cost | 22 | configuration file
>
> seq_page_cost | 1 | configuration file
>
> shared_buffers | 34GB | configuration file
>
> 

Re: [PERFORM] partitioning materialized views

2017-07-06 Thread Rick Otten
>
>
> If you _can't_ do
>> that due to cloud restrictions, you'd actually be better off doing an
>> atomic swap.
>>
>> CREATE MATERIALIZED VIEW y AS ...;
>>
>> BEGIN;
>> ALTER MATERIALIZED VIEW x RENAME TO x_old;
>> ALTER MATERIALIZED VIEW y RENAME TO x;
>> DROP MATERIALIZED VIEW x_old;
>> COMMIT;
>>
>> This is an interesting idea.  Thanks!  I'll ponder that one.
>
>
I don't think the downstream dependencies will let that work without
rebuilding them as well.   The drop fails (without a cascade), and the
other views and matviews that are built off of this all simply point to
x_old.


Re: [PERFORM] partitioning materialized views

2017-07-06 Thread Rick Otten
On Thu, Jul 6, 2017 at 11:25 AM, Shaun Thomas 
wrote:

> > I'm curious if I'm overlooking other possible architectures or tools
> that might make this simpler to manage.
>
> One of the issues with materialized views is that they are based on
> views... For a concurrent update, it essentially performs a looped
> merge, which can be pretty ugly. That's the price you pay to be
> non-blocking. For this particular setup, I'd actually recommend using
> something like pglogical to just maintain a live copy of the remote
> table or wait for Postgres 10's logical replication.


Unfortunately the foreign database is Hadoop.  (As A Service)



> If you _can't_ do
> that due to cloud restrictions, you'd actually be better off doing an
> atomic swap.
>
> CREATE MATERIALIZED VIEW y AS ...;
>
> BEGIN;
> ALTER MATERIALIZED VIEW x RENAME TO x_old;
> ALTER MATERIALIZED VIEW y RENAME TO x;
> DROP MATERIALIZED VIEW x_old;
> COMMIT;
>
> This is an interesting idea.  Thanks!  I'll ponder that one.



> You could still follow your partitioned plan if you don't want to
> update all of the data at once. Let's face it, 3-4 hours is still a
> ton of data transfer and calculation.
>
>
yup.


[PERFORM] partitioning materialized views

2017-07-06 Thread Rick Otten
I'm pondering approaches to partitioning large materialized views and was
hoping for some feedback and thoughts on it from the [perform] minds.

PostgreSQL 9.6.3 on Ubuntu 16.04 in the Google Cloud.

I have a foreign table with 250M or so rows and 50 or so columns, with a
UUID as the primary key.  Queries to the foreign table have high latency.
 (From several minutes to more than an hour to run)

If I create a materialized view of this FT, including indexes, it takes
about 3-4 hours.

If I refresh the materialized view concurrently, it takes 4-5 DAYS.

When I run "refresh materialized view concurrently", it takes about an hour
for it to download the 250M rows and load them onto the SSD tempspace.   At
that point we flatline a single core, and run I/O on the main tablespace up
pretty high, and then stay that way until the refresh is complete.

In order to speed up the concurrent refreshes, I have it broken into 4
materialized views, manually partitioned (by date) with a "union all view"
in front of them.  Refreshing the data which is changing regularly (new
data, in one of the partitions) doesn't require refreshing the entire data
set.  This works fairly well, and I can refresh the most recent partition
in 1 - 2 hours (daily).
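
For example (hypothetical names -- the unique index is what makes the
concurrent refresh possible):

CREATE UNIQUE INDEX ON mv_2017_07 (uuid_pk);
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_2017_07;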

However, sometimes I have to reach back in time and refresh the deeper
partitions.  This is taking 3 or more days to complete, even with the data
broken into 4 materialized views.  This approach lets me refresh all of
the partitions at the same time, which uses more cores at the same time
(and more tempspace),   [I'd like to use as much of my system resources as
possible to get the refresh to finish faster.]   Unfortunately I am finding
I need to refresh the deeper data more and more often (at least once per
week), and my table growth is going to jump from adding 3-5M rows per day
to adding 10-20M rows per day over the next month or two.  Waiting 3 or 4
days for the deeper data to be ready for consumption in PostgreSQL is no
longer acceptable to the business.

It doesn't look like partman supports partitioning materialized views.  It
also doesn't look like PG 10's new partitioning features will work with
materialized views (although I haven't tried it yet).   Citus DB also seems
to be unable to help in this scenario.

I could create new materialized views every time I need new data, and then
swap out the view that is in front of them.  There are other objects in the
database which have dependencies on that view.  In my experiments so far,
"create or replace" seems to let me get away with this as long as the
columns don't change.
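
A rough sketch of that swap, with made-up names (it only works while the
column list stays exactly the same):

CREATE MATERIALIZED VIEW mv_2017_10_b AS
  SELECT * FROM fdw_source WHERE created_at >= '2017-10-01';

CREATE OR REPLACE VIEW front_view AS
  SELECT * FROM mv_2017_09
  UNION ALL
  SELECT * FROM mv_2017_10_b;   -- swapped in for mv_2017_10_a

DROP MATERIALIZED VIEW mv_2017_10_a;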

Alternatively, I could come up with a new partitioning scheme that lets me
more selectively run "refresh concurrently", and run more of those at the
same time.

I was leaning towards this latter solution.

Suppose I make a separate materialized view for each month of data.  At the
beginning of each month I would have to make a new materialized view, and
then add it into the "union all view" on the fly.

I would then need a "refresh all" script which refreshed as many of them
concurrently as I am willing to dedicate cores to.  And I need some handy
ways to selectively refresh specific months when I know data for a
particular month or set of months changed.

So, I actually have 2 of these 250M row tables in the Foreign Database,
that I want to do this with.  And maybe more coming soon?

I'm curious if I'm overlooking other possible architectures or tools that
might make this simpler to manage.


Similarly, could I construct the "union all view" in front of the
partitions to be partition aware so that the query planner doesn't try to
look in every one of the materialized views behind it to find the rows I
want?   If I go with the monthly partition, I'll start with about 36
materialized views behind the main view.


Re: [PERFORM] Client Server performance & UDS

2017-05-30 Thread Rick Otten
Establishing a connection with a PostgreSQL database is a very expensive
process on the database server.  On the other hand, establishing a
connection with pgbouncer is very fast.  Offloading the connection
management to pgbouncer can significantly reduce the connection setup time.

I've found it to help even with applications that have built-in connection
pooling.

If your clients are keeping persistent connections open to the database,
and the latency you are experiencing is within the transaction itself, you
might look at disk I/O for your WAL (write ahead logs) and take a closer
look at WAL and checkpoint tuning.
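
If it does turn out to be checkpoint related, these are the sorts of knobs
I'd start with (the values here are purely illustrative, not
recommendations):

ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET checkpoint_timeout = '15min';
ALTER SYSTEM SET max_wal_size = '4GB';
SELECT pg_reload_conf();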


On Tue, May 30, 2017 at 3:34 AM, kevin.hug...@uk.fujitsu.com <
kevin.hug...@uk.fujitsu.com> wrote:

> Hi Rick thanks for the reply.
>
>
>
> Our aim is to minimise latency hence we have a dedicated 1:1 relationship
> between the client and the server. If I use connection pooling surely this
> introduced latency – getting a server from the pool establishing the
> connection?
>
>
>
> Am I missing something?
>
>
>
>
>
> *From:* Rick Otten [mailto:rottenwindf...@gmail.com]
> *Sent:* 27 May 2017 13:27
> *To:* Hughes, Kevin <kevin.hug...@uk.fujitsu.com>
> *Cc:* pgsql-performa. <pgsql-performance@postgresql.org>
> *Subject:* Re: [PERFORM] Client Server performance & UDS
>
>
>
> You should have a layer such as pgbouncer between your pg instance and
> your application.  It is designed to mitigate the access latency issues you
> describe.
>
>
>
> On May 26, 2017 10:03 AM, "kevin.hug...@uk.fujitsu.com" <
> kevin.hug...@uk.fujitsu.com> wrote:
>
> Hi,
>
>
>
> This is a general question around this performance area
> rather than a specific performance problem.so I apologise now for a
> lack of a specific detail.
>
>
>
> We have an application that does many small actions on the
> DB – and it’s a small DB (a 50/100 Mbytes) so we would expect it to be
> contained in memory. Accesses need to be low latency – unfortunately there
> are “serial” accesses where the result of one access governs the next.
> Luckily the  work to be done by the DB is, we believe,  very simple and
> hence fast. Everything is running on one (large) server so we use UDS to
> connect the client to the server.
>
>
>
> Out observation (suspicion) is that the latency of  the access, as opposed
> to the cost of the query, is high. Having done some investigation  we
> believe the UDS latency may be contributing AND the cost imposed by
> postgres in “formatting” the messages between the client and server
> (transformation to network format?).
>
>
>
> We will try and get underneath this with real results/measurements but I
> would appreciate any comments pointers on what we are doing and how/if we
> can optimise this style of applications
>
>
>
>
>
> Cheers
>
>
>
>
>
>
>
>
>
>
> Unless otherwise stated, this email has been sent from Fujitsu Services
> Limited (registered in England No 96056); Fujitsu EMEA PLC (registered in
> England No 2216100) both with registered offices at: 22 Baker Street,
> London W1U 3BW; PFU (EMEA) Limited, (registered in England No 1578652) and
> Fujitsu Laboratories of Europe Limited (registered in England No. 4153469)
> both with registered offices at: Hayes Park Central, Hayes End Road, Hayes,
> Middlesex, UB4 8FE.
> This email is only for the use of its intended recipient. Its contents are
> subject to a duty of confidence and may be privileged. Fujitsu does not
> guarantee that this email has not been intercepted and amended or that it
> is virus-free.
>
>


Re: [PERFORM] Client Server performance & UDS

2017-05-27 Thread Rick Otten
You should have a layer such as pgbouncer between your pg instance and your
application.  It is designed to mitigate the access latency issues you
describe.

On May 26, 2017 10:03 AM, "kevin.hug...@uk.fujitsu.com" <
kevin.hug...@uk.fujitsu.com> wrote:

> Hi,
>
>
>
> This is a general question around this performance area
> rather than a specific performance problem.so I apologise now for a
> lack of a specific detail.
>
>
>
> We have an application that does many small actions on the
> DB – and it’s a small DB (a 50/100 Mbytes) so we would expect it to be
> contained in memory. Accesses need to be low latency – unfortunately there
> are “serial” accesses where the result of one access governs the next.
> Luckily the  work to be done by the DB is, we believe,  very simple and
> hence fast. Everything is running on one (large) server so we use UDS to
> connect the client to the server.
>
>
>
> Out observation (suspicion) is that the latency of  the access, as opposed
> to the cost of the query, is high. Having done some investigation  we
> believe the UDS latency may be contributing AND the cost imposed by
> postgres in “formatting” the messages between the client and server
> (transformation to network format?).
>
>
>
> We will try and get underneath this with real results/measurements but I
> would appreciate any comments pointers on what we are doing and how/if we
> can optimise this style of applications
>
>
>
>
>
> Cheers
>
>
>
>
>
>
>
>
>
> Unless otherwise stated, this email has been sent from Fujitsu Services
> Limited (registered in England No 96056); Fujitsu EMEA PLC (registered in
> England No 2216100) both with registered offices at: 22 Baker Street,
> London W1U 3BW; PFU (EMEA) Limited, (registered in England No 1578652) and
> Fujitsu Laboratories of Europe Limited (registered in England No. 4153469)
> both with registered offices at: Hayes Park Central, Hayes End Road, Hayes,
> Middlesex, UB4 8FE.
> This email is only for the use of its intended recipient. Its contents are
> subject to a duty of confidence and may be privileged. Fujitsu does not
> guarantee that this email has not been intercepted and amended or that it
> is virus-free.
>


Re: [PERFORM] Monitoring tool for Postgres Database

2017-05-26 Thread Rick Otten
On Thu, May 25, 2017 at 3:48 PM, Ravi Tammineni <
rtammin...@partner.aligntech.com> wrote:

> Hi,
>
>
>
> What is the best monitoring tool for Postgres database? Something like
> Oracle Enterprise Manager.
>
>
>
> Specifically I am interested in tools to help:
>
>
>
> Alert DBAs to problems with both configuration and performance issues
>
> Deadlocks, Long running queries etc.,
>
> Monitoring of overall system performance
>
> General performance tuning
>
> Storage/Disk latencies
>
>
>
>
>
> Thanks
>
> ravi
>

We use Datadog.  Their PostgreSQL plugin covers most of the relevant
stats.  It is easy to configure and not very expensive at all.  They have
an easy GUI-based configuration for monitors and alerts, and you can link
it with something like VictorOps and Slack for additional pager escalation
policies.  We have all of our batch processing tied into Datadog as well,
so we can get a picture of events, systems, and database internals all in
one dashboard.

Re: [PERFORM] Log update query along with username who has executed the same.

2017-05-23 Thread Rick Otten
You need to include "%u" in the log_line_prefix setting in your
postgresql.conf.  Like this:

log_line_prefix = '%m %d %h %u '
>
> #log_line_prefix = ''   # special values:
>
> #   %a = application name
>
> #   %u = user name
>
> #   %d = database name
>
> #   %r = remote host and port
>
> #   %h = remote host
>
> #   %p = process ID
>
> #   %t = timestamp without milliseconds
>
> #   %m = timestamp with milliseconds
>
> #   %n = timestamp with milliseconds (as a Unix epoch)
>
> #   %i = command tag
>
> #   %e = SQL state
>
> #   %c = session ID
>
> #   %l = session line number
>
> #   %s = session start timestamp
>
> #   %v = virtual transaction ID
>
> #   %x = transaction ID (0 if none)
>
> #   %q = stop here in non-session
>
> #processes
>
> #   %% = '%'
>
> # e.g. '<%u%%%d> '
>
>
>
Also 9.1 is pretty old.  You should think about upgrading as soon as is
practical.


On Tue, May 23, 2017 at 8:42 AM, Dinesh Chandra 12108 <
dinesh.chan...@cyient.com> wrote:

> Dear Expert,
>
>
>
> May you please provide the solution for below query.
>
>
>
> I have to create a log for all the update query executed in database along
> with its username who has executed that query.
>
> However, I am able to log all the update queries in my pg_log file but
> it’s not showing particular user who has run the query.
>
>
>
> I am using PostgreSQL 9.1 with Linux platform.
>
>
>
> *Regards,*
>
> *Dinesh Chandra*
>
> *|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.*
>
> *--*
>
> Mobile: +91-9953975849 <+91%2099539%2075849> | Ext 1078
> |dinesh.chan...@cyient.com
>
> Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
>
>
>


Re: [PERFORM] pg_stat_statements with fetch

2017-05-21 Thread Rick Otten
Would turning on logging of temp files help?  That often reports the query
that is using the temp files:
log_temp_files = 0

It probably wouldn't help if the cursor query never pulls from a temp file,
but if it does ...

On Fri, May 19, 2017 at 7:04 PM, Jeff Janes  wrote:

> I'm spoiled by using pg_stat_statements to find the hotspot queries which
> could use some attention.
>
> But with some recent work, all of the hotspots are of the form "FETCH 1000
> FROM c3".  The vast majority of the queries return less than 1000 rows, so
> only one fetch is issued per execution.
>
> Is there an automated way to trace these back to the parent query, without
> having to strong-arm the driving application into changing its cursor-using
> ways?
>
> pg_stat_statements v1.4 and postgresql v9.6 (or 10beta1, if it makes a
> difference)
>
> Sometimes you can catch the DECLARE also being in pg_stat_statements, but
> it is not a sure thing and there is some risk the name got freed and reused.
>
> log_min_duration_statement has the same issue.
>
> Cheers,
>
> Jeff
>


Re: [PERFORM] Impact of track_activity_query_size on high traffic OLTP system

2017-04-13 Thread Rick Otten
I always bump it up, but usually just to 4096, because I often have queries
that are longer than 1024 and I'd like to be able to see the full query.
I've never seen any significant memory impact.  I suppose if you had
thousands of concurrent queries it would add up, but if you only have a few
dozen, or even a few hundred, queries at any given moment, it doesn't seem
to impact things very much on a modern system.
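
For what it's worth, bumping it is just something like this (the value is
only an example; this particular setting doesn't take effect until the
server is restarted):

ALTER SYSTEM SET track_activity_query_size = 4096;
-- after the restart:
SHOW track_activity_query_size;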


On Thu, Apr 13, 2017 at 4:45 PM, Jeremy Finzel  wrote:

> I have found some examples of people tweaking this
> parameter track_activity_query_size to various settings such as 4000,
> 10000, 15000, but little discussion as to performance impact on memory
> usage.  What I don't have a good sense of is how significant this would be
> for a high traffic system with rapid connection creation/destruction, say
> 1000s per second.  In such a case, would there be a reason to hesitate
> raising it to 10000 from 1024?  Is 10k memory insignificant?  Any direction
> here is much appreciated, including a good way to benchmark this kind of
> thing.
>
> Thanks!
>


Re: [PERFORM] Filter certain range of IP address.

2017-04-07 Thread Rick Otten
On Fri, Apr 7, 2017 at 11:29 AM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Fri, Apr 7, 2017 at 8:18 AM, Dinesh Chandra 12108 <
> dinesh.chan...@cyient.com> wrote:
>
>> Dear Vinny,
>>
>> Thanks for your valuable replay.
>>
>> but I need a select query, which select only that record which starts
>> from IP "172.23.110" only from below table.
>>
>> xxx 172.23.110.175
>> yyy 172.23.110.178
>> zzz 172.23.110.177
>> aaa 172.23.110.176
>> bbb 172.23.111.180
>> ccc 172.23.115.26
>>
>
> ​SELECT ... WHERE substring(ip_addr::text, 1, 10) = '172.23.110'
>

or
   select ... where ip_addr << inet '172.23.110.0/24';

if ip_addr is an inet data type --
https://www.postgresql.org/docs/9.6/static/functions-net.html


Re: [PERFORM] Delete, foreign key, index usage

2017-04-05 Thread Rick Otten
On Wed, Apr 5, 2017 at 6:40 AM, Johann Spies  wrote:

> On 4 April 2017 at 14:07, Johann Spies  wrote:
>
> > Why would that be?
>
> To answer my own question.  After experimenting a lot we found that
> 9.6 uses a parallel seqscan that is actually a lot faster than using
> the index on these large tables.
>
> This, to us was a surprise!
>
>
If you have modern GPU's available, you could try the pg-strom extension -
https://github.com/pg-strom/devel
It leverages GPU's to further parallelize scans.


Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

2017-03-02 Thread Rick Otten
This reminds me - I have had a case where the exit code for pg_dump was
successful, but the backup was still corrupted on disk.  By all means check
the exit code, but I strongly encourage a second validation, such as the
index listing, to increase your confidence that the backup was successful.

The best way to ensure good backups is to establish a regular practice of
restoring a backup to another database.  The easiest such practice to
justify and implement is to maintain a developer/development database, and
to use your production database backups to rebuild it on a regular basis.
Other approaches could include regularly scheduled Disaster Recovery
exercises, or simply spinning up throwaway cloud instances for the purpose.

pg_dump uses the ordinary PostgreSQL COPY command to extract data from the
tables.  Beyond that, I'm not sure how it works.  Sorry I can't help you
there.


On Thu, Mar 2, 2017 at 7:05 AM, Dinesh Chandra 12108 <
dinesh.chan...@cyient.com> wrote:

> Hi,
>
> When I issue the below command
>   > ./bin >pg_dump -U dummy_user  dummy_database; echo $?
>
> I checked with Linux TOP command on the same server, it was showing COPY
> database.
> What exactly it doing ??
>
> Regards,
> Dinesh Chandra
>
> -Original Message-
> From: vinny [mailto:vi...@xs4all.nl]
> Sent: 27 February, 2017 7:31 PM
> To: John Gorman <jgor...@eldocomp.com>
> Cc: Rick Otten <rottenwindf...@gmail.com>; Dinesh Chandra 12108 <
> dinesh.chan...@cyient.com>; pgsql-performance@postgresql.org;
> pgsql-performance-ow...@postgresql.org
> Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully
> or not ?
>
> On 2017-02-27 14:29, John Gorman wrote:
> > Even though it's not listed in any of the documentation or “pg_dump
> > --help” you can check the return code of the process. A return code
> > greater than 0 (zero) usually indicates a failure
> >
> > ./bin >pg_dump -U dummy_user  dummy_database; echo $?
> >
> > 1
> >
> > FROM: pgsql-performance-ow...@postgresql.org
> > [mailto:pgsql-performance-ow...@postgresql.org] ON BEHALF OF Rick
> > Otten
> > SENT: Monday, February 27, 2017 3:36 AM
> > TO: Dinesh Chandra 12108
> > CC: pgsql-performance@postgresql.org
> > SUBJECT: Re: [PERFORM] How Can I check PostgreSQL backup is
> > successfully or not ?
> >
> > Although it doesn't really tell if the pg_dump was successful (you'll
> > need to do a full restore to be sure), I generate an archive list.  If
> > that fails, the backup clearly wasn't successful, and if it succeeds,
> > odds are pretty good that it worked:
> >
> > On Mon, Feb 27, 2017 at 4:35 AM, Dinesh Chandra 12108
> > <dinesh.chan...@cyient.com> wrote:
> >
> > Hi,
> >
> > We are taking daily full backup of PostgreSQL database using PG_DUMP
> > which is automatic scheduled through Cronjobs.
> >
> > How can I check my yesterday backup is successfully or not?
> >
> > Is there any query or view by which I can check it?
> >
> > REGARDS,
> >
> > DINESH CHANDRA
> >
> > |DATABASE ADMINISTRATOR (ORACLE/POSTGRESQL)| CYIENT LTD. NOIDA.
>
>
> It's important to note the distinction between
>
> "the backup process did not fail"
>
> and
>
> "we now have a trustworthy backup"
>
> And you can go full-paranoia and say that you can successfully create a
> perfectly working backup of the wrong database.
>
> So what is it that you want to make sure of:
> 1. Did the process give an error?
> 2. Did the process create a usable backup?
>
> What are the chances of #1 reporting success but still producing a bad
> backup?
> And can #2 fail on a good database, and if so, can you detect that?
>
>
>
> 
>
> DISCLAIMER:
>
> This email message is for the sole use of the intended recipient(s) and
> may contain confidential and privileged information. Any unauthorized
> review, use, disclosure or distribution is prohibited. If you are not the
> intended recipient, please contact the sender by reply email and destroy
> all copies of the original message. Check all attachments for viruses
> before opening them. All views or opinions presented in this e-mail are
> those of the author and may not reflect the opinion of Cyient or those of
> our affiliates.
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

2017-02-27 Thread Rick Otten
Although it doesn't really tell you whether the pg_dump was successful
(you'll need to do a full restore to be sure), I generate an archive list.
If that fails, the backup clearly wasn't successful, and if it succeeds,
the odds are pretty good that it worked:

-- bash code snippet --
archiveList=`pg_restore -l ${backupFolder}`
if [[ ! ${archiveList} =~ "Archive created at" ]]
then
    echo "PostgreSQL backup - Archive List Test Failed for ${hostName}:${dbName}"
    echo "Archive listing:"
    echo ${archiveList}
    exit 1
fi
---



On Mon, Feb 27, 2017 at 4:35 AM, Dinesh Chandra 12108 <
dinesh.chan...@cyient.com> wrote:

> Hi,
>
>
>
> We are taking daily full backup of PostgreSQL database using *PG_DUMP*
> which is automatic scheduled through Cronjobs.
>
>
>
> How can I check my yesterday backup is successfully or not?
>
> Is there any query or view by which I can check it?
>
>
>
> *Regards,*
>
> *Dinesh Chandra*
>
> *|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.*
>
> *--*
>
> Mobile: +91-9953975849 <+91%2099539%2075849> | Ext 1078
> |dinesh.chan...@cyient.com
>
> Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
>
>
>
> --
>
> DISCLAIMER:
>
> This email message is for the sole use of the intended recipient(s) and
> may contain confidential and privileged information. Any unauthorized
> review, use, disclosure or distribution is prohibited. If you are not the
> intended recipient, please contact the sender by reply email and destroy
> all copies of the original message. Check all attachments for viruses
> before opening them. All views or opinions presented in this e-mail are
> those of the author and may not reflect the opinion of Cyient or those of
> our affiliates.
>


Re: [PERFORM] Backup taking long time !!!

2017-01-24 Thread Rick Otten
Actually, I think this may be the way Oracle Hot Backups work.  It was my
impression that the feature temporarily suspends writes into a specific
tablespace so you can take a snapshot of it.  It has been a few years since
I've had to do Oracle work though, and I could be misremembering.  People
may be confusing Oracle and PostgreSQL.


On Tue, Jan 24, 2017 at 10:55 AM, Jeff Janes  wrote:

> On Mon, Jan 23, 2017 at 9:43 AM, Simon Riggs 
> wrote:
>
>> On 23 January 2017 at 17:12, Jeff Janes  wrote:
>>
>> >> Just to make sure anyone reading the mailing list archives isn't
>> >> confused, running pg_start_backup does *not* make PG stop writing to
>> >> BASEDIR (or DATADIR, or anything, really).  PG *will* continue to write
>> >> data into BASEDIR after pg_start_backup has been called.
>> >
>> >
>> >
>> > Correct.  Unfortunately it is a very common myth that it does cause
>> > PostgreSQL to stop writing to the base dir.
>>
>> Never heard that one before. Wow. Who's been saying that?
>>
>> It's taken me years to hunt down all invalid backup memes and terminate
>> them.
>>
>> Never fails to surprise me how many people don't read the docs.
>>
>
> I've seen it on stackexchange, and a few times on the pgsql mailing lists,
> and talking to people in person.  I've never traced it back some
> "authoritative" source who is making the claim, I think many people just
> independently think up "How would I implement pg_start_backup if I were
> doing it" and then come up with the same false conclusion, and then all
> reinforce each other.
>
> I don't think the docs are particularly clear on this. There is the
> comment "Some file system backup tools emit warnings or errors if the
> files they are trying to copy change while the copy proceeds. When taking a
> base backup of an active database, this situation is normal and not an
> error" but the reader could think that comment could apply to any of the
> files in the datadirectory (in particular, pg_xlog), and could think that
> it doesn't apply to the files in datadirectory/base in particular.  In
> other words, once they form the wrong understanding, the docs (if read)
> don't force them to change it, as they could interpret it in ways that are
> consistent.
>
> Of course the docs aren't a textbook and aren't trying to fully describe
> the theory of operation; just give the people a recipe they can follow. But
> people will make inferences from that recipe anyway.  I don't know if it is
> worth trying preemptively dispel these mistakes in the docs.
>
> Cheers,
>
> Jeff
>
>


Re: [PERFORM] Sort-of replication for reporting purposes

2017-01-06 Thread Rick Otten
I suggest SymmetricDS.  ( http://symmetricds.org )

I've had good luck using them to aggregate data from a heterogeneous suite
of database systems and versions back to a single back-end data mart for
exactly this purpose.



On Fri, Jan 6, 2017 at 2:24 PM, Ivan Voras  wrote:

> Hello,
>
> I'm investigating options for an environment which has about a dozen
> servers and several dozen databases on each, and they occasionally need to
> run huge reports which slow down other services. This is of course "legacy
> code". After some discussion, the idea is to offload these reports to
> separate servers - and that would be fairly straightforward if not for the
> fact that the report code creates temp tables which are not allowed on
> read-only hot standby replicas.
>
> So, the next best thing would be to fiddle with the storage system and
> make lightweight snapshots of live database clusters (their storage
> volumes) and mount them on the reporting servers when needed for the
> reports. This is a bit messy :-)
>
> I'm basically fishing for ideas. Are there any other options available
> which would offer fast replication-like behaviour ?
>
> If not, what practices would minimise problems with the storage snapshots
> idea? Any filesystem options?
>
>


[PERFORM] materialized view order by and clustering

2016-11-17 Thread Rick Otten
If I construct the materialized view with an 'order by', I can use a BRIN
index, sometimes with a significant performance advantage, at least for the
primary sort field.  I have observed that even though the first pass is a
little lossy and I get index rechecks, it is still much faster than a
regular btree index.
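
For reference, the pattern I'm describing looks roughly like this (the
names are made up):

CREATE MATERIALIZED VIEW mv_events AS
  SELECT * FROM events ORDER BY event_time;

CREATE INDEX ON mv_events USING brin (event_time);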

Does it matter if I also try to CLUSTER the materialized view on that
primary sort field? Or is it already clustered because of the 'order by'?

Would the brin index work better on a clustered materialized view instead
of an ordered materialized view?

When I refresh the materialized view (concurrently) is the order_by
preserved?  Would the clustering be preserved?

I'm trying to get a handle on the concept of clustering, how it differs
from 'order by', which would be better, and how much advantage it really
gets me.  I'll continue to do experiments with this, but thought some of
the performance gurus on this list would have some immediate thoughts on
the subject off the top of their heads, and others reading this list might
find the observations interesting.

Thank you for your time.


Re: [PERFORM] Perf decreased although server is better

2016-11-04 Thread Rick Otten
> Rick, what did you mean by kernel configuration? The OS is a standard
Ubuntu 16.04:
>
> - Linux 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016
x86_64 x86_64 x86_64 GNU/Linux
>
> Do you think losing half the number of cores can explain my performance
issue ? (AMD 8 cores down to Haswell 4 cores).

I was referring to some of the tunables discussed on this page:
https://www.postgresql.org/docs/9.6/static/kernel-resources.html

Specifically, in my environment I update /etc/security/limits.conf to
include:


* hard nofile 65536
* soft nofile 65536

* hard stack 16384
* soft stack 16384

* hard memlock unlimited
* soft memlock unlimited


And then add this to /etc/pam.d/common-session so that they get picked up
when I su to the postgres user:

session requiredpam_limits.so


I update sysctl.conf with huge pages:

vm.hugetlb_shm_group=5432
vm.nr_hugepages=4300


(The number of huge pages may be different for your environment.)
And create and add the postgres user to the huge pages group:

hugepages:x:5432:postgres


You may also want to look at some TCP tunables, and check your shared
memory limits too.

I only mentioned this because sometimes when you move from one system to
another, you can get so caught up in getting the database set up and the
data migrated that you overlook the basic system settings...

Regarding the number of cores, most PostgreSQL queries are going to be
single-threaded.  The number of cores won't impact the performance of a
single query except in certain circumstances:
    1) You have parallel queries enabled and the table is doing some sort
of expensive sequential scan
    2) You have so many concurrent queries running that the whole system is
CPU starved.
    3) There is some other resource contention between the CPUs that causes
_more_ CPUs to actually run slower than fewer.  (It happens - I had a
server back in the 90's which had severe lock contention over /dev/tcp.
Adding more CPUs made it slower.)
    4) The near-cache memory gets fragmented in a way that forces processors
to reach deeper into the caches to find what they need.  (I'm not explaining
that very well, but it is unlikely to be a problem in your case anyway.)

A quick and simple command to get a sense of how busy your cpus are is:

$ mpstat -P ALL 5

(let it run for a few of the 5 second intervals)

If they are all running pretty hot, then more cores might help.  If just
one is running hot, then more cores probably won't do anything.



On Fri, Nov 4, 2016 at 7:53 AM, Benjamin Toueg  wrote:

> I've noticed a network latency increase. Ping between web server and
> database : 0.6 ms avg before, 5.3 ms avg after -- it wasn't that big 4 days
> ago :(
>
> I've narrowed my investigation to one particular "Transaction" in terms of
> the NewRelic APM. It's basically the main HTTP request of my application.
>
> Looks like the ping impacts psycopg2:connect (see http://imgur.com/a/LDH1c):
> 4 ms up to 16 ms on average.
>
> That I can understand. However, I don't understand the performance
> decrease of the select queries on table1 (see https://i.stack.imgur.com/
> QaUqy.png): 80 ms up to 160 ms on average
>
> Same goes for table 2 (see http://imgur.com/a/CnETs): 4 ms up to 20 ms on
> average
>
> However, there is a commit in my request, and it performs better (see
> http://imgur.com/a/td8Dc): 12 ms down to 6 ms on average.
>
> I don't see how this can be due to network latency!
>
> I will provide a new bonnie++ benchmark when the requests per minute is at
> the lowest (remember I can only run benchmarks while the server is in use).
>
> Rick, what did you mean by kernel configuration? The OS is a standard
> Ubuntu 16.04:
>
>  - Linux 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
>
> Do you think losing half the number of cores can explain my performance
> issue ? (AMD 8 cores down to Haswell 4 cores).
>
> Best Regards,
>
> Benjamin
>
> PS : I've edited the SO post http://serverfault.com/
> questions/812702/posgres-perf-decreased-although-server-is-better
>
> 2016-11-04 1:05 GMT+01:00 Kevin Grittner :
>
>> On Thu, Nov 3, 2016 at 9:51 AM, Benjamin Toueg  wrote:
>> >
>> > Stream gives substantially better results with the new server
>> (before/after)
>>
>> Yep, the new server can access RAM at about twice the speed of the old.
>>
>> > I've run "bonnie++ -u postgres -d /tmp/ -s 4096M -r 1096" on both
>> > machines. I don't know how to read bonnie++ results (before/after)
>> > but it looks quite the same, sometimes better for the new,
>> > sometimes better for the old.
>>
>> On most metrics the new machine looks better, but there are a few
>> things that look potentially problematic with the new machine: the
>> new machine uses about 1.57x the CPU time of the old per block
>> written sequentially ((41 / 143557) / (16 / 87991)); so if the box
>> becomes CPU starved, you might notice writes getting slower than 

Re: [PERFORM] Perf decreased although server is better

2016-11-02 Thread Rick Otten
How did you migrate from one system to the other?

[ I recently moved a large time series table from 9.5.4 to 9.6.1 using dump
and restore.  Although it put the BRIN index on the time column back on, it
was borked.  Reindexing didn't help.  I had to switch it to a regular btree
index.  I think the data wasn't inserted back into the database in time
order.  Therefore, because it was all over the place, the efficiency gains
from the BRIN index were lost.  It was probably because I restored it with
"-j 8".  -- It is possible something you didn't expect when moving
introduce new inefficiencies.   I also found running pg_repack after the
restore helped performance (and storage size) on my new system too. ]
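
One thing I could probably have tried to get the physical ordering back,
sketched here with made-up names (keep in mind CLUSTER takes an exclusive
lock and rewrites the whole table):

CREATE INDEX big_timeseries_ts_btree ON big_timeseries (ts);
CLUSTER big_timeseries USING big_timeseries_ts_btree;
CREATE INDEX big_timeseries_ts_brin ON big_timeseries USING brin (ts);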


Do you have all of the kernel settings configured per best practices?
Sometimes it is easy to forget them when you get totally focused on just
moving the data.
(Things such as your hugepages settings)

With 9.6 you can enable parallel queries.  Of course you wouldn't be
comparing apples-to-apples then, but take advantage of that feature if you
can.


On Wed, Nov 2, 2016 at 9:26 AM, Benjamin Toueg  wrote:

> Hi everyone,
>
> I'm facing a peformance decrease after switching to a more performant VPS
> : http://serverfault.com/questions/812702/posgres-perf-
> decreased-although-server-is-better
>
> My questions are:
>
>1. What benchmark should I perform before switching to a new server?
>2. What's your rule of thumb regarding my specific issue? What should
>be investigated first?
>
>
> Best Regards,
>
> Benjamin
>
>


Re: [PERFORM] Millions of tables

2016-09-26 Thread Rick Otten
Are the tables constantly being written to, or is this a mostly read
scenario?   One architecture possibility, if the writes are not so
frequent,  is to create just a handful of very big tables for writing, and
then make smaller tables as materialized views for reading.  The vacuum and
bloat management could be done back at the big tables.  The materialized
views could be refreshed or replaced during non-peak hours.  The
materialized views could be on a different tablespace than the root
tables.  They could also be structured to reflect real-world query patterns
which are sometimes different than the raw data storage engineering problem.
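
As a rough sketch of what I mean (completely made-up names):

CREATE TABLE measurements (
    sensor_id bigint,
    taken_at  timestamptz,
    value     double precision
);

CREATE MATERIALIZED VIEW measurements_sensor_42 AS
    SELECT * FROM measurements WHERE sensor_id = 42;
CREATE INDEX ON measurements_sensor_42 (taken_at);

-- during non-peak hours:
REFRESH MATERIALIZED VIEW measurements_sensor_42;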

With some logging you may be able to see that the data is not truly
randomly accessed, but rather clustered around just some of the millions of
tables.  Then the engineering problem becomes "How do I service 90% of the
queries on these tables in 30ms?" rather than "How do I service 100% of
the queries 100% of the time in 30ms?"  Knowing that 90% of the queries hit
just a few hundred tables makes the first question easier to answer.

Similarly, if most of the columns are static and only a few columns are
actually changing, you could consider pulling the static columns out into a
separate table from the dynamic ones and then using joins in your queries.
The end goal is to be able to get solid indexes and tables that don't
change a lot, so they can be tightly packed and cached (less bloat, less
fragmentation, fewer disk accesses).

With regards to consistent query performance, I think you need to get out
of AWS.  That environment is terrible if you are going for consistency
unless you buy dedicated hardware, and then you are paying so much money it
is ridiculous.

Also I think having 10M rows in a table is not a problem for the query
times you are referring to.  So instead of millions of tables, unless I'm
doing my math wrong, you probably only need thousands of tables.



On Mon, Sep 26, 2016 at 5:43 AM, Stuart Bishop 
wrote:

> On 26 September 2016 at 11:19, Greg Spiegelberg 
> wrote:
>
>> I did look at PostgresXL and CitusDB.  Both are admirable however neither
>> could support the need to read a random record consistently under 30ms.
>> It's a similar problem Cassandra and others have: network latency.  At this
>> scale, to provide the ability to access any given record amongst trillions
>> it is imperative to know precisely where it is stored (system & database)
>> and read a relatively small index.  I have other requirements that prohibit
>> use of any technology that is eventually consistent.
>>
>> I liken the problem to fishing.  To find a particular fish of length,
>> size, color  in a data lake you must accept the possibility of scanning
>> the entire lake.  However, if all fish were in barrels where each barrel
>> had a particular kind of fish of specific length, size, color  then the
>> problem is far simpler.
>>
>> -Greg
>>
>
> My gut tells me that if you do solve the problem and get PostgreSQL (or
> anything) reading consistently at under 30ms with that many tables you will
> have solved one problem by creating another.
>
> You discounted Cassandra due to network latency, but are now trying a
> monolithic PostgreSQL setup. It might be worth trying a single node
> ScyllaDB or Cassandra deploy (no need for QUORUM or network overhead),
> perhaps using layered compaction so all your data gets broken out into
> 160MB chunks. And certainly wander over to the ScyllaDB mailing list, as
> they are very focused on performance problems like yours and should offer
> some insight even if a Cassandra style architecture cannot meet your
> requirements.
>
> An alternative if you exhaust or don't trust other options, use a foreign
> data wrapper to access your own custom storage. A single table at the PG
> level, you can shard the data yourself into 8 bazillion separate stores, in
> whatever structure suites your read and write operations (maybe reusing an
> embedded db engine, ordered flat file+log+index, whatever).
>
> --
> Stuart Bishop 
> http://www.stuartbishop.net/
>


Re: [PERFORM] Disk filled-up issue after a lot of inserts and drop schema

2016-09-14 Thread Rick Otten
In Unix/Linux with many of the common file system types, if you delete a
file, but a process still has it open, it will continue to "own" the disk
space until that process closes the file descriptor or dies.  If you try
"ls" or other file system commands, you won't actually see the file there,
yet it really is, and it still has exclusive control of a portion of the
disk.  The file is "unlinked" but the data blocks for the file are still
reserved.

Like 'ls', the 'du' command only looks at files that still exist and adds
up the disk space for those files.  It does not know about these files that
have been unlinked, but still reserve a large portion of the disk.

I don't know why something still has an open file descriptor on something
you believe has been removed, but at least that explains why you are
experiencing the discrepancy between "du" and the real available space on
the disk.


On Wed, Sep 14, 2016 at 9:53 AM, Pietro Pugni 
wrote:

> I’m running PostgreSQL 9.5.4 on a virtual machine for production purposes.
> It runs Ubuntu 16.04.1 LTS 64bit, 32GB RAM, 461GB disk space and 4 x
> logical CPUs.
>
> Postgres executes the following activities:
> - many INSERTS for ETL
> - a lot of read and write operations for the main OLTP application
>
> The ETL job is still under development, so I’m launching several
> sequential “tries” in order to get the whole thing working. The ETL
> procedure consists of a lot of inserts packed inside transactions. At the
> moment each transaction consists of 100k inserts, so for a 90mln rows table
> I get 90mln inserts packed in 900 transactions. I know it’s not the best,
> but JDBC drivers combined with Pentaho doesn’t seem to pack more inserts
> into one, so I get a lot of overhead. I can see INSERT, BIND and PARSE
> called for each insert.. I think it’s Pentaho which embeds the INSERT in a
> parametric query.. I hate Pentaho.. anyway..
>
> The ETL procedure does the following:
> 1) DROP SCHEMA IF EXISTS data_schema CASCADE;
> 2) creates the “data_schema” schema and populates it with tables and rows
> using INSERTs as described before;
> 3) if an error occurs, drop the schema
>
> I’m repeating the previous steps many times because of some Pentaho errors
> which the team is working on in order to get it working. This stresses the
> WAL because the interruption of the process interrupts the current
> transaction and is followed by a DROP SCHEMA .. CASCADE.
>
> *After few days since we began debugging the ETL elaboration, the disk
> filled up and the last ETL job was automatically aborted*. Note that the
> DB data directory is located on the same root disk at
> /var/lib/postgresql/9.5/main
>
> What shocked me was that the *data directory of Postgres was just 815MB*
> in size ($ du -h /var/lib/postgresql/9.5/main ) and pg_xlog was 705MB, *but
> the entire disk was full *("df -h" returned a disk usage of 100%).
>
> I looked for any postgres activity and only noticed a checkpoint writer
> process that was writing at low speeds (IO usage was about 5%).
> Also, "SELECT * FROM pg_stat_activity" returned nothing and the most
> shocking part was that the "du -h /“ command returned 56GB as the total
> size of files stored on the whole disk!!! The same was for “du -ha /“,
> which returns the apparent size.
>
> The total disk size is 461GB, *so how is it possible that “df -h”
> resulted in 461GB occupied while “du -h /“ returned just 56GB?*
>
> After executing:
> $ service postgresql stop
> $ service postgresql start
>
> *the disk was freed and “df -h” returned a usage of just 16%!*
>
> The other questions are:
> - *how can I prevent the disk from filling up? I’m using the default
> configuration for the WAL (1GB max size).*
> - *how can I tune Postgres to speed up the INSERTs?*
>
> The *actual configuration* is the following:
> listen_addresses = 'localhost'
> max_connections = 32
> shared_buffers = 16GB
> work_mem = 128MB
> maintenance_work_mem = 512MB
> effective_io_concurrency = 10
> checkpoint_completion_target = 0.9
> cpu_tuple_cost = 0.02
> cpu_index_tuple_cost = 0.01
> cpu_operator_cost = 0.005
> effective_cache_size = 24GB
> default_statistics_target = 1000
>
> *May be that some of these parameters causes this strange behavior?
> checkpoint_completion_target?*
>
> Thanks to everyone for the support.
>
> Best regards,
>  Pietro Pugni
>
>
>
>


Re: [PERFORM] Performant queries on table with many boolean columns

2016-04-20 Thread Rick Otten
Would a bit string column work? --
http://www.postgresql.org/docs/9.5/static/datatype-bit.html

You might need to use a lot of bitwise OR statements in the query though if
you are looking at very sparse sets of specific values...

Something like the get_bit() function might allow you to select a specific
bit, but then you might want a bunch of functional indexes on the column
for various get_bit() combinations.

Maybe you can group commonly queried sets of columns into bit strings
(rather than having one bit string column for all 100 booleans).
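
As a rough sketch of that idea (the table, column, and bit positions here
are hypothetical, and get_bit() numbers bits from 0):

    -- hypothetical table: 100 boolean properties packed into one bit string
    create table item_props (
        item_id bigint primary key,
        props   bit(100) not null
    );

    -- expression index on one bit that is queried often
    create index item_props_bit_42_idx on item_props (get_bit(props, 42));

    -- rows where property 42 is set
    select item_id from item_props where get_bit(props, 42) = 1;

    -- a sparse set of specific values needs a bitwise OR of conditions
    select item_id
    from item_props
    where get_bit(props, 42) = 1
       or get_bit(props, 97) = 1;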



On Wed, Apr 20, 2016 at 2:54 PM, Teodor Sigaev  wrote:

>
>> The obvious thing seems to make a table with ~100 columns, with 1 column
>> for each boolean property. Though, what type of indexing strategy would
>> one use on that table? Doesn't make sense to do BTREE. Is there a better
>> way to structure it?
>>
>> looks like a deal for contrib/bloom index in upcoming 9.6 release
>
>
> --
> Teodor Sigaev  E-mail: teo...@sigaev.ru
>   WWW: http://www.sigaev.ru/
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] Architectural question

2016-03-23 Thread Rick Otten
I have another suggestion.  How about putting the images in RethinkDB?

RethinkDB is easy to set up and manage, and is scalable and easy (almost
trivial) to cluster.  Many of the filesystem disadvantages you mention
would be much more easily managed by RethinkDB.

A while back I wrote a Foreign Data Wrapper for RethinkDB.  I haven't
updated it to the latest version, but it wouldn't be hard to bring it up to
date.  (It might even work as-is.)   By leveraging the FDW, you could have
all of the awesome Relational Power and performance of PostgreSQL combined
with the scalable, easily clustered, NoSQL powers of RethinkDB, yet still
have a common interface - if you need it.



On Wed, Mar 23, 2016 at 8:29 AM, Mike Sofen  wrote:

> > -Original Message-
> > Thomas Kellerer Wednesday, March 23, 2016 2:51 AM
> >
> > Jim Nasby schrieb am 11.03.2016 um 17:37:
> > > If the blob is in the database then you have nothing extra to do. It's
> handled
> > just like all your other data.
> > >
> > > If it's a file in a file system then you need to:
> > >
> > > - Have application code that knows how and where to get at the file
> > > - Have a way to make those files available on all your webservers
> > > - Have completely separate backup and recovery plans for those files
> > >
> > > That's a lot of extra work. Sometimes it's necessary, but many times
> it's not.
> >
> > Don't forget the code you need to write to properly handle transactional
> access
> > (writing, deleting) to the files
> >
> > You usually also need to distribute the files over many directories.
> > Having millions of files in a single directory is usually not such a
> good idea.
> >
> > In my experience you also need some cleanup job that removes orphaned
> files
> > from the file system.
> > Because no matter how hard you try, to get updates/writes to the file
> system
> > right, at some point this fails.
> >
> > Also from a security point of view having this in the database is more
> robust
> > then in the file system.
> >
> > The downside of bytea is that you can't stream them to the client. The
> > application always needs to read the whole blob into memory before it
> can be
> > used. This might put some memory pressure on the application server.
> >
> > Thomas
>
> This is really an excellent conversation, and highlights the never-ending
> contemplation
> of blob storage.  I've had to go through this dialog in two different
> industries - healthcare
> and now genomics, creating a new EMR (electronic medical record) system
> and storing
> and manipulating huge genomic data sets.
>
> I have, in both cases, ended up leaving the blob-type data outside of the
> database.  Even
> though, as Thomas mentioned, it requires more database and app code to
> manage, it
> ends up allowing for both systems to be optimized for their respective
> duties.
>
> In addition, the vastly smaller database sizes result in far faster
> backups and restores,
> transactional replication maintains its speed, and in general, I find the
> fault tolerant
> behaviors to be excellent.
>
> Yes, losing track of a file would be very bad, and...we're only storing
> things like xray photos
> or ct scans (healthcare), or genomic processing results.  In both cases,
> usually, the results
> can be recreated.  That said, I've never lost a file so haven't needed to
> pull on that lever.
>
> My latest model is placing large genomic data onto the AWS S3 file system,
> keeping all of
> the metadata inside the database.  It's working very well so far, but
> we're still in development.
>
> Mike
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: [PERFORM] using shared_buffers during seq_scan

2016-03-19 Thread Rick Otten
There is parallel sequence scanning coming in 9.6 --
http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-is-committed.html


And there is the GPU extension - https://wiki.postgresql.org/wiki/PGStrom

If those aren't options, you'll want your table as much in memory as
possible so your scan doesn't have to go to disk.
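
One more option, not mentioned above: on 9.4+ the pg_prewarm extension can
pull a table into shared_buffers ahead of time.  A minimal sketch (the table
name is hypothetical):

    -- assumes PostgreSQL 9.4+ with the pg_prewarm extension available
    create extension if not exists pg_prewarm;
    select pg_prewarm('big_table');  -- loads big_table's blocks into shared_buffers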



On Thu, Mar 17, 2016 at 5:57 AM, Artem Tomyuk  wrote:

> Hi All!
>
> Does Postgres use shared_buffers during a seq_scan?
> How can I optimize seq_scans on big tables?
>
> Thanks!
>


Re: [PERFORM] Filesystem and Disk Partitioning for New Server Setup

2016-02-24 Thread Rick Otten
An LVM gives you more options.

Without an LVM you would add a disk to the system, create a tablespace, and
then move some of your tables over to the new disk.  Or, you'd take a full
backup, rebuild your file system, and then restore from backup onto the
newer, larger disk configuration.  Or you'd make softlinks to pg_log or
pg_xlog or something to stick the extra disk in your system somehow.
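
For reference, the tablespace route looks roughly like this (the path and
table names are hypothetical):

    -- mount the new disk, e.g. at /mnt/disk2, create a postgres-owned directory, then:
    create tablespace extra_disk location '/mnt/disk2/pgdata';
    alter table big_history_table set tablespace extra_disk;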

You can do that with an LVM too.  However, with an LVM you can add the disk
to the system, extend the file system, and just keep running.  Live.  No
need to figure out which tables or files should go where.

Sometimes it is really nice to have that option.




On Wed, Feb 24, 2016 at 9:25 AM, Dave Stibrany <dstibr...@gmail.com> wrote:

> Thanks for the advice, Rick.
>
> I have an 8 disk chassis, so possible extension paths down the line are
> adding raid1 for WALs, adding another RAID10, or creating a 8 disk RAID10.
> Would LVM make this type of addition easier?
>
>
> On Wed, Feb 24, 2016 at 6:08 AM, Rick Otten <rottenwindf...@gmail.com>
> wrote:
>
>>
>> 1) I'd go with xfs.  zfs might be a good alternative, but the last time I
>> tried it, it was really unstable (on Linux).  I may have gotten a lot
>> better, but xfs is a safe bet and well understood.
>>
>> 2) An LVM is just an extra couple of commands.  These days that is not a
>> lot of complexity given what you gain. The main advantage is that you can
>> extend or grow the file system on the fly.  Over the life of the database
>> it is quite possible you'll find yourself pressed for disk space - either
>> to drop in more csv files to load with the 'copy' command, to store more
>> logs (because you need to turn up logging verbosity, etc...), you need more
>> transaction logs live on the system, you need to take a quick database
>> dump, or simply you collect more data than you expected.  It is not always
>> convenient to change the log location, or move tablespaces around to make
>> room.  In the cloud you might provision more volumes and attach them to the
>> server.  On a SAN you might attach more disk, and with a stand alone
>> server, you might stick more disks on the server.  In all those scenarios,
>> being able to simply merge them into your existing volume can be really
>> handy.
>>
>> 3) The main advantage of partitioning a single volume (these days) is
>> simply that if one partition fills up, it doesn't impact the rest of the
>> system.  Putting things that are likely to fill up the disk on their own
>> partition is generally a good practice.   User home directories is one
>> example.  System logs.  That sort of thing.  Isolating them on their own
>> partition will improve the long term reliability of your database.   The
>> main disadvantage is those things get boxed into a much smaller amount of
>> space than they would normally have if they could share a partition with
>> the whole system.
>>
>>
>> On Tue, Feb 23, 2016 at 11:28 PM, dstibrany <dstibr...@gmail.com> wrote:
>>
>>> I'm about to install a new production server and wanted some advice
>>> regarding
>>> filesystems and disk partitioning.
>>>
>>> The server is:
>>> - Dell PowerEdge R430
>>> - 1 x Intel Xeon E5-2620 2.4GHz
>>> - 32 GB RAM
>>> - 4 x 600GB 10k SAS
>>> - PERC H730P Raid Controller with 2GB cache
>>>
>>> The drives will be set up in one RAID-10 volume and I'll be installing
>>> Ubuntu 14.04 LTS as the OS. The server will be dedicated to running
>>> PostgreSQL.
>>>
>>> I'm trying to decide:
>>>
>>> 1) Which filesystem to use (most people seem to suggest xfs).
>>> 2) Whether to use LVM (I'm leaning against it because it seems like it
>>> adds
>>> additional complexity).
>>> 3) How to partition the volume. Should I just create one partition on /
>>> and
>>> create a 16-32GB swap partition? Any reason to get fancy with additional
>>> partitions given it's all on one volume?
>>>
>>> I'd like to keep things simple to start, but not shoot myself in the
>>> foot at
>>> the same time.
>>>
>>> Thanks!
>>>
>>> Dave
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://postgresql.nabble.com/Filesystem-and-Disk-Partitioning-for-New-Server-Setup-tp5889074.html
>>> Sent from the PostgreSQL - performance mailing list archive at
>>> Nabble.com.
>>>
>>>
>>> --
>>> Sent via pgsql-performance mailing list (
>>> pgsql-performance@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-performance
>>>
>>
>>
>>
>
>
> --
> *THIS IS A TEST*
>


Fwd: [PERFORM] Filesystem and Disk Partitioning for New Server Setup

2016-02-24 Thread Rick Otten
1) I'd go with xfs.  zfs might be a good alternative, but the last time I
tried it, it was really unstable (on Linux).  I may have gotten a lot
better, but xfs is a safe bet and well understood.

2) An LVM is just an extra couple of commands.  These days that is not a
lot of complexity given what you gain. The main advantage is that you can
extend or grow the file system on the fly.  Over the life of the database
it is quite possible you'll find yourself pressed for disk space - either
to drop in more csv files to load with the 'copy' command, to store more
logs (because you need to turn up logging verbosity, etc...), you need more
transaction logs live on the system, you need to take a quick database
dump, or simply you collect more data than you expected.  It is not always
convenient to change the log location, or move tablespaces around to make
room.  In the cloud you might provision more volumes and attach them to the
server.  On a SAN you might attach more disk, and with a stand alone
server, you might stick more disks on the server.  In all those scenarios,
being able to simply merge them into your existing volume can be really
handy.

3) The main advantage of partitioning a single volume (these days) is
simply that if one partition fills up, it doesn't impact the rest of the
system.  Putting things that are likely to fill up the disk on their own
partition is generally a good practice.   User home directories is one
example.  System logs.  That sort of thing.  Isolating them on their own
partition will improve the long term reliability of your database.   The
main disadvantage is those things get boxed into a much smaller amount of
space than they would normally have if they could share a partition with
the whole system.


On Tue, Feb 23, 2016 at 11:28 PM, dstibrany  wrote:

> I'm about to install a new production server and wanted some advice
> regarding
> filesystems and disk partitioning.
>
> The server is:
> - Dell PowerEdge R430
> - 1 x Intel Xeon E5-2620 2.4GHz
> - 32 GB RAM
> - 4 x 600GB 10k SAS
> - PERC H730P Raid Controller with 2GB cache
>
> The drives will be set up in one RAID-10 volume and I'll be installing
> Ubuntu 14.04 LTS as the OS. The server will be dedicated to running
> PostgreSQL.
>
> I'm trying to decide:
>
> 1) Which filesystem to use (most people seem to suggest xfs).
> 2) Whether to use LVM (I'm leaning against it because it seems like it adds
> additional complexity).
> 3) How to partition the volume. Should I just create one partition on / and
> create a 16-32GB swap partition? Any reason to get fancy with additional
> partitions given it's all on one volume?
>
> I'd like to keep things simple to start, but not shoot myself in the foot
> at
> the same time.
>
> Thanks!
>
> Dave
>
>
>
> --
> View this message in context:
> http://postgresql.nabble.com/Filesystem-and-Disk-Partitioning-for-New-Server-Setup-tp5889074.html
> Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Fwd: [PERFORM] Cloud versus buying my own iron

2016-02-24 Thread Rick Otten
Having gotten used to using cloud servers over the past few years, but been
a server hugger for more than 20 before that, I have to say the cloud
offers a number of huge advantages that would make me seriously question
whether there are very many good reasons to go back to using local iron at
all.  (Other than maybe running databases on your laptop for development
and testing purposes.)

Rackspace offers 'bare metal' servers if you want consistent performance.
You don't have to pay for a managed solution, there are a lot of tiers of
service.  AWS also offers solutions that are not on shared platforms.  (AWS
tends to be much more expensive and, in spite of the myriad [proprietary]
industry leading new features, actually a little less flexible and with
poorer support.)

The main advantage of cloud is the ability to be agile.  You can upsize,
downsize, add storage, move data centers, and adapt to changing business
requirements on the fly.   Even with overnight shipping and a minimal
bureaucracy - selecting new hardware, getting approval to purchase it,
ordering it, unboxing it, setting it up and testing it, and then finally
getting to installing software - can take days or weeks of your time and
energy.  In the cloud, you just click a couple of buttons and then get on
with doing the stuff that really adds value to your business.

I spent the better part of a couple of decades ordering servers and disks
and extra cpu boards for big and small companies and getting them in the
servers and provisioning them.   Now that I use the cloud I just reach over
with my mouse, provision a volume, attach it to the server, and voila -
I've averted a disk space issue.   I take an image, build a new server,
swing DNS, and - there you have it - I'm now on a 16 cpu system instead of
an 8 cpu system.  Hours, at most, instead of weeks.   I can spend my time
worrying about business problems and data science.

Every 6 months to a year both Rackspace and AWS offer new classes of
servers with new CPU's and faster backplanes and better performance for the
buck.  With only a little planning, you can jump into the latest hardware
every time they do so.  If you have your own iron, you are likely to be
stuck on the same hardware for 3 or more years before you can upgrade again.

If the platform you are on suffers a catastrophic hardware failure, it
usually only takes a few minutes to bring up a new server on new hardware
and be back and running again.

Yes, there is a premium for the flexibility and convenience.  Surprisingly
though, I think by the time you add in electricity and cooling and labor
and shipping and switches and racks and cabling, you may find that even
with their margin, their economy of scale actually offers a better total
real cost advantage.  (I've heard some arguments to the contrary, but I'm
not sure I believe them if the cloud infrastructure is well managed.)
 Throw in the instant deep technical support you can get from some place
like Rackspace when things go wrong, and I find few advantages to being a
server hugger any more.







On Wed, Feb 24, 2016 at 4:01 AM, Gunnar "Nick" Bluth <
gunnar.bluth.ext...@elster.de> wrote:

> Am 24.02.2016 um 06:06 schrieb Craig James:
> > At some point in the next year we're going to reconsider our hosting
> > environment, currently consisting of several medium-sized servers (2x4
> > CPUs, 48GB RAM, 12-disk RAID system with 8 in RAID 10 and 2 in RAID 1
> > for WAL). We use barman to keep a hot standby and an archive.
> >
> > The last time we dug into this, we were initially excited, but our
> > excitement turned to disappointment when we calculated the real costs of
> > hosted services, and the constraints on performance and customizability.
> >
> > Due to the nature of our business, we need a system where we can install
> > plug-ins to Postgres. I expect that alone will limit our choices. In
> > addition to our Postgres database, we run a fairly ordinary Apache web
> site.
> >
> > There is constant chatter in this group about buying servers vs. the
> > various hosted services. Does anyone have any sort of summary comparison
> > of the various solutions out there? Or is it just a matter of
> > researching it myself and maybe doing some benchmarking and price
> > comparisons?
>
> For starters, did you see Josh Berkus' presentation on the topic?
>   https://www.youtube.com/watch?v=WV5P2DgxPoI
>
> I for myself would probably always go the "own iron" road, but alas!
> that's just the way I feel about control. And I'm kind of a Linux
> oldshot, so managing a (hosted root) server doesn't scare me off.
>
> OTOH, I do see the advantages of having things like monitoring, backup,
> HDD replacements etc. done for you. Which is essentially what you pay for.
>
> In essence, there's obviously no silver bullet ;-)
>
> Best regards,
> --
> Gunnar "Nick" Bluth
> DBA ELSTER
>
> Tel:   +49 911/991-4665
> Mobil: +49 172/8853339
>
>
> --
> Sent via pgsql-performance mailing list 

Re: [PERFORM] Connections "Startup"

2015-12-22 Thread Rick Otten
You can definitely overload most systems by trying to start too many
connections at once.  (This is actually true for most relational
databases.)  We used to see this scenario when we'd start a bunch web
servers that used preforked apache at the same time (where each fork had
its own connection).  One temporary workaround is to slow-start the web
cluster - bringing the servers up one at a time and giving each a chance to
finish connecting.

You can kill the processes by looking for them on the unix prompt instead
of inside the database. ( 'ps -fu postgres' ) You can see where they are
coming from using something like 'netstat -an | grep 5432' (or whatever
port your database is listening on).

pgbouncer is a great solution for managing large connection sets that come
and go often.  It will really help.  You can run it directly on each of the
web servers or client systems, you can run it in between on its own
system(s), or you can run it on the database server (if necessary).  You'll
want to tune it so it only opens as many connections as you expect to be
running concurrent queries.  It takes a little experimenting to figure out
the optimum settings.   If you start pgbouncer first, you can bring up lots
of concurrent connections to pgbouncer, and you will hardly notice it on
the database.

Trying to stay current with the latest patches and releases is a lot of
work and little appreciated.  However, in the long run it is far easier to
tackle this incrementally than trying to do one big upgrade - skipping a
bunch of releases - every now and then.  This is true for the OS as well as
the Database.  It is not always possible to do an upgrade, and when it is,
it can take months of planning.  Hopefully you aren't in that situation.
Building processes that make these patches and upgrades routine is much
saner if you can.   One nice thing about having pgbouncer in between the
application and the database is you can reconfigure pgbouncer to talk to a
different database and you won't have to touch the application code at
all.  Sometimes that is easier to accomplish politically.   Swapping out a
database which is running behind a cluster of application servers with
minimal risk and minimal downtime is a technical as well as political
challenge, but worth it when you can get on the latest and greatest.  Good
Luck!







On Tue, Dec 22, 2015 at 3:09 AM, Pavel Stehule 
wrote:

> Hi
>
> 2015-12-22 8:59 GMT+01:00 Artem Tomyuk :
>
>> Hi.
>>
>> I've noticed a huge decrease in performance.
>> During this, in htop I see a lot (200 - 300) of connections in state
>> "startup", each of them eating 3-3% of CPU time. These processes are not
>> visible in pg_stat_activity so I can't understand what they are doing, and I
>> can't kill them. I can't see a bottleneck in disk IO either. The postgres logs
>> say nothing either. I am confused.
>> What can be the cause of a huge amount of "startup" connections?
>> Maybe it's better to start using a connection pooler such as pgbouncer?
>> Thanks a lot.
>>
>
> What is your max_connections? Can you ran "perf top" ? What is there.
>
> Too high number can enforce system overloading. You cannot to see these
> connections in pg_stat_activity because the process in this state isn't
> fully initialized.
>
> There was lot of bugfix releases after 9.1.2 - currently there is
> PostgreSQL 9.2.19. Try to upgrade first.
>
> Regards
>
> Pavel
>
>
>>
>> PS.
>> Server config is:
>> 2 * Intel Xeon 2660 CPU with 64 gigs of RAM.
>> Hardware RAID10.
>> Centos 6.6, PostgreSQL 9.1.2
>>
>>
>>
>>
>


Re: [PERFORM] partitioned table set and indexes

2015-12-11 Thread Rick Otten
Ok, here is the first case where I select on the column:

http://explain.depesz.com/s/ECb

Here is the second case where I try a join:

http://explain.depesz.com/s/qIu

And here is the third case where I add a filter on the parent table:

   http://explain.depesz.com/s/1es

The primary use case for partitioning is for performance gains when working
with very large tables.  I agree these are not that large and by itself it
does not justify the extra complexity of working with partitioning.

However there are other use cases for the partitioning model.  In our case
we have legacy business processes that swap out the child tables and
operate on them independently from each other.  They could be refactored to
work together within one big table, but that is a project for another day.
  The segmentation of the data into structurally consistent but related
separate tables is the first step in that direction.  (previously they were
all different from each other, but similar too)   Some of these children
tables will hit 1M rows by the end of 2016, but it will take a while for
them to grow to that size.

I do have another table with many millions of rows that could use
partitioning, and eventually I'll split that one up - probably around the
time I merge this one into a single table.  First I have to finish getting
everything off of MySQL...

The query performance hit for sequence scanning isn't all that terrible,
but I'd rather understand and get rid of the issue if I can, now, before I
run into it again in a situation where it is crippling.

Thank you for your help with this!

--

ps:  You don't have to believe me about the bulk index adding thing.  I
hardly believe it myself. It is just something to keep an eye out for.  If
it is a real issue, I ought to be able to build a reproducible test case to
share - at that time I'll see if I can open it up as a real bug.  For now
I'd rather focus on understanding why my select uses an index and a join
won't.



On Fri, Dec 11, 2015 at 4:44 PM, Andreas Kretschmer <andr...@a-kretschmer.de
> wrote:

>
>
> > Rick Otten <rottenwindf...@gmail.com> hat am 11. Dezember 2015 um 21:40
> > geschrieben:
> >
> >
> > I do not know why if I blast a new index creation on the 20 or so
> children
> > all at once some of them fail, but then if I go back and do a few at a
> time
> > they all work.  It has happened to me 3 times now, so I'm pretty sure I'm
> > not imagining it.
>
> don't believe that, sorry.
>
>
> >
> > What specifically in the explain analyze output tells you that it is
> using
> > a sequence scan instead of an index scan _because_ there are too few
> rows?
> > I can see where it chooses a sequence scan over an index and I know there
> > are only a few rows in those tables, but I'm not sure how the explain
> > output tells you that it made that choice on purpose.
>
> a sequential scan over a small table is cheaper than an index-scan. Imagine a
> small table,
> only 3 rows. Fits in one page. It's cheaper to read just this page than
> read the
> index
> plus read the table to put out the result row.
>
>
> Why are you using partitioning? That only makes sense with large child-tables
> (more than 1 million rows or so)
> and if you have a proper partitioning schema.
>
>
>
> >
> > Why would the select statement use the index, but not the join?
> >
> > There used to be an explain output anonymizer tool, if I can find that
> > again, I'll send along the output.  It has been a few years since I
> posted
> > a question to this list so I don't think I have a bookmark for it any
> > more Hmmm.  I'll look around.
>
>
> http://explain.depesz.com/
>


Re: [PERFORM] partitioned table set and indexes

2015-12-11 Thread Rick Otten
I do not know why if I blast a new index creation on the 20 or so children
all at once some of them fail, but then if I go back and do a few at a time
they all work.  It has happened to me 3 times now, so I'm pretty sure I'm
not imagining it.

What specifically in the explain analyze output tells you that it is using
a sequence scan instead of an index scan _because_ there are too few rows?
I can see where it chooses a sequence scan over an index and I know there
are only a few rows in those tables, but I'm not sure how the explain
output tells you that it made that choice on purpose.

Why would the select statement use the index, but not the join?

There used to be an explain output anonymizer tool, if I can find that
again, I'll send along the output.  It has been a few years since I posted
a question to this list so I don't think I have a bookmark for it any
more.  Hmmm, I'll look around.

Meanwhile:

--

  select
      *
  from
      my_parent_table
  where
      mypk = 'something';

Uses an index scan on each of my_parent_table's children except for a
couple of them that don't have a lot of rows, and those are sequence
scanned.  (which is ok)

--

  select
      *
  from
      some_other_table sot
      join my_parent_table mpt on sot.some_column = mpt.mypk
  where
      sot.another_column = 'q'

Sequence scans each of my_parent_table's children.  (It doesn't matter
which order I put the join.)

--

  select
      *
  from
      some_other_table sot
      join my_parent_table mpt on sot.some_column = mpt.mypk
  where
      mpt.column_3 = 'z'
      and sot.another_column = 'q'

Index scans my_parent_table's children on column_3 (except for the couple
with only a few rows), and doesn't sequence scan for the mypk column at all.



On Fri, Dec 11, 2015 at 2:44 PM, Andreas Kretschmer <
akretsch...@spamfence.net> wrote:

> Rick Otten <rottenwindf...@gmail.com> wrote:
>
> > I'm using PostgreSQL 9.5 Beta 2.
> >
> > I am working with a partitioned table set.
> >
> > The first thing I noticed, when creating indexes on the 20 or so
> partitions,
> > was that if I create them too fast they don't all succeed.  I have to do
> a few
> > at a time, let them breathe for a few seconds, and then do a few more.
> I had
> > been simply generating all of the create index commands in a text
> editor, and
> > then cutting and pasting the lot of them into psql all at once or
> running them
> > by using psql '-f'.  Most would get created, but not all.  It seems
> almost
> > random.  There were no obvious error messages.  When I do a few at a
> time, it
> > is never an issue.
>
> Sure? Have you checked that?
>
>
> > If I do a simple query with a where clause on a specific column from the
> parent
> > table, I can see it index scan each of the children.  This is what I
> want it to
> > do, so no complaints there.
> >
> > However, if I try to (inner) join another table with that column, the
> planner
> > sequence scans each of the children instead of using the indexes.  I saw
> > someone had posted a similar question to this list back in January,
> however I
> > didn't see the answer.
>
> Show us the output from explain analyse 
>
>
> > FWIW, the column in question is a UUID column and is the primary key for
> each
> > of the child tables.
>
>
> PostgreSQL uses a cost model, so maybe there are not enough rows in
> the table. That's just a guess, you can see that with explain analyse
> ...
>
>
> Andreas
> --
> Really, I'm not out to destroy Microsoft. That will just be a completely
> unintentional side effect.  (Linus Torvalds)
> "If I was god, I would recompile penguin with --enable-fly."   (unknown)
> Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


[PERFORM] partitioned table set and indexes

2015-12-11 Thread Rick Otten
I'm using PostgreSQL 9.5 Beta 2.

I am working with a partitioned table set.

The first thing I noticed, when creating indexes on the 20 or so
partitions, was that if I create them too fast they don't all succeed.  I
have to do a few at a time, let them breathe for a few seconds, and then do
a few more.   I had been simply generating all of the create index commands
in a text editor, and then cutting and pasting the lot of them into psql
all at once or running them by using psql '-f'.  Most would get created,
but not all.  It seems almost random.  There were no obvious error
messages.  When I do a few at a time, it is never an issue.

This tripped me up because I couldn't figure out why some of the child
tables were sequence scanning and some were not.  It turned out that some
of the partitions were missing some of the indexes.   I'm mentioning it
here just in case someone else is observing strange behaviour where some
children are scanning and some aren't.  You might not have all of your
indexes deployed correctly.

--

Anyway, the issue I am trying to figure out at the moment:

If I do a simple query with a where clause on a specific column from the
parent table, I can see it index scan each of the children.  This is what I
want it to do, so no complaints there.

However, if I try to (inner) join another table with that column, the
planner sequence scans each of the children instead of using the indexes.
I saw someone had posted a similar question to this list back in January,
however I didn't see the answer.

What should I look at to try to figure out why a join doesn't use the
indexes while a straight query on the same column for the table does?

FWIW, the column in question is a UUID column and is the primary key for
each of the child tables.

--
Rick.


Re: [PERFORM] partitioned table set and indexes

2015-12-11 Thread Rick Otten
Why does it index scan when I use where, but not when I do a join?

On Fri, Dec 11, 2015 at 7:20 PM, Andreas Kretschmer <andr...@a-kretschmer.de
> wrote:

>
> > Rick Otten <rottenwindf...@gmail.com> hat am 11. Dezember 2015 um 23:09
> > geschrieben:
>
> >
> > The query performance hit for sequence scanning isn't all that terrible,
> > but I'd rather understand and get rid of the issue if I can, now, before
> I
> > run into it again in a situation where it is crippling.
>
> i think, you should try to understand how the planner works.
>
> a simple example:
>
> test=# create table foo (id serial primary key, val text);
> CREATE TABLE
> test=*# insert into foo (val) select repeat(md5(1::text), 5);
> INSERT 0 1
> test=*# analyse foo;
> ANALYZE
> test=*# explain analyse select val from foo where id=1;
>   QUERY PLAN
>
> ---
>  Seq Scan on foo  (cost=0.00..1.02 rows=1 width=164) (actual
> time=0.006..0.007
> rows=1 loops=1)
>Filter: (id = 1)
>Rows Removed by Filter: 1
>  Planning time: 0.118 ms
>  Execution time: 0.021 ms
> (5 rows)
>
>
> As you can see a seq-scan. It's a small table, costs ..1.02.
>
> Adding one row:
>
> test=*# insert into foo (val) select val from foo;
> INSERT 0 1
> test=*# analyse foo;
> ANALYZE
> test=*# explain analyse select val from foo where id=1;
>   QUERY PLAN
>
> ---
>  Seq Scan on foo  (cost=0.00..1.02 rows=1 width=164) (actual
> time=0.006..0.007
> rows=1 loops=1)
>Filter: (id = 1)
>Rows Removed by Filter: 1
>  Planning time: 0.118 ms
>  Execution time: 0.021 ms
> (5 rows)
>
>
> The same plan. Adding 2 rows:
>
> test=*# insert into foo (val) select val from foo;
> INSERT 0 2
> test=*# analyse foo;
> ANALYZE
> test=*# explain analyse select val from foo where id=1;
>   QUERY PLAN
>
> ---
>  Seq Scan on foo  (cost=0.00..1.05 rows=1 width=164) (actual
> time=0.220..0.277
> rows=1 loops=1)
>Filter: (id = 1)
>Rows Removed by Filter: 3
>  Planning time: 0.149 ms
>  Execution time: 0.453 ms
> (5 rows)
>
>
> The same plan. Adding more rows:
>
> test=*# insert into foo (val) select val from foo;
> INSERT 0 4
> test=*# insert into foo (val) select val from foo;
> INSERT 0 8
> test=*# insert into foo (val) select val from foo;
> INSERT 0 16
> test=*# insert into foo (val) select val from foo;
> INSERT 0 32
> test=*# insert into foo (val) select val from foo;
> INSERT 0 64
> test=*# insert into foo (val) select val from foo;
> INSERT 0 128
> test=*# insert into foo (val) select val from foo;
> INSERT 0 256
> test=*# insert into foo (val) select val from foo;
> INSERT 0 512
> test=*# insert into foo (val) select val from foo;
> INSERT 0 1024
> test=*# insert into foo (val) select val from foo;
> INSERT 0 2048
> test=*# insert into foo (val) select val from foo;
> INSERT 0 4096
> test=*# analyse foo;
> ANALYZE
> test=*# explain analyse select val from foo where id=1;
>QUERY PLAN
>
> 
>  Index Scan using foo_pkey on foo  (cost=0.28..8.30 rows=1 width=164)
> (actual
> time=0.007..0.008 rows=1 loops=1)
>Index Cond: (id = 1)
>  Planning time: 0.120 ms
>  Execution time: 0.024 ms
> (4 rows)
>
>
> We got a new plan! Index-Scan now. We are looking now in pg_class to see
> how
> many rows and pages we have:
>
> test=*# select relpages, reltuples from pg_class where relname = 'foo';
>  relpages | reltuples
> --+---
>   200 |  8192
> (1 row)
>
> How large ist the Index?
>
> test=*# select relpages, reltuples from pg_class where relname =
> 'foo_pkey';
>  relpages | reltuples
> --+---
>25 |  8192
> (1 row)
>
>
>
> So, now it's cheaper to read the index and then do an index-scan on the heap to
> read one record (our where-condition is on the primary key, so only one row is
> expected, and only one page has to be read with random access)
>
>
>
> It's simple math! If you want to learn more you can find a lot about that
> via
> google:
>
> https://www.google.de/?gws_rd=ssl#q=explaining+explain
>


Re: [PERFORM] Performance autovaccum

2013-07-10 Thread Rick Otten
In our use case, the default autovacuum settings did not work, I guess we are 
in the 5% group of users.  The default settings were too aggressive when it ran 
against some of our larger tables (example:  100M rows by 250 columns) in our 
front end OLTP database causing severe performance degradation.   We had to 
throttle it back (and haven't had problems since).

You should definitely run autovacuum.  If you are only able to experiment with 
it in production, I recommend taking it slow at first, and gradually making it 
more aggressive after you have a good handle on the impact (and observe it 
running on the larger critical tables in your data inventory without impact).  
You can start with the defaults, they aren't too bad.  In our case - a backend 
for a high performance, highly available website with a set of webservers that 
are very sensitive to even slight query time changes, the default settings 
simply consumed too much overhead.  I think in the end all we had to change was 
the autovacuum_vacuum_cost_delay to make enough difference to keep our site up 
and running.  You should review the tuning options though.
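
As a rough illustration of that kind of throttling (the table name and the
numbers here are hypothetical, not a recommendation):

    -- globally, in postgresql.conf:
    --   autovacuum_vacuum_cost_delay = 20ms
    -- or per table, via storage parameters:
    alter table big_oltp_table
        set (autovacuum_vacuum_cost_delay = 20, autovacuum_vacuum_cost_limit = 200);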

The problem is if you kill the autovacuum process because you suspect it is 
causing issues during a crisis, the autovacuumer will just start it back up 
again a few minutes later.  If you disable it permanently your query 
performance will likely slowly degrade.  You need to find somewhere in between.

Oh - one last note:  If you have a partitioned table (partitioned by date), 
autovacuum will not run against the older partitions (because they are no 
longer changing).  If you've had autovacuum off for a while, you may need to go 
back and manually vacuum analyze the older partitions to clean them up after 
you get autovacuum running.  (ditto for other old tables that are no longer 
changing)
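
Something along these lines, with hypothetical table names:

    -- see which tables autovacuum has actually touched
    select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
    from pg_stat_user_tables
    order by relname;

    -- then clean up an older, no-longer-written partition by hand
    vacuum analyze orders_2012_q4;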


-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Joshua D. Drake
Sent: Tuesday, July 09, 2013 8:42 PM
To: Josh Berkus
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance autovaccum


On 07/09/2013 03:14 PM, Josh Berkus wrote:

 On 07/08/2013 09:14 AM, Jeison Bedoya wrote:
 Hi, I have a postgresql 9.2.2, but I don't use autovacuum and I want 
 to begin to use it.  Some recommendation about the optimal configuration?
 Or some link to explain it?

 Initial configuration:

 autovacuum = on

 There, you're done.  You only do something else if the default 
 configuraiton is proven not to work for you.


Well, and a restart of PostgreSQL. It should also be noted that autovacuum by 
default is on. You can check to see if it is currently running for you by 
issuing the following command from psql:

show autovacuum;

Other than that JoshB is correct. The default settings for autovacuum work for 
the 95% of users out there.

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579 PostgreSQL 
Support, Training, Professional Services and Development High Availability, 
Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that 
blossoms
a rose in the deeps of my heart. - W.B. Yeats


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server setup

2013-03-15 Thread Rick Otten
 I'm not convinced about the need for BBU with SSD - you *can* use them 
 without one, just need to make sure about suitable longevity and also 
 the presence of (proven) power off protection (as discussed 
 previously). It is worth noting that using unproven SSDs, or SSDs known to be 
 lacking power-off protection, with a BBU will *not* save you from 
 massive corruption (or device failure) upon unexpected power loss.

I don't think any drive that corrupts on power-off is suitable for a database, 
but for non-db uses, sure, I guess they are OK, though you have to be pretty 
money-constrained to like that tradeoff.

Wouldn't mission critical databases normally be configured in a high 
availability cluster - presumably with replicas running on different power 
sources?

If you lose power to a member of the cluster (or even the master), you would 
have new data coming in and stuff to do long before it could come back online - 
corrupted disk or not.

I find it hard to imagine configuring something that is too critical to be able 
to be restored from periodic backup to NOT be in a (synchronous) cluster.  I'm 
not sure what all the fuss over whether an SSD might come back after a hard server 
failure is really about.  You should architect the solution so you can lose the 
server and throw it away and never bring it back online again.  Native 
streaming replication is fairly straightforward to configure.   Asynchronous 
multimaster (albeit with some synchronization latency) is also fairly easy to 
configure using third party tools such as SymmetricDS.

Agreed that adding a supercap doesn't sound like a hard thing for a hardware 
manufacturer to do, but I don't think it should necessarily be a showstopper 
for being able to take advantage of some awesome I/O performance opportunities.







-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] NEED REPLICATION SOLUTION -POSTGRES 9.1

2012-11-28 Thread Rick Otten
I recommend SymmetricDS - http://www.symmetricds.org


-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of suhas.basavaraj12
Sent: Wednesday, November 28, 2012 1:12 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] NEED REPLICATION SOLUTION -POSTGRES 9.1

Hi, 

We are planning to migrate our production databases to different servers. We 
have around 8 servers with 8 different clusters. We are planning to shuffle 
databases, consolidate them into 7 clusters, and migrate to new remote servers. 
We cannot use streaming replication as we are migrating different databases 
from different clusters to one single cluster. This will result in huge 
downtime as the data is huge. 
Need expert advice on this scenario. Can we reduce downtime in any way?

Rgrds
Suhas



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/NEED-REPLICATION-SOLUTION-POSTGRES-9-1-tp5733939.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] average query performance measuring

2012-08-22 Thread Rick Otten
Thanks!  That looks like a handy tool.  

I think in this case we'll wait for 9.2.  We are looking forward to it.
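
For the record, a minimal sketch of what that looks like once 9.2's
pg_stat_statements is in place (it requires shared_preload_libraries =
'pg_stat_statements' and a restart; timings are in milliseconds, column names
as of 9.2):

    create extension pg_stat_statements;

    -- average time per normalized query, worst offenders first
    select query,
           calls,
           total_time / calls as avg_ms
    from pg_stat_statements
    order by total_time desc
    limit 20;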


-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Tomas Vondra
Sent: Tuesday, August 21, 2012 5:08 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] average query performance measuring

On 21.8.2012 20:35, Rick Otten wrote:
 I have a PostgreSQL 9.1 cluster.  Each node is serving around 1,000 
 queries per second when we are at a 'steady state'.
 
 What I'd like to know is the average query time.  I'd like to see if 
 query performance is consistent, or if environmental changes, or code
 releases, are causing it to drift, spike, or change.   I'd also like to
 be able to compare the (real) query performance on the different nodes.
 
 I know I can put some sort of query wrapper at the application layer 
 to gather and store timing info.  (I'm not sure yet how the 
 application would know which node the query just ran on since we are using 
 pgpool
 between the app and the db.)   I'd much rather get something directly
 out of each database node if I can.
 
 Turning on statement logging crushes the database performance, so I 
 don't want to do that either.  (Not to mention I'd still have to parse 
 the logs to get the data.)
 
 It seems like we almost have everything we need to track this in the 
 stats tables, but not quite.  I was hoping the folks on this list 
 would have some tips on how to get query performance trends over time 
 out of each node in my cluster.

As others already mentioned, the improvements in pg_stat_statements by Peter 
Geoghegan in 9.2 are the first thing you should look into, I guess.
Especially if you're looking for per-query stats.

If you're looking for global stats, you might be interested in an extension I 
wrote a few months ago and collects query histogram. It's available on 
pgxn.org: http://pgxn.org/dist/query_histogram/

The question is whether tools like this can give you reliable answers to your 
questions - that depends on your workload (how much it varies) etc.

Tomas


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] average query performance measuring

2012-08-21 Thread Rick Otten
I have a PostgreSQL 9.1 cluster.  Each node is serving around 1,000 queries per 
second when we are at a 'steady state'.

What I'd like to know is the average query time.  I'd like to see if query 
performance is consistent, or if environmental changes, or code releases, are 
causing it to drift, spike, or change.   I'd also like to be able to compare 
the (real) query performance on the different nodes.

I know I can put some sort of query wrapper at the application layer to gather 
and store timing info.  (I'm not sure yet how the application would know which 
node the query just ran on since we are using pgpool between the app and the 
db.)   I'd much rather get something directly out of each database node if I 
can.

Turning on statement logging crushes the database performance, so I don't want 
to do that either.  (Not to mention I'd still have to parse the logs to get the 
data.)

It seems like we almost have everything we need to track this in the stats 
tables, but not quite.  I was hoping the folks on this list would have some 
tips on how to get query performance trends over time out of each node in my 
cluster.

Thanks!

--
Rick Otten
Data-Systems Engineer
rot...@manta.com
Manta.com - http://manta.com/?referid=emailSig - Where Small Business Grows(tm)