Performance regression related to parallel execution

2019-03-11 Thread Jinho Jung
Hello, We noticed that the following SQL queries are running 3 times slower on the latest version of PostgreSQL. Here’s the time taken to execute them on older (v9.5.16) and newer versions (v11.2) of PostgreSQL (in milliseconds):

Performance regressions found using sqlfuzz

2019-03-06 Thread Jinho Jung
On-going discussion: https://www.postgresql.org/message-id/flat/BN6PR07MB3409EE6CAAF8CCF43820AFB9EE670%40BN6PR07MB3409.namprd07.prod.outlook.com#acc68f0fbd8f0b207e162d2dd401d3e8 Hello Andres, Could you please share your thoughts on QUERY 3? The performance i

Re: JIT performance question

2019-03-06 Thread Andres Freund
Hi, On 2019-03-06 19:21:33 +0100, Tobias Gierke wrote: > On 06.03.19 18:42, Andres Freund wrote: > > > > It's hard to know precisely without running a profile of the > > workload. My suspicion is that the bottleneck in this query is the use > > of numeric, which has fairly slow operations,

Re: JIT performance question

2019-03-06 Thread Tobias Gierke
On 06.03.19 18:42, Andres Freund wrote: It's hard to know precisely without running a profile of the workload. My suspicion is that the bottleneck in this query is the use of numeric, which has fairly slow operations, including aggregation. And they're too complicated to be inlined. Generally

Re: JIT performance question

2019-03-06 Thread Andres Freund
vacuum & analyze afterwards) > > 3.) Disable parallel workers to just measure JIT performance via 'set > max_parallel_workers = 0' FWIW, it's better to do that via max_parallel_workers_per_gather in most cases, because creating a parallel plan and then not using that will have its own c
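The distinction Andres draws above can be sketched per session; a minimal example (the table and query are illustrative, not from the thread):

```sql
-- Preferred: stop the planner from producing parallel plans at all,
-- so the JIT cost is measured without parallel-plan overhead.
SET max_parallel_workers_per_gather = 0;

-- By contrast, max_parallel_workers = 0 still lets the planner build
-- a parallel plan that then cannot launch any workers, which carries
-- its own cost.
EXPLAIN ANALYZE SELECT sum(data1) FROM jit_test;  -- hypothetical table
```

Resetting afterwards with `RESET max_parallel_workers_per_gather;` restores the default for the session.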

JIT performance question

2019-03-06 Thread Tobias Gierke
numeric,data1 numeric,data2 numeric,data3 numeric,...,data192 numeric,data193 numeric,data194 numeric); 2.) bulk-loaded (via COPY) 2 million rows of randomly generated data into this table (and ran vacuum & analyze afterwards) 3.) Disable parallel workers to just measure JIT performance via

Re: Performance regressions found using sqlfuzz

2019-02-28 Thread Jung, Jinho
Hi Andres: Could you please share your thoughts on QUERY 3? The performance impact of this regression increases *linearly* on larger databases. We concur with Andrew in that this is related to the lack of a Materialize node and mis-costing of the Nested Loop Anti-Join. We found more than 20

Re: neither CPU nor IO bound, but throttled performance

2019-02-21 Thread Gunther Schadow
Thank you Magnus. 68% steal. Indeed. You probably hit the target. Yes. That explains the discrepancy. I need to watch and understand that CPU credits issue. regards, -Gunther On 2/21/2019 4:08, Magnus Hagander wrote: On Thu, Feb 21, 2019 at 12:34 AM Gunther wrote:

Re: neither CPU nor IO bound, but throttled performance

2019-02-21 Thread Magnus Hagander
On Thu, Feb 21, 2019 at 12:34 AM Gunther wrote: > Hi, I have an Amazon Linux based Postgresql 11 server here on a t2.medium > EC2 instance. > > It is serving 24 worker processes that read jobs from a queue (thanks to > SELECT ... FOR UPDATE SKIP LOCKED!) and do jobs some of which are reading >

Re: neither CPU nor IO bound, but throttled performance

2019-02-20 Thread Justin Pryzby
this here so I could see which tables or indexes would be bottlenecks. What was the old storage configuration and what is it now ? > So how can it be that queries take quite long without the process running at > higher CPU%? You said everything flows nicely, but take a long time, and

Re: How can sort performance be so different

2019-02-20 Thread Peter Geoghegan
On Wed, Feb 20, 2019 at 2:25 PM Peter Geoghegan wrote: > I suspect that the reasons why the Lao locale sorts so much slower may > also have something to do with the intrinsic cost of supporting more > complicated rules. I strongly suspect that it has something to do with the issue described here

neither CPU nor IO bound, but throttled performance

2019-02-20 Thread Gunther
Hi, I have an Amazon Linux based Postgresql 11 server here on a t2.medium EC2 instance. It is serving 24 worker processes that read jobs from a queue (thanks to SELECT ... FOR UPDATE SKIP LOCKED!) and do jobs some of which are reading and writing business data to the database, others are only
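The queue pattern Gunther refers to is the standard `SKIP LOCKED` consumer loop; a minimal sketch (table and column names are assumptions, not taken from the thread):

```sql
-- Each worker claims one pending job without blocking on rows that
-- other workers have already locked.
BEGIN;

SELECT id, payload
FROM jobs                  -- hypothetical queue table
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ...process the job, then mark it done before committing...
UPDATE jobs SET status = 'done' WHERE id = 42;  -- id from the SELECT

COMMIT;
```

Because locked rows are skipped rather than waited on, 24 workers can drain the queue concurrently without lock contention serializing them.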

Re: How can sort performance be so different

2019-02-20 Thread Peter Geoghegan
On Wed, Feb 20, 2019 at 1:42 PM Bob Jolliffe wrote: > It seems not to be (completely) particular to the installation. > Testing on different platforms we found variable speed difference > between 100x and 1000x slower, but always a considerable order of > magnitude. The very slow

Re: How can sort performance be so different

2019-02-20 Thread Bob Jolliffe
other (generally western) collations. > Did not observe anything nearly so bad. What I'm hoping is that this > is some kind of weird performance issue specific to your installation; > in the worst (unfortunately likely) case we are looking at something > specific to your specific

Re: How can sort performance be so different

2019-02-20 Thread Merlin Moncure
having solved the immediate problem we haven't investigated much > further yet. > > Not sure what exactly you mean by "other conversions"? I hand tested similar query for other (generally western) collations. Did not observe anything nearly so bad. What I'm hoping is that this is so

Re: Performance regressions found using sqlfuzz

2019-02-19 Thread Jung, Jinho
Andres, Andrew, and Tom: Thanks for your insightful comments! We conducted additional analysis based on your comments and would like to share the results. We would also like to get your feedback on a few design decisions to increase the utility of our performance regression reports

Re: Performance regressions found using sqlfuzz

2019-02-16 Thread Tom Lane
Andres Freund writes: > On 2019-02-14 17:27:40 +, Jung, Jinho wrote: >> - Our analysis: We believe that this regression has to do with two factors: >> 1) conditional expressions (e.g., LEAST or NULLIF) are not reduced to >> constants unlike string functions (e.g., CHAR_LENGTH) 2) change in

Re: Performance regressions found using sqlfuzz

2019-02-15 Thread Andres Freund
-> Index Scan using item_pkey on item ref_0 > (cost=0.29..8.31 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1) > Index Cond: (i_id = 10) > Planning Time: 0.341 ms > Execution Time: 896.662 ms This seems perfectly alright

Re: Performance regressions found using sqlfuzz

2019-02-15 Thread Jung, Jinho
width=20) (actual time=0.000..0.000 rows=0 loops=30044) One-Time Filter: false Planning Time: 0.350 ms Execution Time: 79.237 ms From: Jeff Janes Sent: Tuesday, February 12, 2019 1:03 PM To: Jung, Jinho Cc: pgsql-performa...@postgresql.o

Re: Q on SQL Performance tuning

2019-02-14 Thread Greg Stark
On Sun, 27 Jan 2019 at 06:29, legrand legrand wrote: > > Hi, > > There are many tools: > - (core) extension pg_stat_statements will give you informations of SQL > executions, I've had enormous success using pg_stat_statements and gathering the data over time in Prometheus. That let me build a
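The `pg_stat_statements` setup Greg describes can be sketched as follows (it also requires `shared_preload_libraries = 'pg_stat_statements'` in postgresql.conf plus a restart; column names shown are the pre-PG13 ones current at the time of the thread, later releases renamed them to `total_exec_time`/`mean_exec_time`):

```sql
-- One-time setup in the target database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- The statements consuming the most cumulative execution time:
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```

Scraping this view periodically (as Greg does with Prometheus) turns the cumulative counters into per-interval rates.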

Re: Performance regressions found using sqlfuzz

2019-02-12 Thread Jeff Janes
On Tue, Feb 12, 2019 at 4:23 AM Jung, Jinho wrote: > > Hello, > > We are developing a tool called sqlfuzz for automatically finding > performance regressions in PostgreSQL. sqlfuzz performs mutational fuzzing > to generate SQL queries that take more time to execute on th

Re: Performance regressions found using sqlfuzz

2019-02-12 Thread Christoph Berg
Re: Jung, Jinho 2019-02-11 > We are developing a tool called sqlfuzz for automatically finding performance > regressions in PostgreSQL. sqlfuzz performs mutational fuzzing to generate > SQL queries that take more time to execute on the latest version of > PostgreSQL compared to pr

Performance regressions found using sqlfuzz

2019-02-12 Thread Jung, Jinho
Hello, We are developing a tool called sqlfuzz for automatically finding performance regressions in PostgreSQL. sqlfuzz performs mutational fuzzing to generate SQL queries that take more time to execute on the latest version of PostgreSQL compared to prior versions. We hope that these queries

Re: How can sort performance be so different

2019-02-06 Thread Laurenz Albe
Merlin Moncure wrote: > yeah, probably. Having said that, I'm really struggling that it can > take take several minutes to sort such a small number of rows even > with location issues. I can sort rocks faster than that :-). > > Switching between various european collations, I'm seeing subsecond

Re: How can sort performance be so different

2019-02-05 Thread Merlin Moncure
On Thu, Jan 31, 2019 at 7:30 AM Bob Jolliffe wrote: > > Hi Peter > > I did check out using ICU and the performance does indeed seem > comparable with C locale: > > EXPLAIN ANALYZE select * from chart order by name COLLATE "lo-x-icu"; >

Re: How can sort performance be so different

2019-01-31 Thread Bob Jolliffe
Hi Peter I did check out using ICU and the performance does indeed seem comparable with C locale: EXPLAIN ANALYZE select * from chart order by name COLLATE "lo-x-icu";
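Beyond the per-query `COLLATE` clause Bob tested, the ICU collation can be attached to the column so that ordinary queries and indexes benefit; a sketch assuming the `chart.name` column from the thread (requires PostgreSQL 10+ built with ICU support):

```sql
-- Per-query, as in the thread:
EXPLAIN ANALYZE SELECT * FROM chart ORDER BY name COLLATE "lo-x-icu";

-- Or make the ICU collation the column's default, so every ORDER BY
-- and index on the column uses it:
ALTER TABLE chart ALTER COLUMN name TYPE text COLLATE "lo-x-icu";
CREATE INDEX ON chart (name);  -- matches the column's collation
```

The index can then satisfy `ORDER BY name` directly, avoiding the sort entirely.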

Re: How can sort performance be so different

2019-01-30 Thread Bob Jolliffe
Hi Tom After much performance measuring of VPS I believe you are right in your suspicion about locale. The database is full of Laos characters (it is a government system in Laos). When I tested on my VPS (en_US.UTF-8) I get the crazy slow performance, whereas my laptop postgresql is C.UTF-8

Re: How can sort performance be so different

2019-01-29 Thread Tom Lane
Bob Jolliffe writes: > I wonder what can cause such a massive discrepancy in the sort time. Are you using the same locale (LC_COLLATE) setting on both machines? Some locales sort way slower than C locale does. That's not enough to explain a 1000X discrepancy --- I concur with the other opinions

Re: How can sort performance be so different

2019-01-29 Thread Saurabh Nanda
Run https://github.com/n-st/nench and benchmark the underlying vps first. On Tue 29 Jan, 2019, 11:59 PM Bob Jolliffe wrote: > The following is output from analyzing a simple query on a table of > 13436 rows on postgresql 10, ubuntu 18.04. > > explain analyze select * from chart order by name; >

Re: How can sort performance be so different

2019-01-29 Thread Pavel Stehule
On Tue, Jan 29, 2019 at 19:29 Bob Jolliffe wrote: > The following is output from analyzing a simple query on a table of > 13436 rows on postgresql 10, ubuntu 18.04. > > explain analyze select * from chart order by name; > QUERY PLAN >

How can sort performance be so different

2019-01-29 Thread Bob Jolliffe
The following is output from analyzing a simple query on a table of 13436 rows on postgresql 10, ubuntu 18.04. explain analyze select * from chart order by name; QUERY PLAN

Re: Q on SQL Performance tuning

2019-01-27 Thread Justin Pryzby
On Sun, Jan 27, 2019 at 08:43:15AM +, Bhupathi, Kaushik (CORP) wrote: > 2) Is there anyway to know the historical execution plan details of a > particular SQL ? Per my understanding so far since there is no concept of > shared pool unlike Oracle every execution demands a new hard parse.
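A common answer to the "historical plans" question (not spelled out in this snippet) is the contrib module `auto_explain`, which logs the actual plan of every statement that exceeds a duration threshold; a hedged sketch with illustrative threshold values:

```sql
-- Load per-session for testing; in production set it via
-- shared_preload_libraries = 'auto_explain' in postgresql.conf.
LOAD 'auto_explain';

SET auto_explain.log_min_duration = '250ms';  -- threshold is an example
SET auto_explain.log_analyze = on;            -- include actual row counts
```

Plans then accumulate in the server log, giving a history that can be compared across time, which is the closest PostgreSQL analogue to Oracle's cached-plan inspection.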

Re: Q on SQL Performance tuning

2019-01-27 Thread legrand legrand
a nice view of CPU IO per query - extension pg_stat_sql_plans (alpha) gives all of pg_stat_statements and much more (parsing time, planid, plan text, ...) Regards PAscal -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Q on SQL Performance tuning

2019-01-27 Thread Bhupathi, Kaushik (CORP)
Hi Team, I've a few questions on SQL perf tuning. 1) Is there any SQL monitoring report that's available in Oracle. Highlight of the report is it tells the % of time spent on CPU & IO. And which step took how much % in overall execution. 2) Is there anyway to know the historical

Re: SELECT performance drop

2019-01-24 Thread Jan Nielsen
On Wed, Jan 23, 2019 at 12:37 PM legrand legrand < legrand_legr...@hotmail.com> wrote: > Hi, > is there an index on > fm_order(session_id,type)? > There isn't at the moment: table_name | index_name | column_name

Re: SELECT performance drop

2019-01-23 Thread Laurenz Albe
On Wed, 2019-01-23 at 10:28 -0700, Jan Nielsen wrote: > select > order0_.id as id1_7_, > order0_.created_by as created_2_7_, > order0_.created_date as created_3_7_, > order0_.last_modified_by as last_mod4_7_, > order0_.last_modified_date as last_mod5_7_,

RE:SELECT performance drop

2019-01-23 Thread legrand legrand
Hi, is there an index on fm_order(session_id,type) ? regards PAscal
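The index legrand suggests can be created and verified in two statements; a sketch (the literal values in the test query are illustrative):

```sql
-- Composite index matching the WHERE clause of the slow query:
CREATE INDEX ON fm_order (session_id, type);

-- Confirm the planner picks it up:
EXPLAIN SELECT * FROM fm_order
WHERE session_id = 123 AND type = 'LIMIT';
```

Column order matters: `session_id` first also lets the index serve queries filtering on `session_id` alone.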

Re: SELECT performance drop

2019-01-23 Thread Jan Nielsen
> Jim Finnerty, AWS, Amazon Aurora PostgreSQL > Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Re: SELECT performance drop

2019-01-22 Thread Jan Nielsen
lace | public | fm_order | public | fm_person | public | fm_session | public | fm_trader | I suspect you'd say "not accurate"? :-o After ANALYZE, the performance is much better <https://explain.depesz.com/s/p9KX>. Thank you so much!

Re: Query Performance Issue

2018-12-29 Thread Jeff Janes
On Sat, Dec 29, 2018 at 1:58 AM David Rowley wrote: > On Sat, 29 Dec 2018 at 04:32, Justin Pryzby wrote: > > I think the solution is to upgrade (at least) to PG10 and CREATE > STATISTICS > > (dependencies). > > Unfortunately, I don't think that'll help this situation. Extended > statistics are

Re: Query Performance Issue

2018-12-28 Thread Justin Pryzby
On Thu, Dec 27, 2018 at 10:25:47PM +0300, neslişah demirci wrote: > Have this explain analyze output : > > https://explain.depesz.com/s/Pra8a On Sat, Dec 29, 2018 at 07:58:28PM +1300, David Rowley wrote: > On Sat, 29 Dec 2018 at 04:32, Justin Pryzby wrote:

Re: Query Performance Issue

2018-12-28 Thread Justin Pryzby
On Thu, Dec 27, 2018 at 10:25:47PM +0300, neslişah demirci wrote: > Have this explain analyze output : > > https://explain.depesz.com/s/Pra8a Row counts are being badly underestimated leading to nested loop joins: |Index Scan using

Re: Query Performance Issue

2018-12-28 Thread Alexey Bashtanov
https://explain.depesz.com/s/Pra8a Could you share the query itself please? And the tables definitions including indexes. work_mem : 8MB That's not a lot. The 16-batches hash join may have worked faster if you had resources to increase work_mem.

Query Performance Issue

2018-12-27 Thread neslişah demirci
Hi everyone, Have this explain analyze output: https://explain.depesz.com/s/Pra8a Appreciated for any help. PG version --- PostgreSQL 9.6.11 on

Re: Performance impact of updating target columns with unchanged values ON CONFLICT

2018-11-23 Thread Abi Noda
update statement from performance point of view. The only thing that matters is whether the actual value changed. Surely, this behaviour is limited to B-Tree indexes. * https://dba.stackexchange.com/questions/223231/performance-impact-of-updating-target-columns-with-same-values-on-conflict On Fri, Nov 23, 2018 a

Re: Performance impact of updating target columns with unchanged values ON CONFLICT

2018-11-23 Thread Abi Noda
Thanks Justin. Do you know if Postgres treats an UPDATE that sets the indexed columns set to the same previous values as a change? Or does it only count it as "changed" if the values are different. This is ambiguous to me. *> HOT solves this problem for a restricted but useful special case where

Re: Performance impact of updating target columns with unchanged values ON CONFLICT

2018-11-22 Thread Justin Pryzby
On Thu, Nov 22, 2018 at 01:31:10PM -0800, Abi Noda wrote: > In other words, is Postgres smart enough to not actually write to disk any > columns that haven’t changed value or update indexes based on those columns? You're asking about what's referred to as Heap only tuples:
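Whether updates are actually taking the heap-only-tuple (HOT) path can be checked from the statistics views; a sketch using the table name from this thread:

```sql
-- n_tup_hot_upd counts updates that avoided touching any index;
-- compare it against the total update count.
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'github_repos';
```

If `n_tup_hot_upd` tracks `n_tup_upd` closely, the upserts are not paying the index-maintenance cost even when indexed columns appear in the `SET` list with unchanged values.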

Re: Performance impact of updating target columns with unchanged values ON CONFLICT

2018-11-22 Thread Abi Noda
`github_id` columns, is there any performance difference (or > other issues to be aware of) between the two bulk upsert operations below? > The difference is that in the first query, the `org_id` and `github_id` > columns are included in the UPDATE, whereas in the second query they are > not

Performance impact of updating target columns with unchanged values ON CONFLICT

2018-11-22 Thread Abi Noda
Given a table, `github_repos`, with a multi-column unique index on `org_id` and `github_id` columns, is there any performance difference (or other issues to be aware of) between the two bulk upsert operations below? The difference is that in the first query, the `org_id` and `github_id` columns
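The two upsert variants being compared can be sketched like this (the `name` column and literal values are illustrative; only `org_id` and `github_id` come from the thread):

```sql
-- Variant 1: the unique-index columns are repeated in the UPDATE list
-- even though ON CONFLICT guarantees they are unchanged.
INSERT INTO github_repos (org_id, github_id, name)
VALUES (1, 42, 'repo')
ON CONFLICT (org_id, github_id) DO UPDATE
SET org_id    = EXCLUDED.org_id,
    github_id = EXCLUDED.github_id,
    name      = EXCLUDED.name;

-- Variant 2: only the non-key column is updated.
INSERT INTO github_repos (org_id, github_id, name)
VALUES (1, 42, 'repo')
ON CONFLICT (org_id, github_id) DO UPDATE
SET name = EXCLUDED.name;
```

Both produce the same row contents, since the conflict target ensures the key columns already match.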

Re: NOT IN vs. NOT EXISTS performance

2018-11-09 Thread Lincoln Swaine-Moore
> > > I understand that there is a slight difference in the meaning of the > two > > > expressions, in that NOT IN will produce NULL if there are any NULL > values > > > in the right hand side (in this case there are none, and the queries > should >

Re: NOT IN vs. NOT EXISTS performance

2018-11-09 Thread Merlin Moncure
> > > > I understand that there is a slight difference in the meaning of the two > > expressions, in that NOT IN will produce NULL if there are any NULL values > > in the right hand side (in this case there are none, and the queries should > > return the same COUNT). But if anyth

Re: NOT IN vs. NOT EXISTS performance

2018-11-08 Thread David Rowley
> expressions, in that NOT IN will produce NULL if there are any NULL values > in the right hand side (in this case there are none, and the queries should > return the same COUNT). But if anything, I would expect that to improve > performance of the NOT IN operation, since a single pass th

NOT IN vs. NOT EXISTS performance

2018-11-08 Thread Lincoln Swaine-Moore
Hi all, I've figured out how to solve the performance issues I've been encountering with a particular query, but I'm interested in better understanding the intuition behind why the better query is so much more performant. The query in question involves a NOT IN filter from a CTE: WITH temp
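The rewrite under discussion can be sketched on two hypothetical tables (names are assumptions, not from the thread):

```sql
-- NOT IN: NULL-sensitive, and the planner often cannot turn it
-- into an anti-join, falling back to a per-row subplan.
SELECT count(*)
FROM a
WHERE a.id NOT IN (SELECT b.a_id FROM b);

-- NOT EXISTS: same result when b.a_id is never NULL, but the
-- planner can use a hash or merge anti-join.
SELECT count(*)
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id);
```

The semantic caveat is the one quoted upthread: if `b.a_id` can be NULL, `NOT IN` returns no rows at all, so the two forms are only interchangeable under a NOT NULL guarantee.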

Re: Gained %20 performance after disabling bitmapscan

2018-10-26 Thread Jeff Janes
On Mon, Oct 22, 2018 at 3:20 AM Yavuz Selim Sertoglu < yavuzselim.serto...@medyasoft.com.tr> wrote: > Thanks for the reply Jeff, > > I know 20ms is nothing but it shows me that there is a problem with my > configuration. I want to find it. > This is a dangerous assumption. This is no

Re: Gained %20 performance after disabling bitmapscan

2018-10-22 Thread Yavuz Selim Sertoglu
85.839 ms shared_buffers is 256G effective_cache_size is 768G Database size about 90G From: Jeff Janes Sent: Friday, October 19, 2018 22:40:57 To: Yavuz Selim Sertoglu Cc: pgsql-performance@lists.postgresql.org Subject: Re: Gained %20 performance after disabl

Re: Gained %20 performance after disabling bitmapscan

2018-10-22 Thread Yavuz Selim Sertoglu
Thanks for the reply Vladimir, I thought explain analyze is enough. I run vacuum analyze manually but it didn't work either. From: Vladimir Ryabtsev Sent: Friday, October 19, 2018 21:09:03 To: Yavuz Selim Sertoglu Cc: pgsql-performance

Re: Gained %20 performance after disabling bitmapscan

2018-10-22 Thread Yavuz Selim Sertoglu
of the cost of passing each tuple (row) from worker to master backend. Queries written by developer team, I can only recommend them your suggestion. From: Tom Lane Sent: Friday, October 19, 2018 16:52:04 To: Yavuz Selim Sertoglu Cc: pgsql-p

Re: Gained %20 performance after disabling bitmapscan

2018-10-19 Thread Jeff Janes
On Fri, Oct 19, 2018 at 3:19 AM Yavuz Selim Sertoglu < yavuzselim.serto...@medyasoft.com.tr> wrote: > Hi all, > > I have a problem with my query. Query always using parallel bitmap heap > scan. I've created an index with all where conditions and id but query does > not this index and continue to

Re: Gained %20 performance after disabling bitmapscan

2018-10-19 Thread Vladimir Ryabtsev
Yavuz, cannot add much to other points but as for index-only scan, an (auto)vacuum must be run in order to optimizer understand it can utilize index-only scan. Please check if autovacuum was run on the table after index creation and if no, run it manually. Vlad
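Vladimir's point about the visibility map can be acted on and verified directly; a sketch with an illustrative table name:

```sql
-- Refresh the visibility map so the planner can consider
-- index-only scans:
VACUUM ANALYZE my_table;

-- relallvisible approaching relpages means most heap pages are
-- all-visible and an index-only scan can skip them:
SELECT relallvisible, relpages
FROM pg_class
WHERE relname = 'my_table';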

Re: Gained %20 performance after disabling bitmapscan

2018-10-19 Thread Tom Lane
Yavuz Selim Sertoglu writes: > I have a problem with my query. Query always using parallel bitmap heap scan. Have you messed with the parallel cost parameters? It seems a bit surprising that this query wants to use parallelism at all. > Index Cond: (((mukellef_id)::text =

Re: Gained %20 performance after disabling bitmapscan

2018-10-19 Thread Justin Pryzby
On Fri, Oct 19, 2018 at 07:19:12AM +, Yavuz Selim Sertoglu wrote: > I have a problem with my query. Query always using parallel bitmap heap scan. > I've created an index with all where conditions and id but the query does not > use this index and continues to use a bitmap scan. So I decided to disable

Gained %20 performance after disabling bitmapscan

2018-10-19 Thread Yavuz Selim Sertoglu
Hi all, I have a problem with my query. Query always using parallel bitmap heap scan. I've created an index with all where conditions and id but the query does not use this index and continues to use a bitmap scan. So I decided to disable bitmap scan for testing. And after that, things became strange. Cost
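The experiment described above is done with a session-local planner switch; a sketch (the query shown is a placeholder for the one in the thread):

```sql
-- For diagnosis only: discourage bitmap scans in this session so the
-- planner compares the plain index scan against them.
SET enable_bitmapscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_table WHERE mukellef_id = '123';  -- illustrative

-- Restore the default; enable_* switches are diagnostic tools,
-- not production settings.
RESET enable_bitmapscan;
```

As Jeff Janes warns downthread, a 20 ms win from such a switch is weak evidence of a configuration problem; it mainly tells you the planner's costing of the two scan types is close.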

Re: Why could different data in a table be processed with different performance?

2018-10-10 Thread Vladimir Ryabtsev
FYI, posting an intermediate update on the issue. I disabled index scans to keep existing order, and copied part of the "slow" range into another table (3M rows in 2.2 GB table + 17 GB toast). I was able to reproduce slow readings from this copy. Then I performed CLUSTER of the copy using PK and

Re: Why could different data in a table be processed with different performance?

2018-10-01 Thread Fabio Pardi
On 28/09/18 21:51, Vladimir Ryabtsev wrote: > > That means, if your block size was bigger, then you would have bigger space > > allocated for one single record. > But if I INSERT second, third ... hundredth record in the table, the size > remains 8K. > So my point is that if one decides to

Re: Why could different data in a table be processed with different performance?

2018-09-28 Thread Fabio Pardi
On 28/09/18 11:56, Vladimir Ryabtsev wrote: > > > It could affect space storage, for the smaller blocks. > But at which extent? As I understand it is not something about "alignment" to > block size for rows? Is it only low-level IO thing with datafiles? > Maybe 'for the smaller blocks' was

Re: Why could different data in a table be processed with different performance?

2018-09-28 Thread Vladimir Ryabtsev
> Does your LVM have readahead > ramped up ? Try lvchange -r 65536 data/postgres (or similar). Changed this from 256 to 65536. If it is supposed to take effect immediately (no server reboot or other changes), then I've got no changes in performance. No at all. Vlad

Re: Why could different data in a table be processed with different performance?

2018-09-27 Thread Fabio Pardi
On 09/26/2018 07:15 PM, Vladimir Ryabtsev wrote: >> Since you have a very big toast table, given you are using spinning > disks, I think that increasing the block size will bring benefits. > But will it worsen caching? I will have lesser slots in cache. Also will > it affect required storage

Re: Why could different data in a table be processed with different performance?

2018-09-26 Thread Vladimir Ryabtsev
> Since you have a very big toast table, given you are using spinning disks, I think that increasing the block size will bring benefits. But will it worsen caching? I will have lesser slots in cache. Also will it affect required storage space? >> consecutive runs with SAME parameters do NOT hit

Re: Why could different data in a table be processed with different performance?

2018-09-25 Thread Vladimir Ryabtsev
> 1) Which file system are you using? >From Linux's view it's ext4. Real vmdx file on Hyper-V is stored on NTFS, as far as I know. > 2) What is the segment layout of the LVM PVs and LVs? I am a bit lost with it. Is that what you are asking about? master: # pvs --segments PV VG

Re: Why could different data in a table be processed with different performance?

2018-09-25 Thread Fabio Pardi
On 25/09/18 00:28, Vladimir Ryabtsev wrote: > > > it is not unusual to have 1GB cache or more... and do not forget to drop > > the cache between tests + do a sync > I conducted several long runs of dd, so I am sure that these numbers are > fairly correct. However, what worries me is that I

Re: Why could different data in a table be processed with different performance?

2018-09-25 Thread Gasper Zejn
5) BCC is a collection of tools that might shed a light on what is happening. https://github.com/iovisor/bcc Kind regards, Gasper On 21. 09. 2018 02:07, Vladimir Ryabtsev wrote: > I am experiencing a strange performance problem when accessing JSONB > content by primary key.

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Justin Pryzby
On Mon, Sep 24, 2018 at 05:59:12PM -0700, Vladimir Ryabtsev wrote: > > This seems significant... it means the heap was probably written in > backwards order relative to the IDs, and the OS readahead is ineffective when index scanning across a range of IDs. > But again, why is it different for

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Vladimir Ryabtsev
> If it does an index scan, I think that will badly fail to keep the same order of heap TIDs - it'll be inserting rows in ID order rather than in (I guess) reverse ID order. According to the plan, it's gonna be seq. scan with filter. Vlad

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Vladimir Ryabtsev
> did you try either 1) forcing a bitmap scan (of only one index), to force the heap reads to be ordered, if not sequential? SET enable_indexscan=off (and maybe SET enable_seqscan=off and others as needed). Disabling index scan made it bitmap. It is surprising, but this increased read speed in

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Vladimir Ryabtsev
> This seems significant..it means the heap was probably written in backwards order relative to the IDs, and the OS readahead is ineffective when index scanning across a range of IDs. But again, why is it different for one range and another? It was reversed for both ranges. > I would definitely

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Justin Pryzby
On Mon, Sep 24, 2018 at 03:28:15PM -0700, Vladimir Ryabtsev wrote: > > it is not unusual to have 1GB cache or more... and do not forget to drop > the cache between tests + do a sync > I also reviewed import scripts and found the import was done in DESCENDING > order of IDs. This seems

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Vladimir Ryabtsev
> You can create 2 partial indexes and the planner will pick it up for you. (and the planning time will go a bit up). Created two partial indexes and ensured planner uses it. But the result is still the same, no noticeable difference. > it is not unusual to have 1GB cache or more... and do not
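The two partial indexes Vladimir created can be sketched against the `articles` table from this thread (the range boundary is illustrative; the thread does not state the exact split point):

```sql
-- One index per ID range, so each index stays small and the planner
-- picks whichever matches the queried range:
CREATE INDEX ON articles (article_id) WHERE article_id <  500000000;
CREATE INDEX ON articles (article_id) WHERE article_id >= 500000000;
```

For partial indexes to be chosen, the query's WHERE clause must provably imply the index predicate, e.g. `WHERE article_id BETWEEN 100 AND 200` implies the first predicate above.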

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Fabio Pardi
Hi, answers (and questions) in line here below On 22/09/18 11:19, Vladimir Ryabtsev wrote: > > is the length of the text equally distributed over the 2 partitions? > Not 100% equally, but to me it does not seem to be a big deal... Considering > the ranges independently: > First range: ~70% < 10

Re: Why could different data in a table be processed with different performance?

2018-09-24 Thread Vladimir Ryabtsev
> Another idea is that the operating system rearranges I/O in a way that is not ideal for your storage. > Try a different I/O scheduler by running echo deadline > /sys/block/sda/queue/scheduler My scheduler was already "deadline". In some places I read that in virtual environment sometimes "noop"

Re: Why could different data in a table be processed with different performance?

2018-09-22 Thread didier
Hi, Assuming DB is quiescent. And if you run? select count(*) from articles where article_id between %s and %s ie without reading json, is your buffers hit count increasing? 20 000 8K blocks *2 is 500MB , should be in RAM after the first run. Fast: read=710 I/O Timings: read=852.547 ==> 1.3

Re: Why could different data in a table be processed with different performance?

2018-09-22 Thread Vladimir Ryabtsev
> is the length of the text equally distributed over the 2 partitions? Not 100% equally, but to me it does not seem to be a big deal... Considering the ranges independently: First range: ~70% < 10 KB, ~25% for 10-20 KB, ~3% for 20-30 KB, everything else is less than 1% (with 10 KB steps). Second

Re: Why could different data in a table be processed with different performance?

2018-09-21 Thread Fabio Pardi
On 09/21/2018 08:28 AM, Vladimir Ryabtsev wrote: >> but you say you observe a difference even after dropping the cache. > No, I say I see NO significant difference (accurate to measurement > error) between "with caches" and after dropping caches. And this is > explainable, I think. Since I

Re: Why could different data in a table be processed with different performance?

2018-09-21 Thread Fabio Pardi
Hi Vladimir, On 09/21/2018 02:07 AM, Vladimir Ryabtsev wrote: > > I have such a table: > > CREATE TABLE articles > ( > article_id bigint NOT NULL, > content jsonb NOT NULL, > published_at timestamp without time zone NOT NULL, > appended_at timestamp without time zone NOT

Re: Why could different data in a table be processed with different performance?

2018-09-20 Thread Laurenz Albe
Vladimir Ryabtsev wrote: > explain (analyze, buffers) > select count(*), sum(length(content::text)) from articles where article_id > between %s and %s > > Sample output: > > Aggregate (cost=8635.91..8635.92 rows=1 width=16) (actual > time=6625.993..6625.995 rows=1 loops=1) > Buffers: shared

Re: Why could different data in a table be processed with different performance?

2018-09-20 Thread Justin Pryzby
Sorry, dropped -performance. >>>> Has the table been reindexed (or pg_repack'ed) since loading (or vacuumed >>>> for that matter) ? >>> Not sure what you mean... We created indexes on some fields (on >> I mean REINDEX INDEX articles_pkey; >> Or (fr

Re: Why could different data in a table be processed with different performance?

2018-09-20 Thread Vladimir Ryabtsev
Justin Pryzby: > On Thu, Sep 20, 2018 at 05:07:21PM -0700, Vladimir Ryabtsev wrote: > > I am experiencing a strange performance problem when accessing JSONB > > content by primary key. > > > I noticed that with some IDs it works pretty fast while with other it is > > 4

Re: Why could different data in a table be processed with different performance?

2018-09-20 Thread Justin Pryzby
On Thu, Sep 20, 2018 at 05:07:21PM -0700, Vladimir Ryabtsev wrote: > I am experiencing a strange performance problem when accessing JSONB > content by primary key. > I noticed that with some IDs it works pretty fast while with other it is > 4-5 times slower. It is suitable to note, t

Why could different data in a table be processed with different performance?

2018-09-20 Thread Vladimir Ryabtsev
I am experiencing a strange performance problem when accessing JSONB content by primary key. My DB version() is PostgreSQL 10.3 (Ubuntu 10.3-1.pgdg14.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit postgres.conf: https://justpaste.it/6pzz1 uname

pg_pub_decrypt: 10x performance hit with gpg v2

2018-09-18 Thread Felix A. Kater
Hi, pg_pub_decrypt() is ~10x slower when the priv/pub keys have been generated with gnupg version 2.x instead of version 1.x. What I do is: - Create keys with gpg - Export priv/pub keys - Store keys in binary form in a bytea - Create 32 byte random data and encrypt it with pg_pub_encrypt() -
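The flow Felix describes presumably uses pgcrypto's PGP functions (the official names are `pgp_pub_encrypt`/`pgp_pub_decrypt`; the key-storage table below is a hypothetical illustration):

```sql
-- Keys exported from gpg (ASCII-armored) are converted to bytea
-- with dearmor() before use.
SELECT pgp_pub_encrypt(encode(gen_random_bytes(32), 'hex'),
                       dearmor(pubkey_text)) AS ciphertext
FROM my_keys;          -- hypothetical table holding exported keys

SELECT pgp_pub_decrypt(ciphertext, dearmor(privkey_text))
FROM my_keys;
```

The reported 10x slowdown with gpg v2 keys would then be a property of the key parameters (e.g. cipher/S2K choices) baked into the exported key, not of the SQL calls themselves.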

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-16 Thread padusuma
Hello Tim, >Re-implementation of a solution is often a hard case to sell, but it >might be the only way to get the performance you want. The big positive >to a re-implementation is that you usually get a better solution because >you are implementing with more knowledge and expe

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-15 Thread Tim Cross
padusuma writes: > Hello Tim, > >>How are you gathering metrics to determine if performance has improved >>or not? > I am measuring the response times through timer for the execution of SQL > statements through psqlODBC driver. The response times for INSERT INTO >

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-15 Thread padusuma
Hello Tim, >How are you gathering metrics to determine if performance has improved >or not? I am measuring the response times through timer for the execution of SQL statements through psqlODBC driver. The response times for INSERT INTO temp-table statements have not changed with the para

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-13 Thread Tim Cross
padusuma writes: > Hello Tim, > > I have tried the suggestions provided to the best of my knowledge, but I did > not see any improvement in the INSERT performance for temporary tables. The > Linux host on which PostgreSQL database is installed has 32 GB RAM. > Following are

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-13 Thread padusuma
Hello Tim, I have tried the suggestions provided to the best of my knowledge, but I did not see any improvement in the INSERT performance for temporary tables. The Linux host on which PostgreSQL database is installed has 32 GB RAM. Following are current settings I have in postgresql.conf file

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-10 Thread padusuma
quently. Note that there is a play off here between frequency > of checkpoints and boot time after a crash. Fewer wal checkpoints will > usually improve performance, but recovery time is longer. >- Verify your inserts into temporary tables is the bottleneck and not > the
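The checkpoint trade-off described above is governed by a handful of postgresql.conf settings. The values below are purely illustrative assumptions, not recommendations; they should be tuned against the actual workload and the acceptable crash-recovery window:

```ini
# postgresql.conf -- illustrative values only
max_wal_size = 4GB                  # larger => fewer forced checkpoints between timeouts
checkpoint_timeout = 15min          # longer interval => fewer checkpoints, longer crash recovery
checkpoint_completion_target = 0.9  # spread checkpoint I/O across most of the interval
```

Fewer, more spread-out checkpoints reduce write amplification during bulk inserts, at the cost of replaying more WAL after a crash.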

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-09 Thread Tim Cross
the driver you are using, that will >>give you the largest performance boost. > > The data to be inserted into temporary tables is obtained from one or more > queries run earlier and the data is available as a vector of strings. If I > need to use COPY FROM command, then the

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-09 Thread padusuma
est how to pass data from a client application to "COPY FROM STDIN" statement? Thanks. -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html
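For reference, the payload that COPY ... FROM STDIN expects in text format is tab-separated columns with backslash escapes and \N as the NULL marker. A client holding the data as an in-memory vector of strings can build that payload itself; the sketch below is a minimal illustration (the function names are my own, and the actual transfer still has to go through the driver's copy support, e.g. PQputCopyData in libpq):

```python
def copy_escape(value):
    """Escape one column value for COPY ... FROM STDIN (text format)."""
    if value is None:
        return r"\N"                      # COPY's default NULL marker
    return (value.replace("\\", "\\\\")   # escape backslash first
                 .replace("\t", "\\t")    # tab is the column delimiter
                 .replace("\n", "\\n")    # newline is the row delimiter
                 .replace("\r", "\\r"))

def rows_to_copy_text(rows):
    """Turn a list of row tuples into one COPY text-format payload."""
    return "".join("\t".join(copy_escape(col) for col in row) + "\n"
                   for row in rows)

payload = rows_to_copy_text([("a\tb", None), ("line1\nline2", "ok")])
```

The resulting string can be handed to the driver's COPY-in entry point in one call per batch, instead of one round trip per row.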

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-09 Thread Sergei Kornilov
Hello > The data to be inserted into temporary tables is obtained from one or more > queries run earlier and the data is available as a vector of strings. You can not use "insert into temp_table select /*anything you wish*/" statement? Or even insert .. select ... returning if you need receive

Re: Performance of INSERT into temporary tables using psqlODBC driver

2018-09-07 Thread Tim Cross
Time taken to insert 32512 rows: 16.026 sec > Time taken to insert 32512 rows: 15.821 sec > Time taken to insert 6107 rows: 1.514 sec > > I am looking for suggestions to improve the performance of these INSERT > statements into temporary tables. Database is located on a Linux VM and th

Performance of INSERT into temporary tables using psqlODBC driver

2018-09-07 Thread padusuma
sec Time taken to insert 6107 rows: 1.514 sec I am looking for suggestions to improve the performance of these INSERT statements into temporary tables. Database is located on a Linux VM and the version is "PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4
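When COPY is not an option, batching many rows into a single multi-VALUES INSERT usually cuts round trips sharply compared with one INSERT statement per row, which is a common cause of timings like those above. A minimal sketch of building such a statement (the function name is my own invention; binding the flattened parameter list is left to whatever driver is in use):

```python
def build_multirow_insert(table, columns, nrows):
    """Build a parameterized multi-row INSERT with %s placeholders."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return ("INSERT INTO {} ({}) VALUES ".format(table, ", ".join(columns))
            + ", ".join([row_placeholder] * nrows))

sql = build_multirow_insert("temp_t", ["a", "b"], 3)
# execute sql once, binding a flat list of 3 * 2 parameter values
```

One statement carrying hundreds of rows amortizes the parse/plan and network overhead that dominates row-at-a-time inserts into temporary tables.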
