Performance problems with Postgres JDBC 42.4.2
Dear all, I'm running a query from Java on a postgres database: Java version: 17 JDBC version: 42.4.2 Postgres version: 13.1 In parallel I'm testing the same queries from pgAdmin 4 version 6.13 The tables I'm using contains more than 10million rows each and I have two questions here: 1. I need to extract the path of a file without the file itself. For this I use two alternatives as I found that sentence "A" is much faster than the "B" one: "A" sentence: SELECT DISTINCT ( LEFT(opf.file_path, length(opf.file_path) - position('/' in reverse(opf.file_path))) ) AS path FROM product AS op JOIN product_file AS opf ON opf.product_id = op.id WHERE op.proprietary_end_date <= CURRENT_DATE AND op.id LIKE 'urn:esa:psa:%' "B" sentence: SELECT DISTINCT ( regexp_replace(opf.file_path, '(.*)\/(.*)$', '\1') ) AS path FROM product AS op JOIN product_file AS opf ON opf.product_id = op.id WHERE op.proprietary_end_date <= CURRENT_DATE AND op.id LIKE 'urn:esa:psa:%' 2. Running sentence "A" on the pgAdmin client takes 4-5 minutes to finish but running it from a Java program it never ends. This is still the case when I limit the output to the first 100 rows so I assume this is not a problem with the amount of data being transferred but the way postgres resolve the query. To make it work in Java I had to define a postgres function that I call from the Java code instead of running the query directly. I had a similar problem in the past with a query that performed very poorly from a Java client while it was fine from pgAdmin or a python script. In that case it was a matter of column types not compatible with the JDBC (citext) deriving in an implicit cast that prevented the postgres engine from using a given index or to cast all the values of that column before using it, not sure now. But I don't think this is not the case here. Could anyone help me again? Many thanks in advance Jose
Re: Select on partitioned table is very slow
You are right but the correct query is the one in the attached file. What we want to do here is to force psql to send the "wrong data types" to postgres and, as a result of this, get a bad plan. Cheers, Jose Osinde On Thu, Aug 25, 2022 at 3:48 PM Ranier Vilela wrote: > Em qui., 25 de ago. de 2022 às 10:43, Jose Osinde > escreveu: > >> >> Dear Depesz, Laurenz, >> >> Thanks very much for the fast responses. They are actually correct and >> saved me a lot of time. I couldn't test the cast from the Java test but >> this is something I can deal with later on (most probably updating the >> column types to text in the database side instead). But what I could do was >> reproduce the same problem in the psql console using the cast in the other >> way. This sentence: >> >> explain analyze select logical_identifier, version_id, lastproduct >> FROM test_product_ui_partition.product_ui pui >> WHERE >> pui.mission_id='urn:esa:psa:context:investigation:mission.em16'::citext >> AND >> pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::citext; >> > The query in explain.txt attached, it seems not the same. > > explain analyze select logical_identifier, version_id, lastproduct >FROM test_product_ui_partition.product_ui pui > WHERE > pui.mission_id='urn:esa:psa:context:investigation:mission.em16'::text > AND > pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text; > > ::text? > > regards, > > Ranier Vilela > >
Re: Select on partitioned table is very slow
Dear Depesz, Laurenz, Thanks very much for the fast responses. They are actually correct and saved me a lot of time. I couldn't test the cast from the Java test but this is something I can deal with later on (most probably updating the column types to text in the database side instead). But what I could do was reproduce the same problem in the psql console using the cast in the other way. This sentence: explain analyze select logical_identifier, version_id, lastproduct FROM test_product_ui_partition.product_ui pui WHERE pui.mission_id='urn:esa:psa:context:investigation:mission.em16'::citext AND pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::citext; Creates an output equivalent to that returned from the JAVA application and reproduces the exact same problems: Scans all the partitions instead of select the right one and uses sec scans for all the cases. Attached the result. Again, many thanks for your help, Jose Osinde explain analyze select logical_identifier, version_id, lastproduct FROM test_product_ui_partition.product_ui pui WHERE pui.mission_id='urn:esa:psa:context:investigation:mission.em16'::text AND pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text; Gather (cost=1000.00..5617399.10 rows=19 width=82) (actual time=10485.082..10497.268 rows=1 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Append (cost=0.00..5616397.20 rows=15 width=82) (actual time=9221.542..10479.134 rows=0 loops=3) -> Parallel Seq Scan on product_ui_em16 pui_10 (cost=0.00..2603849.81 rows=3 width=112) (actual time=3883.905..5141.463 rows=0 loops=3) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 5337106 -> Parallel Seq Scan on product_ui_rosetta pui_6 (cost=0.00..2382752.79 rows=1 width=57) (actual time=6213.704..6213.704 rows=0 loops=2) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 5878988 -> Parallel Seq Scan on product_ui_mex pui_7 (cost=0.00..354434.31 rows=1 width=56) (actual time=1417.182..1417.186 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1085822 -> Parallel Seq Scan on product_ui_smart1 pui_8 (cost=0.00..148741.56 rows=1 width=63) (actual time=1125.713..1125.713 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1187167 -> Parallel Seq Scan on product_ui_vex pui_13 (cost=0.00..112914.84 rows=1 width=56) (actual time=919.386..919.386 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1133382 -> Parallel Seq Scan on product_ui_bc pui_9 (cost=0.00..8890.52 rows=1 width=83) (actual time=78.910..78.910 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 67763 -> Parallel Seq Scan on product_ui_ch1 pui_2 (cost=0.00..2224.51 rows=1 width=59) (actual time=20.162..20.162 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 20345 -> Parallel Seq Scan on product_ui_huygens pui_1 (cost=0.00..2090.82 rows=1 width=70) (actual time=19.724..19.727 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 19700
Select on partitioned table is very slow
actual time=1280.037..1280.037 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1085822 -> Parallel Seq Scan on product_ui_smart1 pui_8 (cost=0.00..148741.56 rows=1 width=63) (actual time=1045.532..1045.533 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1187167 -> Parallel Seq Scan on product_ui_vex pui_13 (cost=0.00..112914.84 rows=1 width=56) (actual time=968.542..968.542 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 1133382 -> Parallel Seq Scan on product_ui_bc pui_9 (cost=0.00..8890.52 rows=1 width=83) (actual time=88.747..88.747 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 67763 -> Parallel Seq Scan on product_ui_ch1 pui_2 (cost=0.00..2224.51 rows=1 width=59) (actual time=23.304..23.304 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 20345 -> Parallel Seq Scan on product_ui_huygens pui_1 (cost=0.00..2090.82 rows=1 width=70) (actual time=21.786..21.786 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 19700 -> Parallel Seq Scan on product_ui_ground_based pui_3 (cost=0.00..260.74 rows=1 width=75) (actual time=2.615..2.616 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 2237 -> Parallel Seq Scan on product_ui_giotto pui_4 (cost=0.00..209.79 rows=1 width=47) (actual time=2.188..2.189 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 2129 -> Parallel Seq Scan on product_ui_juice pui_12 (cost=0.00..13.88 rows=1 width=67) (actual time=0.132..0.132 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 100 -> Parallel Seq Scan on product_ui_emrsp pui_11 (cost=0.00..10.26 rows=1 width=65) (actual time=0.000..0.000 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) -> Parallel Seq Scan on product_ui_hubble pui_5 (cost=0.00..3.29 rows=1 width=62) (actual time=0.047..0.047 rows=0 loops=1) Filter: (((mission_id)::text = 'urn:esa:psa:context:investigation:mission.em16'::text) AND ((logical_identifier)::text = 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::text)) Rows Removed by Filter: 33 Planning Time: 16.874 ms Execution Time: .496 ms Elapsed: 10122 ms Source code of the test is attached in the mail. It provides some other test cases as: - Query other non partitioned table, with correct performance - Query the correct partitioned table according to the partition key column (mission_id). In this case only that table is scanned, but again, using PARALLEL SEQ SCAN and not the INDEX SCAN Could you please provide any hint on the possible reasons of this behavior and the performance degradation that is affecting only the JAVA client. Best regards, Jose Osinde JdbcTest.java Description: Binary data