Re: [PERFORM] performance problem on big tables
2017-08-15 18:13 GMT+02:00 Jeff Janes:

> On Tue, Aug 15, 2017 at 3:06 AM, Mariel Cherkassky <mariel.cherkas...@gmail.com> wrote:
>
>> Hi,
>> So I run the checks that Jeff mentioned:
>> \copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour and 35 minutes
>> \copy local_postresql_table from /tmp/tmp with binary - Didn't run because the remote oracle database is currently under maintenance work.
>
> The "\copy...from" doesn't depend on Oracle; it depends only on the local file system (/tmp/tmp), provided that the "\copy...to" finished. Anyway, given the length of time it took, I think you can conclude the bottleneck is in oracle_fdw itself, or in Oracle, or the network.

Dumping from Oracle is not fast - I have seen it in both oracle_fdw and ora2pg cases.

Regards

Pavel

> Cheers,
>
> Jeff
Re: [PERFORM] performance problem on big tables
On Tue, Aug 15, 2017 at 4:06 AM, Mariel Cherkassky wrote:

> Hi,
> So I run the checks that Jeff mentioned:
> \copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour and 35 minutes

So 26G takes 95 minutes, or about 280 MB/minute, roughly 4.7 MB/second? Sounds about right (it's early, I haven't had enough coffee, please check my math). That's pretty slow unless you're working across pretty big distances with mediocre connections. My home internet downloads at about 100MB/s by comparison.

> \copy local_postresql_table from /tmp/tmp with binary - Didn't run because
> the remote oracle database is currently under maintenance work.

You shouldn't need the remote Oracle server if you've already copied the data over; you're just copying from local disk into the local pgsql db. Unless I'm missing something.

> So I decided to follow MichaelDBA's tips and I set the RAM on my machine to
> 16G and I configured effective_cache_size to 14G, shared_buffers to be
> 2G and maintenance_work_mem to 4G.

Good settings. Maybe set work_mem to 128MB or so while you're at it.

> I started running the copy checks again and for now it copied 5G in 10
> minutes. I have some questions:
> 1) When I run insert into local_postresql_table select * from
> remote_oracle_table, is the data inserted as bulk into the local table or row by
> row? If the answer is as bulk, then why is copy a better option for this case?

insert into ... select from the Oracle remote table is one big copy, but it will take at least as long as copying from Oracle to the local network took. Compare that to the same thing but using file_fdw on the file locally.

> 2) Should the copy from the dump into the postgresql database take less time
> than the copy to the dump?

Yes. The copy from Oracle to your local drive is painfully slow for a modern network connection.

> 3) What do you think about the new memory parameters that I configured?

They should be OK. I'm more worried about the performance of the io subsystem tbh.
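A quick back-of-the-envelope check of the throughput above (assuming the figures from the thread: a 26 GiB dump taking 1 hour 35 minutes):

```python
# Back-of-the-envelope throughput for the \copy ... to dump step.
size_mib = 26 * 1024        # 26 GiB expressed in MiB
minutes = 95                # 1 hour 35 minutes

mib_per_minute = size_mib / minutes
mib_per_second = mib_per_minute / 60

print(f"{mib_per_minute:.0f} MiB/minute, {mib_per_second:.1f} MiB/second")
# prints: 280 MiB/minute, 4.7 MiB/second
```

At under 5 MiB/second, the bottleneck is clearly not a modern local disk or LAN, which supports the conclusion that oracle_fdw, Oracle itself, or the WAN link is the limiting factor.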
-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
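The file_fdw comparison suggested above could be sketched roughly like this (hypothetical names; file_fdw ships in contrib, and this assumes it accepts COPY's binary format for a dump written with `\copy ... WITH BINARY`):

```sql
-- file_fdw exposes the dumped file as a foreign table,
-- so the load becomes a plain local INSERT ... SELECT.
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER local_files FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE oracle_dump (
    id   int,      -- hypothetical columns: must match the dump exactly
    data text
) SERVER local_files
  OPTIONS (filename '/tmp/tmp', format 'binary');

INSERT INTO local_postresql_table SELECT * FROM oracle_dump;
```

Timing this against the oracle_fdw-based `insert ... select` separates local load cost from the Oracle/network cost.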
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
On Mon, Aug 14, 2017 at 5:10 PM, Jerry Sievers wrote:

> Peter Geoghegan writes:
>
>> On Mon, Aug 14, 2017 at 12:53 PM, Jeremy Finzel wrote:
>>
>>> This particular db is on 9.3.15. Recently we had a serious performance
>>> degradation related to a batch job that creates 4-5 temp tables and 5
>>> indexes. It is a really badly written job, but what really confuses us is
>>> that this job has been running for years with no issue remotely approaching
>>> this one. We are also using pgpool.
>>
>> Did you happen to notice whether this occurred when you upgraded a point
>> release? If so, what version did you move from/to?
>
> The system was last started back in November. Running 9.3.15.
>
> Not aware of any host system libs or whatever changing recently, but will
> investigate.

So do iostat or iotop show you if / where your disks are working hardest? Or is this CPU overhead that's killing performance?
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
Oh yeah, sorry. Was looking at a different system where we were using a tablespace for temp tables.

On Tue, Aug 15, 2017 at 10:00 AM, Jeremy Finzel wrote:

>>> Not so.
>>>
>>> This system has no defined temp_tablespace, however spillage due to
>>> sorting/hashing that exceeds work_mem goes to base/pgsql_tmp, which we
>>> have symlinked out to a local SSD drive.
>>
>> Which is also where temp tables are created.
>
> This isn't true, at least in our environment. Just as proof, I have created
> a couple of temp tables, and querying the relfilenodes, they only show up
> under base//t4_:
>
> test=# CREATE TEMP TABLE foo(id int);
> CREATE TABLE
> test=# INSERT INTO foo SELECT * FROM generate_series(1,100);
> INSERT 0 100
> test=# CREATE TEMP TABLE bar();
> CREATE TABLE
> test=# SELECT relfilenode FROM pg_class WHERE relname IN ('foo','bar');
>  relfilenode
> -------------
>        20941
>        20944
> (2 rows)
>
> postgres@foo:/san//pgdata/base$ ls -l
> total 44
> drwx------ 2 postgres postgres  4096 Jul  7 15:19 1
> drwx------ 2 postgres postgres  4096 Nov 29  2016 12408
> drwx------ 2 postgres postgres  4096 Jul 14 14:00 12409
> drwx------ 2 postgres postgres 12288 Jul  7 15:19 18289
> drwx------ 2 postgres postgres 12288 Jul  7 15:19 18803
> drwx------ 2 postgres postgres  4096 Jul  7 15:19 20613
> drwx------ 2 postgres postgres  4096 Aug 15 08:06 20886
> lrwxrwxrwx 1 postgres postgres    30 Jul  7 15:15 pgsql_tmp -> /local/pgsql_tmp/9.6/
>
> postgres@pgsnap05:/san//pgdata/base$ ls -l 20886 | grep '20941\|20944'
> -rw------- 1 postgres postgres 8192 Aug 15 10:55 t4_20941
> -rw------- 1 postgres postgres    0 Aug 15 10:55 t4_20944
> postgres@pgsnap05:/san/dba_dev_d/pgdata/base$ cd pgsql_tmp
> postgres@pgsnap05:/san/dba_dev_d/pgdata/base/pgsql_tmp$ ls -l
> total 0

--
To understand recursion, one must first understand recursion.
Re: [PERFORM] performance problem on big tables
On Tue, Aug 15, 2017 at 3:06 AM, Mariel Cherkassky <mariel.cherkas...@gmail.com> wrote:

> Hi,
> So I run the checks that Jeff mentioned:
> \copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour
> and 35 minutes
> \copy local_postresql_table from /tmp/tmp with binary - Didn't run because
> the remote oracle database is currently under maintenance work.

The "\copy...from" doesn't depend on Oracle; it depends only on the local file system (/tmp/tmp), provided that the "\copy...to" finished. Anyway, given the length of time it took, I think you can conclude the bottleneck is in oracle_fdw itself, or in Oracle, or the network.

Cheers,

Jeff
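The two-step check described above can be timed separately from psql (a sketch using the thread's table names; `\timing` reports each command's elapsed time):

```sql
-- Time each half separately to isolate the bottleneck.
\timing on

-- Step 1: pull through oracle_fdw into a local file
-- (measures Oracle + network + fdw cost).
\copy (SELECT * FROM oracle_remote_table) TO '/tmp/tmp' WITH BINARY

-- Step 2: load the file into the local table
-- (measures local disk + PostgreSQL insert cost only).
\copy local_postresql_table FROM '/tmp/tmp' WITH BINARY
```

If step 1 dominates, the problem is on the Oracle/network side and no amount of PostgreSQL tuning will help much.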
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
> > > Not so.
> > >
> > > This system has no defined temp_tablespace, however spillage due to
> > > sorting/hashing that exceeds work_mem goes to base/pgsql_tmp, which we
> > > have symlinked out to a local SSD drive.
> >
> > Which is also where temp tables are created.

This isn't true, at least in our environment. Just as proof, I have created
a couple of temp tables, and querying the relfilenodes, they only show up
under base//t4_:

test=# CREATE TEMP TABLE foo(id int);
CREATE TABLE
test=# INSERT INTO foo SELECT * FROM generate_series(1,100);
INSERT 0 100
test=# CREATE TEMP TABLE bar();
CREATE TABLE
test=# SELECT relfilenode FROM pg_class WHERE relname IN ('foo','bar');
 relfilenode
-------------
       20941
       20944
(2 rows)

postgres@foo:/san//pgdata/base$ ls -l
total 44
drwx------ 2 postgres postgres  4096 Jul  7 15:19 1
drwx------ 2 postgres postgres  4096 Nov 29  2016 12408
drwx------ 2 postgres postgres  4096 Jul 14 14:00 12409
drwx------ 2 postgres postgres 12288 Jul  7 15:19 18289
drwx------ 2 postgres postgres 12288 Jul  7 15:19 18803
drwx------ 2 postgres postgres  4096 Jul  7 15:19 20613
drwx------ 2 postgres postgres  4096 Aug 15 08:06 20886
lrwxrwxrwx 1 postgres postgres    30 Jul  7 15:15 pgsql_tmp -> /local/pgsql_tmp/9.6/

postgres@pgsnap05:/san//pgdata/base$ ls -l 20886 | grep '20941\|20944'
-rw------- 1 postgres postgres 8192 Aug 15 10:55 t4_20941
-rw------- 1 postgres postgres    0 Aug 15 10:55 t4_20944
postgres@pgsnap05:/san/dba_dev_d/pgdata/base$ cd pgsql_tmp
postgres@pgsnap05:/san/dba_dev_d/pgdata/base/pgsql_tmp$ ls -l
total 0
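A shorter way to confirm where a relation's storage lives is the built-in `pg_relation_filepath()` function, which returns the path relative to the data directory (a sketch; output will vary by database OID and relfilenode):

```sql
-- For a temp table in the default tablespace this returns something like
-- base/<dboid>/t4_<relfilenode>, i.e. under base/, not under pgsql_tmp.
CREATE TEMP TABLE foo(id int);
SELECT pg_relation_filepath('foo');
```

This avoids cross-referencing pg_class against `ls` output by hand.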
Re: [PERFORM] Odd sudden performance degradation related to temp object churn
On Mon, Aug 14, 2017 at 4:16 PM, Jerry Sievers wrote:

> Scott Marlowe writes:
>
>> You do know that temp tables go into the default temp table space,
>> just like sorts, right?
>
> Not so.
>
> This system has no defined temp_tablespace, however spillage due to
> sorting/hashing that exceeds work_mem goes to base/pgsql_tmp, which we
> have symlinked out to a local SSD drive.

Which is also where temp tables are created.
Re: [PERFORM] performance problem on big tables
Hi,
So I run the checks that Jeff mentioned:
\copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour and 35 minutes
\copy local_postresql_table from /tmp/tmp with binary - Didn't run because the remote oracle database is currently under maintenance work.

So I decided to follow MichaelDBA's tips and I set the RAM on my machine to 16G and I configured effective_cache_size to 14G, shared_buffers to be 2G and maintenance_work_mem to 4G. I started running the copy checks again and for now it copied 5G in 10 minutes. I have some questions:
1) When I run insert into local_postresql_table select * from remote_oracle_table, is the data inserted as bulk into the local table or row by row? If the answer is as bulk, then why is copy a better option for this case?
2) Should the copy from the dump into the postgresql database take less time than the copy to the dump?
3) What do you think about the new memory parameters that I configured?

2017-08-14 16:24 GMT+03:00 Mariel Cherkassky:

> I have performance issues with two big tables. Those tables are located on
> an oracle remote database. I'm running the query: insert into
> local_postgresql_table select * from oracle_remote_table.
>
> The first table has 45M records and its size is 23G. The import of the
> data from the oracle remote database takes 1 hour and 38 minutes. After
> that I create 13 regular indexes on the table and it takes 10 minutes per
> index -> 2 hours and 10 minutes in total.
>
> The second table has 29M records and its size is 26G. The import of the
> data from the oracle remote database takes 2 hours and 30 minutes. The
> creation of the indexes takes 1 hour and 30 minutes (some are indexes on
> one column and the creation takes 5 min, and some are indexes on multiple
> columns and it takes 11 min).
>
> Those operations are very problematic for me and I'm searching for a
> solution to improve the performance.
>
> The parameters I assigned:
>
> min_parallel_relation_size = 200MB
> max_parallel_workers_per_gather = 5
> max_worker_processes = 8
> effective_cache_size = 2500MB
> work_mem = 16MB
> maintenance_work_mem = 1500MB
> shared_buffers = 2000MB
> RAM : 5G
> CPU CORES : 8
>
> - I tried running select count(*) from the table in oracle and in postgresql;
> the running time is almost equal.
>
> - Before importing the data I drop the indexes and the constraints.
>
> - I tried to copy a 23G file from the oracle server to the postgresql
> server and it took 12 minutes.
>
> Please advise how I can continue. How can I improve something in this
> operation?
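The load-then-index workflow discussed in this thread could be sketched like this (hypothetical index and column names; the real definitions are the 13 indexes on the actual table):

```sql
-- Load first, index afterwards: building an index once over the full
-- table is much cheaper than maintaining it row by row during the load.
DROP INDEX IF EXISTS local_table_col_idx;

INSERT INTO local_postgresql_table
SELECT * FROM oracle_remote_table;   -- foreign table via oracle_fdw

-- Raise maintenance_work_mem for this session so the index build's
-- sort has more room; the thread suggests values up to 4GB.
SET maintenance_work_mem = '4GB';
CREATE INDEX local_table_col_idx ON local_postgresql_table (some_column);
```

Since the original poster already drops indexes and constraints before importing, the remaining wins are likely in maintenance_work_mem for the rebuild step and in the Oracle-side transfer itself.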