Re: [PERFORM] performance problem on big tables

2017-08-17 Thread Mariel Cherkassky
I checked with the storage team in the company and they saw that I have
a lot of I/O on the server. How should I reduce the I/O that PostgreSQL
uses?
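
Before tuning anything, one way to see where PostgreSQL's writes come from
is the bgwriter statistics view (a read-only check; the view and columns
below exist in the 9.x releases current at the time):

    -- buffers written at checkpoints, by the background writer, and by backends
    SELECT checkpoints_timed, checkpoints_req,
           buffers_checkpoint, buffers_clean, buffers_backend
    FROM pg_stat_bgwriter;

If buffers_backend dwarfs buffers_checkpoint, backends are doing their own
writes, which points at shared_buffers/checkpoint tuning rather than raw
disk speed.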

2017-08-17 9:25 GMT+03:00 Mariel Cherkassky :

> Hi Daniel,
> I already tried to set the destination table to unlogged - it improved the
> performance slightly. Is there a way to make sure that I/O is the problem?
>
> 2017-08-17 0:46 GMT+03:00 Daniel Blanch Bataller <
> daniel.blanch.batal...@gmail.com>:
>
>> It seems your disks are too slow. On my laptop (nothing special, just one
>> disk), using COPY I can dump 3G in ~20 secs, loading takes 120 secs, and a
>> bare file copy of 3G takes 10 secs.
>>
>> Similar proportions to what you had, but much faster.
>>
>> Confirm that I/O is your bottleneck, and tell us how you solved your problem.
>>
>> Anyway, you can cut import time in half if you set your destination table
>> to unlogged (Postgres will write half the data, since it saves the
>> transaction log writes). Remember to set it back to logged when finished!
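
A minimal sketch of the unlogged-load pattern described above, assuming a
destination table named local_table (ALTER TABLE ... SET UNLOGGED needs
PostgreSQL 9.5 or later; an unlogged table is truncated after a crash, so
only use it for reloadable data):

    ALTER TABLE local_table SET UNLOGGED;   -- skip WAL during the bulk load
    \copy local_table from /tmp/tmp with binary
    ALTER TABLE local_table SET LOGGED;     -- re-enable WAL when finished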
>>
>>
>> Regards,
>>
>> Daniel
>>
>> On 16 Aug 2017, at 16:32, Mariel Cherkassky <
>> mariel.cherkas...@gmail.com> wrote:
>>
>> My server is virtual and it has a virtual HD from a VNX storage machine.
>> The logs and the data are on the same disk.
>>
>> 2017-08-16 17:04 GMT+03:00 Daniel Blanch Bataller <
>> daniel.blanch.batal...@gmail.com>:
>>
>>> Considering it has to write logs and data at checkpoints, I don't see it as
>>> particularly slow compared to the extract phase. What kind of disks do you
>>> have, SSD or regular disks? Different disks for transaction logs and data?
>>>
>>>
>>> On 16 Aug 2017, at 15:54, Mariel Cherkassky <
>>> mariel.cherkas...@gmail.com> wrote:
>>>
>>> I ran the copy command via psql to create a local dump of a 3G table, and
>>> it took 134059.732 ms ≈ 2 minutes. After that I imported the data via copy
>>> and it took 458648.677 ms ≈ 7.6 minutes. So the copy command works, but
>>> pretty slowly.
>>>
>>> 2017-08-16 16:08 GMT+03:00 Daniel Blanch Bataller <
>>> daniel.blanch.batal...@gmail.com>:
>>>
 See if the copy command is actually working; copy should be very fast
 from your local disk.
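
A sketch of how that check can be timed from psql (\timing is built in; the
table name and file path here are hypothetical):

    \timing on
    \copy big_table to /tmp/big_table.bin with binary           -- dump phase
    create table big_table_copy (like big_table);
    \copy big_table_copy from /tmp/big_table.bin with binary    -- load phase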


 On 16 Aug 2017, at 14:26, Mariel Cherkassky <
 mariel.cherkas...@gmail.com> wrote:


 After all the changes to the memory parameters, the same
 operation (without the copy utility) didn't run much faster - it took one
 minute less. I made a test with the copy command (without the 'with
 binary') and it took 1.5 hours to create the dump file on my local
 postgresql server. Then I tried to run the copy from the local dump, and it
 has already been running for two hours without finishing. I looked at the
 server log and saw that I ran the copy command at 13:18:05; 3 minutes
 later a checkpoint started and completed, and there are no messages in the
 log after that. What can I do? Improving the memory parameters and the
 memory on the server didn't help, and for now the copy command doesn't
 help either.
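
A quick way to confirm whether a long-running COPY is still alive is to look
at it from a second session (a sketch; these pg_stat_activity columns assume
PostgreSQL 9.2 or later):

    SELECT pid, state, now() - query_start AS runtime, query
    FROM pg_stat_activity
    WHERE query ILIKE 'copy%';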




 2017-08-15 20:14 GMT+03:00 Scott Marlowe :

> On Tue, Aug 15, 2017 at 4:06 AM, Mariel Cherkassky wrote:
> > Hi,
> > So I ran the checks that Jeff mentioned:
> > \copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour
> > and 35 minutes
>
> So 26G takes 95 minutes - about 280 MB/minute, or roughly 4.7 MB/second
> (it's early, I haven't had enough coffee, so please check my math).
> That's pretty slow unless you're working across pretty big distances
> with mediocre connections.  My home internet downloads at about 100MB/s
> by comparison.
>
> > \copy local_postresql_table from /tmp/tmp with binary - Didn't run because
> > the remote oracle database is currently under maintenance work.
>
> You shouldn't need the remote oracle server if you've already copied
> it over; you're just copying from local disk into the local pgsql db.
> Unless I'm missing something.
>
> > So I decided to follow MichaelDBA's tips: I set the RAM on my machine to
> > 16G and configured effective_cache_size to 14G, shared_buffers to 2G and
> > maintenance_work_mem to 4G.
>
> Good settings. Maybe set work_mem to 128MB or so while you're at it.
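
For reference, a sketch of what those settings look like applied with ALTER
SYSTEM (available in PostgreSQL 9.4+; shared_buffers only takes effect after
a server restart, the rest after a reload):

    ALTER SYSTEM SET effective_cache_size = '14GB';
    ALTER SYSTEM SET shared_buffers = '2GB';        -- needs a restart
    ALTER SYSTEM SET maintenance_work_mem = '4GB';
    ALTER SYSTEM SET work_mem = '128MB';
    SELECT pg_reload_conf();                        -- applies the reloadable ones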
>
> > I started running the copy checks again, and so far it has copied 5G in 10
> > minutes. I have some questions:
> > 1) When I run insert into local_postresql_table select * from
> > remote_oracle_table, is the data inserted into the local table in bulk or
> > row by row? If the answer is in bulk, then why is copy a better option in
> > this case?
>
> insert into ... select from the remote oracle table is one big copy, but it
> will take at least as long as the copy from oracle across the network took.
> Compare that to the same thing but using file_fdw on the file locally.
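
A sketch of the file_fdw route (names are hypothetical, the column list must
match the dumped file, and it assumes a csv dump rather than the binary one):

    CREATE EXTENSION file_fdw;
    CREATE SERVER import_server FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE oracle_staging (
        id   integer,
        data text
    ) SERVER import_server
      OPTIONS (filename '/tmp/tmp.csv', format 'csv');
    INSERT INTO local_postresql_table SELECT * FROM oracle_staging;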
>
> > 2)The copy from dump into the p

Re: [PERFORM] performance problem on big tables

2017-08-17 Thread Daniel Blanch Bataller
I would just check how long it takes to copy 3GB using a standard file copy
command. On my computer it took 10 secs.



Re: [PERFORM] performance problem on big tables

2017-08-17 Thread Claudio Freire
On Thu, Aug 17, 2017 at 6:00 AM, Mariel Cherkassky wrote:
> I checked with the storage team in the company and they saw that I have a lot
> of I/O on the server. How should I reduce the I/O that PostgreSQL uses?

Do you have concurrent activity on that server?

What filesystem are you using wherever the data is sitting?

If you've got concurrent fsyncs happening, some filesystems handle
that poorly. When you've got WAL and data mixed on a single disk, or
worse, a single filesystem, the filesystem often won't handle the
write barriers for the WAL efficiently. I/O gets intermingled with
bulk operations, and even small fsyncs will have to flush writes from
bulk operations, which makes a mess of things.

It is a very good idea, and in fact a recommended practice, to put WAL
on its own disk, mainly for that reason.

With that little RAM, you'll also probably cause a lot of I/O in temp
files, so I'd also recommend setting aside another disk for a temp
tablespace so that I/O doesn't block other transactions as well.
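
A sketch of that dedicated temp tablespace (the mount point is hypothetical
and must already exist, empty and owned by the postgres OS user):

    CREATE TABLESPACE temp_space LOCATION '/mnt/tempdisk/pg_temp';
    ALTER SYSTEM SET temp_tablespaces = 'temp_space';
    SELECT pg_reload_conf();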

This is all assuming you've got concurrent activity on the server. If
not, install iotop and try to see who's causing that much I/O.

