Re: [External] Multiple COPY on the same table

2018-08-20 Thread Ravi Krishna
1. The tables have no indexes at the time of load.
2. The create table and copy are in the same transaction.

So I guess that's pretty much it.  I understand the long time it takes, as some
of the tables have 400+ million rows. Also, the env is a container, and since this
is currently a POC system, not much time has been invested in fine-tuning it.
Thanks, all.
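
As a rough illustration of point 2, the sketch below only builds the SQL text for a create-and-load transaction. The table name, column list, and file path are hypothetical. The PostgreSQL documentation notes that COPY into a table created in the same transaction can skip WAL writes, but only when wal_level is minimal, so any gain here depends on a server setting this thread does not mention.

```python
def build_load_script(table, columns_ddl, csv_path):
    """Return SQL that creates a table and loads it in one transaction.

    All names passed in are hypothetical placeholders. COPY ... FROM 'file'
    is read by the server process, so the path must exist on the database host.
    """
    # Double any single quotes so the path is a valid SQL string literal.
    quoted_path = csv_path.replace("'", "''")
    return "\n".join([
        "BEGIN;",
        f"CREATE TABLE {table} ({columns_ddl});",
        f"COPY {table} FROM '{quoted_path}' WITH (FORMAT csv);",
        "COMMIT;",
    ])

script = build_load_script("vendor_orders", "id bigint, amount numeric", "/data/orders.csv")
print(script)
```

The generated script can be fed to psql as a single file, which keeps the CREATE TABLE and the COPY inside one transaction.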

Re: [External] Multiple COPY on the same table

2018-08-20 Thread Ron

Maybe he just has a large file that needs to be loaded into a table...

On 08/20/2018 11:47 AM, Vijaykumar Jain wrote:

Hey Ravi,

What is the goal you are trying to achieve here?
To make pg_dump/restore faster?
To make replication faster?
To make backups faster?

Also, no matter how small you split the files, if the network is your
bottleneck then I am not sure you can attain n times the benefit by simply
sending the files in parallel, though there may be some benefit.
For parallel processing you also need to ensure your server has the
relevant resources, or else it will just be a lot of context switching,
I guess.

pg_dump has an option to dump in parallel (-j, with the directory format).
I read that pg_basebackup is single-threaded, but pgBackRest allows better
parallel processing in backups.
There is also logical replication, where you can selectively replicate your
tables to avoid bandwidth issues.
I might have said a lot and none of it may be relevant, but you need to let
us know the goal you want to achieve :)


Regards,
Vijay

*From:* Ravi Krishna 
*Sent:* Monday, August 20, 2018 8:24:35 PM
*To:* pgsql-general@lists.postgresql.org
*Subject:* [External] Multiple COPY on the same table
Can I split a large file into multiple files and then run COPY on each
file?  The table does not contain any serial or sequence column which may
need serialization. Let us say I split a large file into 4 files.  Will
performance improve by close to 4x?

PS: Please ignore my previous post, which was sent without a subject by mistake.


--
Angular momentum makes the world go 'round.


Re: [External] Multiple COPY on the same table

2018-08-20 Thread Christopher Browne
On Mon, 20 Aug 2018 at 12:53, Ravi Krishna wrote:

> > What is the goal you are trying to achieve here?
> > To make pg_dump/restore faster?
> > To make replication faster?
> > To make backups faster?
>
> None of the above.
>
> We got CSV files from an external vendor, 880GB in total across 44
> files.  Some of the large tables had COPY running for several hours. I was
> just thinking of a faster way to load.


Seems like #4...

#4 - To Make Recovery faster

Using COPY pretty much *is* the "faster way to load"...

The main thing you should consider doing to make it faster is to drop
indexes and foreign keys from the tables, and recreate them
afterwards.
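
A minimal sketch of the drop-and-recreate approach, assuming the index definitions were saved beforehand (for example from the indexdef column of the pg_indexes view). The table and index names are hypothetical. Foreign keys and indexes that back primary key or unique constraints are not handled here; those need ALTER TABLE ... DROP CONSTRAINT / ADD CONSTRAINT instead of DROP INDEX.

```python
def plan_index_rebuild(indexes):
    """Split a load into "drop before" and "recreate after" statements.

    `indexes` maps index name -> its CREATE INDEX definition, saved
    before the indexes are dropped.
    """
    drop_before = [f"DROP INDEX IF EXISTS {name};" for name in indexes]
    recreate_after = [definition.rstrip(";") + ";" for definition in indexes.values()]
    return drop_before, recreate_after

# Hypothetical table and index names, for illustration only.
saved = {
    "orders_customer_idx": "CREATE INDEX orders_customer_idx ON public.orders USING btree (customer_id)",
}
before, after = plan_index_rebuild(saved)
for statement in before + ["-- run the COPY commands here"] + after:
    print(statement)
```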
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [External] Multiple COPY on the same table

2018-08-20 Thread Ravi Krishna


> What is the goal you are trying to achieve here?
> To make pg_dump/restore faster?
> To make replication faster?
> To make backups faster?

None of the above.
We got CSV files from an external vendor, 880GB in total across 44
files.  Some of the large tables had COPY running for several hours. I was just
thinking of a faster way to load.


Re: [External] Multiple COPY on the same table

2018-08-20 Thread Vijaykumar Jain
Hey Ravi,

What is the goal you are trying to achieve here?
To make pg_dump/restore faster?
To make replication faster?
To make backups faster?

Also, no matter how small you split the files, if the network is your
bottleneck then I am not sure you can attain n times the benefit by simply
sending the files in parallel, though there may be some benefit.
For parallel processing you also need to ensure your server has the
relevant resources, or else it will just be a lot of context switching, I guess.
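
The bottleneck argument can be made concrete with a rough model: n parallel streams cannot move data faster than the slowest shared resource allows. The throughput figures below are invented for illustration and are not measurements from this thread.

```python
def estimated_speedup(workers, per_stream_mb_s, shared_limit_mb_s):
    """Rough upper bound on the speed-up from running `workers` loads at once.

    Assumes each stream alone runs at `per_stream_mb_s` and that all streams
    share one resource (network link, disk, ...) capped at `shared_limit_mb_s`.
    """
    combined = min(workers * per_stream_mb_s, shared_limit_mb_s)
    return combined / per_stream_mb_s

# Hypothetical numbers: a single stream manages 50 MB/s.
print(estimated_speedup(4, 50, 1000))  # plenty of headroom -> 4.0
print(estimated_speedup(4, 50, 120))   # shared link saturates -> 2.4
```

The model ignores CPU contention and context switching, so real results would likely fall below this bound.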
pg_dump has an option to dump in parallel (-j, with the directory format).
I read that pg_basebackup is single-threaded, but pgBackRest allows better parallel
processing in backups.
There is also logical replication, where you can selectively replicate your
tables to avoid bandwidth issues.
I might have said a lot and none of it may be relevant, but you need to let us
know the goal you want to achieve :)

Regards,
Vijay

From: Ravi Krishna 
Sent: Monday, August 20, 2018 8:24:35 PM
To: pgsql-general@lists.postgresql.org
Subject: [External] Multiple COPY on the same table

Can I split a large file into multiple files and then run COPY on each file?
The table does not contain any serial or sequence column which may need
serialization. Let us say I split a large file into 4 files.  Will
performance improve by close to 4x?

PS: Please ignore my previous post, which was sent without a subject by mistake.
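
One way the split described in the question might be done is sketched below. It assumes one record per line (no quoted fields containing newlines) and no header row, and it does not preserve row order across the pieces. The file and table names are hypothetical, and the COPY statements are only printed, not executed.

```python
import os
import tempfile

def split_lines(src_path, parts, out_dir):
    """Split a text file into `parts` files of whole lines, round-robin."""
    paths = [os.path.join(out_dir, f"part_{i}.csv") for i in range(parts)]
    outputs = [open(p, "w", newline="") for p in paths]
    try:
        with open(src_path, newline="") as src:
            for line_number, line in enumerate(src):
                outputs[line_number % parts].write(line)
    finally:
        for handle in outputs:
            handle.close()
    return paths

# Small demonstration with a throwaway 10-row file.
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "big.csv")
with open(source, "w", newline="") as f:
    f.writelines(f"{i},row{i}\n" for i in range(10))

pieces = split_lines(source, 4, work_dir)
for piece in pieces:
    # One COPY (or psql \copy) per piece, each on its own connection.
    print(f"COPY big_table FROM '{piece}' WITH (FORMAT csv);")
```

Each piece would then be loaded from a separate session. Whether that approaches 4x depends on the CPU, disk, and WAL capacity of the server; the thread does not report a measurement.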