Hello!

The recommendation here is to disable the WAL (write-ahead log) before ingesting data into a table.
You can do that by issuing ALTER TABLE tbl NOLOGGING;

After the data is loaded, turn the WAL back on with ALTER TABLE tbl LOGGING;
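
For reference, a minimal sketch of the whole sequence — the table name tbl, the file path, and the column list (id, name) are illustrative, so adjust them to your schema:

```sql
-- Turn the WAL off for the table. Note: data written while the WAL is
-- disabled is not crash-safe until a checkpoint completes, so only do
-- this for data you can re-load from the source file.
ALTER TABLE tbl NOLOGGING;

-- Bulk load the CSV (path and column list are illustrative)
COPY FROM '/path/to/data.csv' INTO tbl (id, name) FORMAT CSV;

-- Turn the WAL back on once the load has finished
ALTER TABLE tbl LOGGING;
```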

Regards,
-- 
Ilya Kasnacheev


Fri, Jul 12, 2019 at 11:33, liyuj <[email protected]>:

> Hi,
>
> The CSV file is about 250 GB, with about 1 billion rows of data.
> Persistence is on and there is enough memory.
> It has been successfully imported, but it takes a long time.
>
> The current problem is that after this large table is imported
> successfully, importing another table of about 50 million rows becomes
> noticeably slower — write speed drops significantly.
>
> We have four hosts in total. The cache configuration is as follows:
> <property name="backups" value="1"/>
> <property name="partitionLossPolicy" value="READ_ONLY_SAFE"/>
>
> With persistence enabled, the other parameters are nothing special.
>
> On 2019/7/12 1:47 PM, Павлухин Иван wrote:
> > Hi,
> >
> > Currently COPY is the mechanism designed for the fastest data load. Yes,
> > you can try separating your data into chunks and executing COPY in
> > parallel. By the way, where is your input located, and what is its size
> > in bytes (GB)? Is persistence enabled? Does the DataRegion have enough
> > memory to keep all the data?
> >
> > Wed, Jul 10, 2019 at 05:02, 18624049226 <[email protected]>:
> >> If the COPY command is used to import a large amount of data, the
> >> execution time is quite long.
> >> In the current test environment, the throughput is roughly 10,000
> >> rows/s, so importing 100 million rows would take several hours.
> >>
> >> Is there a faster way to import, or can COPY be run in parallel?
> >>
> >> thanks!
> >>
> >
>
>
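
P.S. On Ivan's chunking suggestion quoted above: a minimal sketch of splitting the input file so each chunk can be fed to a separate COPY command in parallel. The function name and the round-robin distribution scheme are my own; it assumes a plain CSV with no header row.

```python
import csv
import os

def split_csv(src_path, out_dir, n_chunks):
    """Split src_path into n_chunks CSV files (chunk_0.csv ..
    chunk_{n-1}.csv) in out_dir, distributing rows round-robin
    so the chunks stay roughly equal in size."""
    os.makedirs(out_dir, exist_ok=True)
    out_paths = [os.path.join(out_dir, f"chunk_{i}.csv")
                 for i in range(n_chunks)]
    outs = [open(p, "w", newline="") for p in out_paths]
    writers = [csv.writer(f) for f in outs]
    try:
        with open(src_path, newline="") as src:
            for i, row in enumerate(csv.reader(src)):
                writers[i % n_chunks].writerow(row)
    finally:
        for f in outs:
            f.close()
    return out_paths
```

Each resulting chunk can then be loaded with its own COPY statement from a separate SQL connection.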
