Re: [PERFORM] autocommit (true/false) for more than 1 million records
>> Good morning,
>>
>> Trying to insert into one table with 1 million records through Java JDBC
>> into psql 8.3. May I know whether (1) or (2) is better, please?
>>
>> (1) set autocommit(true)
>> (2) set autocommit(false), committing every n records (e.g., 100, 500, 1000, etc.)
>
> It depends on what you need. Data will be available to concurrent
> processes earlier with (1), while (2) will go faster.

>> No need to worry about locks or losing records, because a check will be
>> done after the data loading. For now, I'd like the fastest way. Would you
>> suggest committing every 1000 or 3000 records?
>
> The improvement drops off pretty quickly in my experience, but it depends
> on the size of the records and other things.

>> The table is huge, with almost 170 columns.
>
> Try it and see..? It's almost certainly going to depend on your specific
> environment.

Can you let me know what the specific environment is, please? Such as: ..

By the way, could someone let me know why set autocommit(false) is for sure
faster than autocommit(true), please? Or point me to some online docs that
talk about this.

Thanks a lot!
Emi

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] autocommit (true/false) for more than 1 million records
On Mon, Aug 25, 2014 at 9:40 AM, Emi Lu em...@encs.concordia.ca wrote:
> By the way, could someone let me know why set autocommit(false) is for
> sure faster than autocommit(true), please? Or point me to some online
> docs that talk about this.

Not sure about the docs specifically, but: a commit is expensive because, as
soon as it is issued, all of the data has to be guaranteed written to disk.
While ultimately the same amount of data is guaranteed either way, committing
in batches gives an opportunity to achieve economies of scale. (I think...)
When you commit, you flush data to disk; until then you can make use of RAM.
Once you exhaust RAM, you might as well commit and free that RAM up for the
next batch.

David J.
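The "commit every n records" pattern under discussion might look like the sketch below. This is a minimal illustration, not code from the thread: the connection URL, credentials, table name, and columns are all placeholders, and it requires a PostgreSQL server plus the JDBC driver on the classpath to actually run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- adjust for your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret")) {
            conn.setAutoCommit(false);  // option (2): manual commits
            String sql = "INSERT INTO big_table (col1, col2) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 0; i < 1_000_000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.addBatch();
                    if ((i + 1) % 1000 == 0) {  // commit every 1000 rows
                        ps.executeBatch();
                        conn.commit();
                    }
                }
                ps.executeBatch();  // flush the final partial batch
                conn.commit();
            }
        }
    }
}
```

Combining `addBatch()`/`executeBatch()` with periodic commits reduces both per-statement round trips and per-commit flushes; the commit interval (1000 here) is the knob the thread is debating.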
Re: [PERFORM] autocommit (true/false) for more than 1 million records
Hi Emi,

Databases that comply with the ACID standard (http://en.wikipedia.org/wiki/ACID)
ensure that there is no data loss by first writing data changes to the database
log, as opposed to updating the actual data in the datafiles on the filesystem
first. Each database has its own way of doing it, but it basically consists of
writing the changes to the logfile at each COMMIT and writing them to the
datafiles only when necessary. So the COMMIT command is a way of telling the
database to write the data changes to the logfile.

Both logfiles and datafiles reside on the filesystem, so why is writing to the
logfile faster? Because the logfile is written sequentially, while the datafile
is accessed in a scattered pattern and may even be fragmented.

Summing up: autocommit false is faster because you avoid going to the hard disk
to write the changes into the logfile for every single row; you keep them in
RAM until you decide to write them to the logfile (every 10K rows, for
instance). Be aware that, eventually, you will need to write the data to the
logfile, so you can't avoid that. But performance is usually better if you
write X rows at a time to the logfile rather than writing each row one by one
(because of the hard-disk write overhead). The number of rows per batch that
gives the best performance depends on your environment and is pretty much found
by blind-testing the process. For millions of rows, I usually commit every 10K
or 50K rows.

Regards,
Felipe

2014-08-25 10:40 GMT-03:00 Emi Lu em...@encs.concordia.ca:
> Good morning,
> Trying to insert into one table with 1 million records through Java JDBC
> into psql 8.3. May I know whether (1) or (2) is better, please?
> (1) set autocommit(true)
> (2) set autocommit(false), committing every n records
> [snip]
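The cost Felipe describes can be observed without a database at all: each commit forces a flush of the log to disk. The sketch below is a stand-alone illustration (no PostgreSQL involved) in which an ordinary file plays the role of the logfile and each `sync()` plays the role of a COMMIT; it simply counts how many forced flushes per-row committing incurs versus batched committing. File names are arbitrary.

```java
import java.io.FileOutputStream;
import java.io.IOException;

public class SyncDemo {
    // Write `rows` newline-terminated records to `path`, forcing the bytes to
    // disk every `batchSize` records. The file stands in for the database
    // logfile; each sync() stands in for a COMMIT.
    static long writeWithSync(String path, int rows, int batchSize) throws IOException {
        long syncs = 0;
        try (FileOutputStream out = new FileOutputStream(path)) {
            for (int i = 0; i < rows; i++) {
                out.write(("row-" + i + "\n").getBytes());
                if ((i + 1) % batchSize == 0) {
                    out.getFD().sync();  // like COMMIT: data guaranteed on disk
                    syncs++;
                }
            }
            out.getFD().sync();          // final flush, like the last COMMIT
            syncs++;
        }
        return syncs;
    }

    public static void main(String[] args) throws IOException {
        // 10,000 rows: committing per row forces ~1000x more flushes
        // than committing every 1000 rows.
        System.out.println("per-row syncs: " + writeWithSync("per_row.log", 10_000, 1));
        System.out.println("batched syncs: " + writeWithSync("batched.log", 10_000, 1_000));
    }
}
```

Each `sync()` waits on the physical disk, so the per-row variant pays that latency thousands of times; this is the "hard-disk writing overhead" the message refers to.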
Re: [PERFORM] autocommit (true/false) for more than 1 million records
On Fri, Aug 22, 2014 at 1:49 PM, Emi Lu em...@encs.concordia.ca wrote:
> Hello,
> Trying to insert into one table with 1 million records through Java JDBC
> into psql 8.3. May I know whether (1) or (2) is better, please?
> (1) set autocommit(true)
> (2) set autocommit(false), committing every n records (e.g., 100, 500, 1000, etc.)

In general it is better to use COPY (however JDBC for 8.3 exposes that), as
that is designed specifically for bulk loading. Then it doesn't matter whether
autocommit is on or off, because the COPY is a single statement.

Cheers,
Jeff
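The PostgreSQL JDBC driver exposes COPY through its `CopyManager` API, which a sketch of Jeff's suggestion might use as below. This is illustrative only: the connection details, table name, and CSV path are placeholders, the `WITH CSV` form is the 8.x-era COPY syntax, and running it requires a live server with the pgjdbc driver on the classpath.

```java
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class CopyLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- adjust for your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret")) {
            CopyManager copy = new CopyManager((BaseConnection) conn);
            // One statement, one implicit commit: the whole file is streamed
            // through COPY instead of a million individual INSERTs.
            long rows = copy.copyIn(
                    "COPY big_table FROM STDIN WITH CSV",
                    new FileReader("/path/to/data.csv"));
            System.out.println("Loaded " + rows + " rows");
        }
    }
}
```

Because the entire load is a single statement, the autocommit question disappears: there is exactly one commit regardless of the setting.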
[PERFORM] tuning postgresql 9.3.5 and multiple cores
Hi, recently I changed the hardware of my database server from 32 cores up to
64 cores and 128 GB of RAM, but the performance is the same. Perhaps I have to
change some parameter in postgresql.conf? Thanks for your help.

--
Sincerely,
JEISON BEDOYA DELGADO
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 08/22/2014 07:02 AM, Andres Freund wrote:
> On 2014-08-21 14:02:26 -0700, Josh Berkus wrote:
>> On 08/20/2014 07:40 PM, Bruce Momjian wrote:
>>> Not sure how you can make such a blanket statement when so many people
>>> have tested and shown the benefits of hyper-threading.
>> Actually, I don't know that anyone has posted the benefits of HT. Link?
> There's definitely cases where it can help. But it's highly workload *and*
> hardware dependent.

The only case where I've seen HT be beneficial is when you have large numbers
of idle connections; the idle connections can then be parked on the HT virtual
cores. However, even in this case I haven't seen a head-to-head performance
comparison.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [PERFORM] tuning postgresql 9.3.5 and multiple cores
On Monday, August 25, 2014, Jeison Bedoya Delgado jeis...@audifarma.com.co wrote:
> Hi, recently I changed the hardware of my database server from 32 cores
> up to 64 cores and 128 GB of RAM, but the performance is the same.
> Perhaps I have to change some parameter in postgresql.conf?

PostgreSQL does not (yet) automatically parallelize queries. Unless you have
more than 32 queries trying to run at the same time, increasing the number of
cores from 32 to 64 is unlikely to be useful.

Cheers,
Jeff
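Since each PostgreSQL backend runs one query on one core, using extra cores means issuing queries over multiple connections at once. A minimal client-side sketch of that idea is below; the task bodies are placeholders (in a real setup each worker thread would own its own JDBC `Connection` and run one query per task), and the pool size of 64 assumes one worker per core.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelQueries {
    // Run `tasks` with `poolSize` concurrent workers and collect the results
    // in submission order. In a real loader, each worker would hold its own
    // JDBC Connection, giving PostgreSQL one backend (and one core) per task.
    static List<Integer> runAll(List<Callable<Integer>> tasks, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 64; i++) {
            final int n = i;
            // Placeholder work; replace with a query on a per-thread Connection.
            tasks.add(() -> n * n);
        }
        System.out.println(runAll(tasks, 64).size() + " tasks completed");
    }
}
```

The point is the shape, not the arithmetic: with 32 concurrent connections or fewer, the extra 32 cores simply sit idle.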
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 26/08/14 10:13, Josh Berkus wrote:
> On 08/22/2014 07:02 AM, Andres Freund wrote:
>> There's definitely cases where it can help. But it's highly workload
>> *and* hardware dependent.
> The only case where I've seen HT be beneficial is when you have large
> numbers of idle connections; the idle connections can then be parked on
> the HT virtual cores. However, even in this case I haven't seen a
> head-to-head performance comparison.

I recall HT being beneficial on a single socket (i3 or i7), using pgbench as
the measuring tool. However, I didn't save the results at the time. I've just
got some new SSDs to play with, so I might run some pgbench tests on my home
machine (Haswell i7) with HT on and off.

Regards,
Mark