Re: [PERFORM] autocommit (true/false) for more than 1 million records

2014-08-25 Thread Emi Lu

Good morning,

Trying to insert 1 million records into one table through Java
JDBC into PostgreSQL 8.3. May I know whether (1) or (2) is better, please?

(1) set autocommit(true)
(2) set autocommit(false)
  commit every n records (e.g., 100, 500, 1000, etc)

It depends on what you need.

Data will be available to concurrent processes earlier with (1), while
(2) will go faster.

No need to worry about locks or losing records, because a check will be
done after data loading. For now, I'd like the fastest way. Would
you suggest committing every 1000 or 3000 records?

The improvement drops off pretty quickly in my experience, but it
depends on the size of the records and other things.

The table is huge with almost 170 columns.


Try it and see..?  It's almost certainly going to depend on your
specific environment.
Can you let me know what the specific environment factors are, please? Such as: 
..


By the way, could someone let me know why set autocommit(false) is 
reliably faster than true? Or point me to some online docs that discuss this?


Thanks a lot!
Emi



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] autocommit (true/false) for more than 1 million records

2014-08-25 Thread David Johnston
On Mon, Aug 25, 2014 at 9:40 AM, Emi Lu em...@encs.concordia.ca wrote:


 By the way, could someone let me know why set autocommit(false) is for
 sure faster than true please? Or, some online docs talk about this.


Not sure about the docs specifically, but:

Commit is expensive because as soon as it is issued, all of the data has to
be guaranteed written. While ultimately the same amount of data is
guaranteed either way, doing the commits in batches creates an opportunity
for economies of scale.

(I think...)
When you commit you flush data to disk - until then you can make use of
RAM.  Once you exhaust RAM you might as well commit and free up that RAM
for the next batch.
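David's batching point can be sketched concretely. The snippet below is a minimal stand-in sketch, not any poster's code: it uses SQLite from the Python standard library in place of PostgreSQL/JDBC, but the pattern (autocommit off, commit every N rows) is the same one Emi is asking about.

```python
import sqlite3

def load_batched(rows, batch_size=1000):
    """Insert rows, committing every batch_size rows instead of per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    # With sqlite3's default isolation level, we stay inside an implicit
    # transaction until commit() -- the analogue of setAutoCommit(false).
    for i, row in enumerate(rows, start=1):
        conn.execute("INSERT INTO t VALUES (?, ?)", row)
        if i % batch_size == 0:
            conn.commit()          # flush one batch
    conn.commit()                  # commit the final partial batch
    count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()
    return count

if __name__ == "__main__":
    n = load_batched(((i, "x" * 10) for i in range(2500)), batch_size=1000)
    print(n)  # 2500
```

In the JDBC case the shape is the same: `conn.setAutoCommit(false)`, add rows, and call `conn.commit()` at each batch boundary.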

David J.


Re: [PERFORM] autocommit (true/false) for more than 1 million records

2014-08-25 Thread Felipe Santos
Hi Emi,

Databases that comply with the ACID properties (
http://en.wikipedia.org/wiki/ACID) ensure that there is no data loss by
first writing the data changes to the database log, rather than updating
the actual data on the filesystem (in the datafiles) first.

Each database has its own way of doing it, but it basically consists of
writing the data to the logfile at each COMMIT and writing the data to the
datafile only when it's necessary.

So the COMMIT command is a way of telling the database to write the data
changes to the logfile.

Both logfiles and datafiles reside on the filesystem, so why is writing to
the logfile faster?

It is because the logfile is written sequentially, while the datafile is
totally dispersed and may even be fragmented.

Summing up: autocommit(false) is faster because you avoid going to the hard
disk for every row; you keep the changes in RAM
until you decide to write them to the logfile (every 10K rows, for
instance).

Be aware that you will eventually need to write the data to the logfile; you
can't avoid that. But performance is usually better if you write X
rows at a time to the logfile, rather than writing each and every row one
by one (because of the hard disk's per-write overhead).

The number of rows you need to write at a time to get better performance
depends on your environment and is pretty much found by blind-testing the
process. For millions of rows, I usually commit every 10K or 50K rows.
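The "blind-testing" described above can be scripted. This is a hypothetical harness, again using stdlib SQLite rather than PostgreSQL (absolute timings will differ, since a PostgreSQL commit forces a log flush), that times one load per candidate batch size:

```python
import sqlite3
import time

def time_load(n_rows, batch_size):
    """Return seconds taken to insert n_rows, committing every batch_size rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
    start = time.perf_counter()
    for i in range(1, n_rows + 1):
        conn.execute("INSERT INTO t VALUES (?, ?)", (i, "payload"))
        if i % batch_size == 0:
            conn.commit()
    conn.commit()                      # commit any trailing partial batch
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed

if __name__ == "__main__":
    # Sweep a few batch sizes and report the wall-clock time for each.
    for batch in (1, 100, 1000, 10000, 50000):
        print(f"batch={batch:>6}: {time_load(100_000, batch):.3f}s")
```

Run against the real target database, the sweep shows where the curve flattens out, which is the point past which larger batches stop paying off.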

Regards,

Felipe




2014-08-25 10:40 GMT-03:00 Emi Lu em...@encs.concordia.ca:

 By the way, could someone let me know why set autocommit(false) is for
 sure faster than true please? Or, some online docs talk about this.



Re: [PERFORM] autocommit (true/false) for more than 1 million records

2014-08-25 Thread Jeff Janes
On Fri, Aug 22, 2014 at 1:49 PM, Emi Lu em...@encs.concordia.ca wrote:

 Hello,

 Trying to insert into one table with 1 million records through java JDBC
 into psql8.3. May I know (1) or (2) is better please?

 (1) set autocommit(true)
 (2) set autocommit(false)
  commit every n records (e.g., 100, 500, 1000, etc)


In general it is better to use COPY (through whatever interface the JDBC
driver for 8.3 exposes for it), as COPY is designed specifically for bulk
loading.

Then it doesn't matter whether autocommit is on or off, because the COPY is
a single statement.
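For readers who want to try COPY, the sketch below shows only the data-preparation half: rendering rows in COPY's text format (tab-delimited, `\N` for NULL, backslash escapes). The actual load would then go through the driver's COPY support (for instance the PostgreSQL JDBC `CopyManager`) or psql's `\copy`; the function names here are illustrative, not from any poster's code.

```python
def copy_escape(value):
    """Escape one value for PostgreSQL COPY text format."""
    if value is None:
        return r"\N"                      # COPY's NULL marker
    s = str(value)
    # Backslash must be escaped first, then the delimiter/newline characters.
    return (s.replace("\\", "\\\\")
             .replace("\t", "\\t")
             .replace("\n", "\\n")
             .replace("\r", "\\r"))

def to_copy_text(rows):
    """Render rows as lines suitable for COPY ... FROM STDIN (text format)."""
    return "".join("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)

if __name__ == "__main__":
    data = to_copy_text([(1, "plain"), (2, None), (3, "tab\there")])
    print(data, end="")
```

Because COPY runs as a single statement, the whole load is one transaction regardless of the autocommit setting.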

Cheers,

Jeff


[PERFORM] tuning postgresql 9.3.5 and multiple cores

2014-08-25 Thread Jeison Bedoya Delgado
Hi, I recently upgraded my database server hardware from 32 cores to 64
cores and 128GB of RAM, but the performance is the same. Perhaps I have to
change some parameter in postgresql.conf?


Thanks for your help

--
Regards,


JEISON BEDOYA DELGADO
.






Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-25 Thread Josh Berkus
On 08/22/2014 07:02 AM, Andres Freund wrote:
 On 2014-08-21 14:02:26 -0700, Josh Berkus wrote:
 On 08/20/2014 07:40 PM, Bruce Momjian wrote:
 Not sure how you can make such a blanket statement when so many people
 have tested and shown the benefits of hyper-threading.  

 Actually, I don't know that anyone has posted the benefits of HT.
 Link?
 
 There are definitely cases where it can help. But it's highly workload
 *and* hardware dependent.

The only cases I've seen where HT can be beneficial is when you have
large numbers of idle connections.  Then the idle connections can be
parked on the HT virtual cores.  However, even in this case I haven't
seen a head-to-head performance comparison.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] tuning postgresql 9.3.5 and multiple cores

2014-08-25 Thread Jeff Janes
On Monday, August 25, 2014, Jeison Bedoya Delgado jeis...@audifarma.com.co
wrote:

 Hi, I recently upgraded my database server hardware from 32 cores to 64 cores
 and 128GB of RAM, but the performance is the same. Perhaps I have to change
 some parameter in postgresql.conf?



PostgreSQL does not (yet) automatically parallelize queries.

Unless you have more than 32 queries trying to run at the same time,
increasing the number of cores from 32 to 64 is unlikely to be useful.
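Since a single query uses only one core in 9.3, the extra cores help only when the client issues concurrent queries. One common client-side workaround is to split the key range across several connections; the helper below (a hypothetical sketch, no database involved) computes the non-overlapping slices each worker would scan:

```python
def partition_range(lo, hi, n_workers):
    """Split [lo, hi) into n_workers near-equal, non-overlapping slices."""
    if n_workers < 1 or hi < lo:
        raise ValueError("need n_workers >= 1 and hi >= lo")
    total = hi - lo
    step, extra = divmod(total, n_workers)
    slices, start = [], lo
    for i in range(n_workers):
        # The first `extra` workers take one additional row each.
        end = start + step + (1 if i < extra else 0)
        slices.append((start, end))
        start = end
    return slices

if __name__ == "__main__":
    print(partition_range(0, 10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Each (lo, hi) slice would then drive something like `SELECT ... WHERE id >= lo AND id < hi` on its own connection, one connection per core you want to keep busy.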

Cheers,

Jeff


Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3

2014-08-25 Thread Mark Kirkwood

On 26/08/14 10:13, Josh Berkus wrote:

The only cases I've seen where HT can be beneficial is when you have
large numbers of idle connections.  Then the idle connections can be
parked on the HT virtual cores.  However, even in this case I haven't
seen a head-to-head performance comparison.



I recall HT being beneficial on a single socket (i3 or i7), using pgbench as 
the measuring tool. However, I didn't save the results at the time. I've 
just got some new SSDs to play with, so I might run some pgbench tests on 
my home machine (Haswell i7) with HT on and off.


Regards

Mark

