Re: [HACKERS] Load distributed checkpoint

2007-02-26 Thread Inaam Rana

On 12/19/06, ITAGAKI Takahiro [EMAIL PROTECTED] wrote:


Takayuki Tsunakawa [EMAIL PROTECTED] wrote:

 I performed some simple tests, and I'll show the results below.

 (1) The default case
 235   80  226   77  240
 (2) No write case
 242  250  244  253  280
 (3) No checkpoint case
 229  252  256  292  276
 (4) No fsync() case
 236  112  215  216  221
 (5) No write by PostgreSQL, but fsync() by another program case
   9  223  260  283  292
 (6) case (5) + O_SYNC by write_fsync
  97  114  126  112  125
 (7) O_SYNC case
 182  103   41   50   74

I posted a patch to PATCHES. Please try it out.
It does write() smoothly, but fsync() in a burst.
I suppose the result will be between (3) and (5).




Itagaki,

Did you have a chance to look into this any further? We, at EnterpriseDB,
have done some testing on this patch (dbt2 runs) and it looks like we are
getting the desired results, particularly when we spread out both the sync
and write phases.

--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [PATCHES] [HACKERS] Load distributed checkpoint

2007-02-26 Thread Inaam Rana

On 2/26/07, ITAGAKI Takahiro [EMAIL PROTECTED] wrote:


Josh Berkus josh@agliodbs.com wrote:

 Can I have a copy of the patch to add to the Sun testing queue?

This is the revised version of the patch. Delay factors in checkpoints
can be specified by checkpoint_write_percent, checkpoint_nap_percent
and checkpoint_sync_percent. They are relative to checkpoint_timeout.

Also, checking of archive_timeout during checkpoints and some error
handling routines were added.
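
To make the arithmetic behind these settings concrete, here is a minimal sketch. It assumes that each percentage is simply the share of checkpoint_timeout given to that phase; the patch itself may compute its delays differently. The function name and the sample values are hypothetical and are not taken from the patch.

```python
def phase_durations(checkpoint_timeout_s, write_percent, nap_percent, sync_percent):
    """Split checkpoint_timeout into per-phase time budgets, in seconds.

    Assumption: each percentage is a plain share of checkpoint_timeout.
    """
    percents = {"write": write_percent, "nap": nap_percent, "sync": sync_percent}
    if any(p < 0 for p in percents.values()):
        raise ValueError("percentages must be non-negative")
    if sum(percents.values()) > 100:
        raise ValueError("phases may not exceed 100% of checkpoint_timeout")
    return {phase: checkpoint_timeout_s * p / 100.0 for phase, p in percents.items()}


if __name__ == "__main__":
    # Hypothetical settings: 300 s timeout, 50% write, 10% nap, 20% sync.
    print(phase_durations(300, 50, 10, 20))
```

Under that assumption, the remaining share of the interval (20% in the sample call) is left with no checkpoint activity at all.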




One of the issues we had during testing with the original patch was that db
stop did not work properly. I think you coded something to make the stop
checkpoint run immediately, but if a checkpoint is already in progress at
that time, it still takes its own time to complete.
Does this patch resolve that issue? Also, is it based on pg82stable or HEAD?

regards,
inaam

Regards,

---
ITAGAKI Takahiro
NTT Open Source Software Center



--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Load distributed checkpoint

2007-01-11 Thread Inaam Rana


No, I've not tried yet.  Inaam-san told me that Linux had a few I/O
schedulers but I'm not familiar with them.  I'll find information
about them (how to change the scheduler settings) and try the same
test.



I am sorry, your response just slipped by me. The docs for RHEL (I believe
you are running RHEL, which has the 2.6.9 kernel) say that it does support
a selectable I/O scheduler.

http://www.redhat.com/rhel/details/limits/

I am not sure where else to look for the scheduler apart from /sys.

regards,
inaam


Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Inaam Rana

On 12/22/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote:


 From: Inaam Rana
 Which I/O scheduler (elevator) are you using?

Elevator?  Sorry, I'm not familiar with the kernel implementation, so I
don't know what it is.  My Linux distribution is Red Hat Enterprise Linux
4.0 for AMD64/EM64T, and the kernel is
2.6.9-42.ELsmp.  I probably haven't changed any kernel settings, except for
IPC settings to run PostgreSQL.



There are four I/O schedulers in Linux: anticipatory, CFQ (default),
deadline, and noop. For typical OLTP-type loads, deadline is generally
recommended. If you are constrained on CPU and you have a good controller,
then it's better to use noop.
Deadline attempts to merge requests by maintaining two red-black trees in
sector-sorted order, and it also ensures that a request is serviced within a
given time by using FIFO queues. I don't expect it to work magic, but I was
wondering whether it might dilute the issue of fsync() elbowing out WAL writes.

You can look into /sys/block/<device>/queue/scheduler (where <device> is
your disk, e.g. sda) to see which scheduler you are using.
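
As a small illustration of reading that file: the kernel lists the available schedulers on one line and marks the active one with square brackets. The sketch below parses that format. The helper names and the device name "sda" are assumptions for illustration, and the sample string is made up rather than read from a real machine.

```python
import re
from pathlib import Path


def active_scheduler(text: str) -> str:
    """Return the scheduler name that appears in square brackets."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match is None:
        raise ValueError(f"no active scheduler marked in: {text!r}")
    return match.group(1)


def read_active_scheduler(device: str) -> str:
    """Read the active scheduler for a block device such as 'sda' (assumed name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # Sample file contents for illustration only.
    print(active_scheduler("noop anticipatory deadline [cfq]"))
```

On a live system, read_active_scheduler("sda") would return the same kind of answer for that disk, provided the kernel exposes the file.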

regards,
inaam


--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Inaam Rana

On 12/22/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote:


From: Takayuki Tsunakawa [EMAIL PROTECTED]
 (5) (4) + /proc/sys/vm/dirty* tuning
 dirty_background_ratio is changed from 10 to 1, and dirty_ratio is
 changed from 40 to 4.

 308  349  84  349  84

Sorry, I forgot to include the result when using Itagaki-san's patch.
The patch showed the following tps for case (5).

323  350  340  59  225

The best response time was 4 msec, and the worst one was 16 seconds.




Which I/O scheduler (elevator) are you using?

--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Inaam Rana

On 12/20/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote:


[Conclusion]
I believe that the problem cannot be solved in a real sense by
avoiding fsync/fdatasync().  We can't ignore what commercial databases
have done so far.  The kernel does as much as it likes when PostgreSQL
requests it to fsync().



I am new to the community and am very interested in the tests that you have
done. I am also working on resolving the sudden IO spikes at checkpoint
time. I agree with you that fsync() is the core issue here.

Being a new member, I was wondering if someone on this list has done testing
with O_DIRECT and/or O_SYNC for datafiles, as that seems to be the most
logical way of dealing with the fsync() flood at checkpoint time. If so, I'll
be very interested in the results. As mentioned in this thread, a single
bgwriter with O_DIRECT will not be able to keep pace with the cleaning
effort, causing backend writes. I think (i.e. IMHO) multiple bgwriters and/or
AsyncIO with O_DIRECT can resolve this issue.
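
For anyone who wants to experiment with the O_SYNC side of this outside PostgreSQL, a minimal sketch follows. It only shows a write through a descriptor opened with O_SYNC, so each write returns after the data has been pushed to storage and no later fsync() is needed for it. It does not cover O_DIRECT, which additionally needs aligned buffers. The function name, the 8192-byte page size, and the file name are assumptions for illustration.

```python
import os
import tempfile


def write_page_sync(path: str, page: bytes, offset: int = 0) -> int:
    """Write one page through a descriptor opened with O_SYNC, where available."""
    # getattr keeps the sketch importable on platforms without O_SYNC;
    # on those platforms the write is an ordinary buffered write.
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.write(fd, page)  # number of bytes actually written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        datafile = os.path.join(tmp, "datafile")  # hypothetical file name
        written = write_page_sync(datafile, b"\0" * 8192)
        print(written, os.path.getsize(datafile))
```

Timing a loop of such writes against buffered writes followed by one fsync() is a cheap way to see the trade-off being discussed here.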

Talking of bgwriter_* parameters, I think we are missing a crucial internal
counter, i.e. the number of dirty pages. How much work bgwriter has to do at
each wakeup call should be a function of total buffers and currently dirty
buffers. Relying on both of these values instead of just the static NBuffers
should allow bgwriter to adapt more quickly to workload changes and ensure
that not much work accumulates for the checkpoint.
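
To sketch what I mean, here is one possible shape for such a function. This is purely illustrative and is not how bgwriter works today: the function name, the linear scaling, and the min_pages/max_pages bounds are all assumptions.

```python
def pages_to_clean(total_buffers: int, dirty_buffers: int,
                   min_pages: int = 5, max_pages: int = 1000) -> int:
    """Hypothetical per-wakeup write quota that grows with the dirty fraction."""
    if total_buffers <= 0:
        raise ValueError("total_buffers must be positive")
    # Clamp the counter so a stale or racy reading cannot exceed the pool size.
    dirty_buffers = max(0, min(dirty_buffers, total_buffers))
    dirty_fraction = dirty_buffers / total_buffers
    # Scale the quota linearly with the dirty fraction, within fixed bounds.
    quota = round(dirty_fraction * max_pages)
    quota = max(min_pages, min(quota, max_pages))
    # Never ask for more pages than are actually dirty.
    return min(quota, dirty_buffers)


if __name__ == "__main__":
    for dirty in (0, 20, 100, 5000):
        print(dirty, pages_to_clean(10000, dirty))
```

The point is only that the quota follows the workload: a nearly clean pool costs almost nothing per wakeup, while a heavily dirtied pool is drained faster before the checkpoint arrives.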

--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Inaam Rana



I wonder how the other big DBMS, IBM DB2, handles this. Is Itagaki-san
referring to DB2?



DB2 would also open data files with the O_SYNC option, and page_cleaners
(counterparts of bgwriter) would exploit AIO if available on the system.

Inaam Rana
EnterpriseDB   http://www.enterprisedb.com