[ 
https://issues.apache.org/jira/browse/IGNITE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281803#comment-16281803
 ] 

Andrey Gura edited comment on IGNITE-6339 at 12/7/17 12:52 PM:
---------------------------------------------------------------

As a solution, the following are implemented:

- A ring buffer (see {{SegmentedRingByteBuffer}}) that reserves a segment (buffer 
slice) for each thread that wants to write WAL records. The thread serializes WAL 
records to the segment and calculates the CRC if enabled. As a result, all threads 
serialize records to the ring buffer in parallel.
- A dedicated single thread that handles fsync requests and flushes data from the 
ring buffer to the file channel. Unlike the current implementation, this thread 
can't be interrupted via the public API. Threads that initiate an fsync are 
parked until the data is flushed to disk.
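
A minimal sketch of the two pieces above, under loud assumptions: {{SegmentBufferSketch}} and {{WalSketch}} are hypothetical names invented here, not the real {{SegmentedRingByteBuffer}}. The buffer does not wrap around, the record layout (length, payload, CRC32) is made up for illustration, and the flush happens once after all writers finish, which the real implementation does not require. It only shows the idea: threads reserve disjoint slices with a CAS and serialize in parallel, while a single dedicated thread touches the file channel and callers block until the data is forced to disk.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.CRC32;

/** Hypothetical, simplified buffer: hands out disjoint slices, no wrap-around. */
final class SegmentBufferSketch {
    private final ByteBuffer buf;
    private final AtomicInteger tail = new AtomicInteger();

    SegmentBufferSketch(int capacity) {
        buf = ByteBuffer.allocate(capacity);
    }

    /** Reserves size bytes for the calling thread, or returns null if the buffer is full. */
    ByteBuffer reserve(int size) {
        while (true) {
            int start = tail.get();
            if (start + size > buf.capacity())
                return null;
            if (tail.compareAndSet(start, start + size)) {
                ByteBuffer dup = buf.duplicate(); // own position/limit, shared bytes
                dup.limit(start + size).position(start);
                return dup.slice();
            }
        }
    }

    /** Read view over everything reserved so far. */
    ByteBuffer filled() {
        ByteBuffer dup = buf.duplicate();
        dup.limit(tail.get()).position(0);
        return dup;
    }
}

final class WalSketch {
    /** Writes one record per thread, flushes from a dedicated thread, returns the file size. */
    static long run(int threads) throws Exception {
        SegmentBufferSketch ring = new SegmentBufferSketch(1024);
        Thread[] writers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            byte[] payload = ("record-" + i).getBytes(StandardCharsets.US_ASCII);
            writers[i] = new Thread(() -> {
                // Each thread serializes into its own reserved segment.
                ByteBuffer seg = ring.reserve(payload.length + 8);
                if (seg == null)
                    throw new IllegalStateException("buffer full");
                CRC32 crc = new CRC32();
                crc.update(payload, 0, payload.length);
                seg.putInt(payload.length).put(payload).putInt((int) crc.getValue());
            });
            writers[i].start();
        }
        for (Thread t : writers)
            t.join();

        Path file = Files.createTempFile("wal-sketch", ".bin");
        ExecutorService flusher = Executors.newSingleThreadExecutor();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer data = ring.filled();
            Callable<Void> flush = () -> {
                while (data.hasRemaining())
                    ch.write(data);
                ch.force(false); // fsync happens only on the flusher thread
                return null;
            };
            flusher.submit(flush).get(); // the caller waits until the data is on disk
            return ch.size();
        } finally {
            flusher.shutdown();
            Files.deleteIfExists(file);
        }
    }
}
```

With four writer threads and the 16-byte records used here, {{WalSketch.run(4)}} reports a 64-byte file. User threads never call into the {{FileChannel}}, so interrupting them cannot close it.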

This solution solves the problem with interrupts and improves WAL write 
performance for most cases in {{LOG_ONLY}} WAL mode. But this change leads to a 
performance drop for cases when a local thread writes data to the local node. 
This holds for 1-4 threads. For a larger number of threads we have better 
performance.

To improve performance for the case mentioned above we tried a memory-mapped 
file. This solution works very well and has better performance for any number of 
threads. But {{MappedByteBuffer}} doesn't have a method for partial fsync of 
data, and the naive fsync (the {{#force}} method) causes a performance drop in 
{{DEFAULT}} WAL mode.
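
To illustrate the naive fsync mentioned above, here is a small sketch using only standard {{java.nio}} calls. The class name {{MmapForceSketch}} and the 4096-byte mapping size are assumptions made for the example. The no-argument {{force()}} flushes the whole mapped region even when a single byte changed, which is the likely reason it hurts when every write must be synced.

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MmapForceSketch {
    /** Writes one byte through a mapping, forces it, and reads it back from the file. */
    static byte roundTrip() throws Exception {
        Path file = Files.createTempFile("wal-mmap", ".bin");
        file.toFile().deleteOnExit();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to the mapped size.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.put(0, (byte) 42);
            map.force(); // flushes the entire mapped region, not just the dirty byte
        }
        return Files.readAllBytes(file)[0];
    }
}
```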

So we have three implementations: master, single writer and mmap. 

See rough benchmark results below (throughput, ops/sec):

*BACKGROUND WAL mode*

| Threads | master | single writer | mmap |
| 1 | 70-80 K | 64-65 K | 70-80 K |
| 2 | 127-128 K | 100-110 K | 127-128 K |
| 4 | 190-200 K | 170-180 K | 190-200 K |
| 8 | 220-230 K | 220-230 K | 220-230 K |

For {{BACKGROUND}} mode throughput values are comparable for all 
implementations.

*LOG_ONLY WAL mode*

| Threads | master | single writer | mmap |
| 1 | 60-65 K | 36-37 K | 70-80 K |
| 2 | 100-105 K | 60-70 K | 124-125 K |
| 4 | 130-140 K | 100-110 K | 190-200 K |
| 8 | 50-60 K | 150-160 K | 210-220 K |

For {{LOG_ONLY}} mode the single writer outperforms master only at 8 threads 
(150-160 K vs 50-60 K). But mmap is the best at every thread count.

*DEFAULT WAL mode*

| Threads | master | single writer | mmap |
| 1 | 1 K | 1 K | 1 K |
| 2 | 1-2 K | 1-2 K | 1 K |
| 4 | 3-4 K | 3-4 K | 1 K |
| 8 | 5-6 K | 5-6 K | 1 K |

For {{DEFAULT}} mode we have a performance drop with mmap: it stays at about 1 K 
regardless of the number of threads. But it seems that partial fsync should solve 
it. Anyway, the changes related to this issue allow switching between the mmap 
and single writer solutions using a system property.
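
A sketch of what such a switch could look like. The property name {{wal.sketch.mmap}} and the class {{WalWriterChoiceSketch}} are hypothetical, invented for this example. The comment does not name the actual property.

```java
final class WalWriterChoiceSketch {
    /** Hypothetical property name, invented for this sketch. */
    static final String PROP = "wal.sketch.mmap";

    /** Picks the implementation from a JVM system property; defaults to single writer. */
    static String choose() {
        return Boolean.getBoolean(PROP) ? "mmap" : "single-writer";
    }
}
```

Under this assumption, starting the JVM with {{-Dwal.sketch.mmap=true}} would select the mmap path.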

*Note*: single writer and mmap still use the {{fileIO.close()}} call, which is 
interruptible. So a {{ClosedByInterruptException}} still has a chance to be 
thrown. This problem remains a TODO that should be fixed.
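
For context, a small sketch of the underlying JDK behaviour this issue is about, using a plain {{FileChannel}} rather than Ignite's {{fileIO}} (the class name {{InterruptSketch}} is made up). When a thread whose interrupt status is set performs I/O on an interruptible channel, the channel is closed and the thread receives {{java.nio.channels.ClosedByInterruptException}}. The channel then stays closed for every other thread too.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class InterruptSketch {
    /** Returns true if an interrupted write threw and left the channel closed. */
    static boolean channelClosedByInterrupt() throws Exception {
        Path file = Files.createTempFile("wal-int", ".bin");
        file.toFile().deleteOnExit();
        FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE);
        boolean[] thrown = {false};

        Thread user = new Thread(() -> {
            Thread.currentThread().interrupt(); // simulate an interrupted user thread
            try {
                ch.write(ByteBuffer.wrap(new byte[] {1}));
            } catch (ClosedByInterruptException e) {
                thrown[0] = true; // the write failed and the channel was closed
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        user.start();
        user.join();

        return thrown[0] && !ch.isOpen();
    }
}
```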


> WAL: Avoid closed by interruption exception when user thread is interrupted
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-6339
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6339
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Andrey Gura
>            Assignee: Andrey Gura
>            Priority: Blocker
>              Labels: important
>             Fix For: 2.4
>
>
> We should have a separate writer thread for WAL that will write completed 
> serialized chunks of data. This will allow us to avoid the closed by interruption 
> exception when a user thread is interrupted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
