[jira] [Updated] (CAMEL-8010) Race condition in AggregatorProcessor recovery sometimes causes duplicates (still)

Marc Carter (JIRA) Tue, 11 Nov 2014 15:48:08 -0800

     [ 
https://issues.apache.org/jira/browse/CAMEL-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Marc Carter updated CAMEL-8010:
-------------------------------
    Description: 
CAMEL-6097 patched a pretty clear race condition between the completion thread 
(CT) and recovery thread (RT) but leaves several holes when exercised with a 
JDBC repository and a separate aggregation thread (AT).

#1 is relevant to all repository backends.
#2 only affects fully transactional backends

I'm currently looking into this bug, as it's a show-stopper that 
_persistent_ repositories actually *decrease* reliability. An (untested) 
workaround is to add an in-memory idempotent consumer immediately after the 
aggregation.
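
The workaround mentioned above could look roughly like the following. This is a plain-Java stand-in for an in-memory idempotent filter, not Camel's own Idempotent Consumer API; the class {{InMemoryDedup}}, the method {{firstTime}} and the capacity value are hypothetical names chosen for illustration. The idea is only that a second delivery of the same completed exchange id is dropped.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory duplicate filter placed after the aggregator.
// Assumption: every completed exchange carries a stable id (here a String).
class InMemoryDedup {
    private final Map<String, Boolean> seen;

    InMemoryDedup(final int capacity) {
        // Bounded, insertion-ordered map: the oldest id is evicted once
        // the capacity is exceeded, so memory use stays fixed.
        this.seen = Collections.synchronizedMap(new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        });
    }

    // Returns true only the first time an id is offered (while it is still cached).
    boolean firstTime(String exchangeId) {
        return seen.putIfAbsent(exchangeId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        InMemoryDedup dedup = new InMemoryDedup(1000);
        System.out.println(dedup.firstTime("x")); // first delivery, e.g. from CT: true
        System.out.println(dedup.firstTime("x")); // second delivery, e.g. from RT: false
    }
}
```

Because the filter is bounded and lives in memory, an id that has been evicted, or any id after a restart, would pass again; it only narrows the window, it does not close it.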

Here AT starts and completes an aggregation between RT's defensive copy and 
the start of RT's repo scan. CT then confirms it (in memory (*)) before the 
repo scan ends.

|| AT || RT || CT ||
| | inProg COPY to inProgCopy | |
| inProg ADD x | |  |
| repo START x | |  |
| repo REMOVE x | | |
| <commit> | |  |
| | repo SCAN (sees x) | |
| | | {color:red}process x{color} |
| | | repo CONFIRM x |
| | | inProg REMOVE x | 
| | | <commit> |
| | x not inProg or inProgCopy | |
| | {color:red}process x{color} | |
| | <commit> | |
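
The interleaving in the table above can be replayed deterministically. The sketch below is a single-threaded model, not Camel code: the class {{RecoveryRaceReplay}} and the set {{visibleToScan}} are hypothetical, and the mapping of the repo START/REMOVE/CONFIRM steps onto one set is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Single-threaded replay of the AT / RT / CT interleaving; a model only.
class RecoveryRaceReplay {
    static List<String> replay() {
        Set<String> inProg = new HashSet<>();        // models inProgressExchanges
        Set<String> visibleToScan = new HashSet<>(); // committed, unconfirmed entries (assumed model)
        List<String> processed = new ArrayList<>();

        // RT: defensive copy, taken before x exists
        Set<String> inProgCopy = new HashSet<>(inProg);

        // AT: inProg ADD x, repo START/REMOVE x, commit
        inProg.add("x");
        visibleToScan.add("x");

        // RT: repo SCAN (sees x)
        List<String> scanned = new ArrayList<>(visibleToScan);

        // CT: process x, repo CONFIRM x, inProg REMOVE x, commit
        processed.add("CT:x");
        visibleToScan.remove("x");
        inProg.remove("x");

        // RT: x is in neither inProg nor inProgCopy, so it is recovered again
        for (String id : scanned) {
            if (!inProg.contains(id) && !inProgCopy.contains(id)) {
                processed.add("RT:" + id);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // [CT:x, RT:x]
    }
}
```

In this model the defensive copy is too old to contain x and the live set has already lost it, so neither check stops the second delivery.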

(*) Side note: inProgressExchanges is updated by a {{Synchronization}} inside 
the UOW, so that change is visible immediately, whereas any DB change may not 
be visible for ages (in threading terms) because the entire transaction must 
commit first.
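
To make the visibility gap in the side note concrete, here is a toy model. It assumes READ_COMMITTED-like behaviour, where another thread's scan only sees committed rows; {{VisibilityModel}} and all of its methods are hypothetical and do not correspond to any Camel or JDBC API.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model: in-memory changes are visible at once, DB changes only after commit.
class VisibilityModel {
    private final Set<String> committed = new HashSet<>();
    private final Set<String> pendingRemovals = new HashSet<>();

    void insertCommitted(String id) { committed.add(id); }

    // Removal inside an open transaction: not yet visible to other readers.
    void removeInTx(String id) { pendingRemovals.add(id); }

    void commit() {
        committed.removeAll(pendingRemovals);
        pendingRemovals.clear();
    }

    // What a scan from another transaction sees (committed rows only).
    Set<String> scanFromOtherTx() { return new HashSet<>(committed); }

    // Returns true if the recovery thread would redeliver x in this interleaving.
    static boolean recoveryRedelivers() {
        VisibilityModel repo = new VisibilityModel();
        Set<String> inProg = new HashSet<>();
        inProg.add("x");
        repo.insertCommitted("x");

        // CT, still inside its transaction:
        repo.removeInTx("x"); // DB change, invisible until commit
        inProg.remove("x");   // in-memory change, visible immediately

        // RT runs before CT commits:
        Set<String> inProgCopy = new HashSet<>(inProg);
        Set<String> scanned = repo.scanFromOtherTx();

        repo.commit(); // CT's commit arrives too late for RT's scan

        return scanned.contains("x")
                && !inProg.contains("x")
                && !inProgCopy.contains("x");
    }

    public static void main(String[] args) {
        System.out.println(recoveryRedelivers()); // true
    }
}
```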

  was:
CAMEL-6097 patched a pretty clear race condition between the completion thread 
(CT) and recovery thread (RT) but leaves several holes when exercised with a 
JDBC (ACID) repository.

#1 is relevant to all repository backends.
#2 only affects fully transactional backends

I'm currently looking into this bug, as it's a show-stopper that 
_persistent_ repositories actually *decrease* reliability. An (untested) 
workaround is to add an in-memory idempotent consumer immediately after the 
aggregation.

h4. Sequence #1

Here CT starts and completes an aggregation between the defensive copy and 
the start of repo scanning. CT then confirms it before repo scanning ends.

|| RT || CT ||
| inProg COPY to inProgCopy | |
| | inProg ADD x |
| | repo START x |
| | repo REMOVE x |
| | <commit> |
| repo SCAN (sees x) | |
| | {color:red}process x{color} |
| | repo CONFIRM x |
| | inProg REMOVE x | 
| | <commit> |
| x not inProg or inProgCopy | |
| {color:red}process x{color} | |
| <commit> | |

h4. Sequence #2

More pernicious is that CT removes x from inProg and the database whilst still 
_inside_ the transaction. This means the inProg change is visible immediately 
to RT but the database change certainly is not (assuming a rational default of 
READ_COMMITTED).

|| RT || CT ||
| | inProg ADD x |
| | repo START x |
| | repo REMOVE x |
| | <commit> |
| | {color:red}process x{color} |
| | repo CONFIRM x |
| | inProg REMOVE x | 
| inProg COPY to inProgCopy | |
| repo SCAN (sees x) | |
| | <commit> |
| x not inProg or inProgCopy | |
| {color:red}process x{color} | |
| <commit> | |


> Race condition in AggregatorProcessor recovery sometimes causes duplicates 
> (still)
> ----------------------------------------------------------------------------------
>
>                 Key: CAMEL-8010
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8010
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.14.0
>            Reporter: Marc Carter


