[ 
https://issues.apache.org/jira/browse/QPID-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402875#comment-17402875
 ] 

Alex Rudyy commented on QPID-8500:
----------------------------------

Hi [[email protected]],
My current understanding of the reported issue is that feeder replication 
fails because the feeder tries to read replicated data from disk before that 
data has been fully written and synced. In general, I believe this is a defect 
in the BDB JE library that Qpid uses to implement HA, so the issue should be 
fixed there. The implemented change is essentially an attempt to work around 
the BDB JE issue by turning off the coalescing committer and switching to a 
stricter durability policy, "SYNC,SYNC,SIMPLE_MAJORITY" or 
"SYNC,NO_SYNC,SIMPLE_MAJORITY".

If you were able to reproduce the reported issue with Qpid Broker 8.0.5 and 
the context variable qpid.bdb.ha.disable_coalescing_committer=true, I suppose 
you can try changing your durability further to SYNC,SYNC,SIMPLE_MAJORITY. If 
that does not help, then the only resolution for the issue would be a fix in 
BDB JE and an upgrade to a newer version of BDB on the Qpid side.
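
To make the difference between the two durability strings above concrete, 
here is a minimal, self-contained sketch that splits such a string into its 
three fields (local sync policy, replica sync policy, replica acknowledgement 
policy). The class DurabilitySketch and its methods are hypothetical names 
made up for this illustration; this is not the Qpid or BDB JE API, and the 
enum values are only the policy names as I understand them.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Qpid or BDB JE. */
public final class DurabilitySketch {

    /** Sync policy names as they appear in the durability strings (assumed set). */
    enum SyncPolicy { SYNC, NO_SYNC, WRITE_NO_SYNC }

    /** Replica acknowledgement policy names (assumed set). */
    enum AckPolicy { ALL, NONE, SIMPLE_MAJORITY }

    final SyncPolicy localSync;    // what the master does on commit
    final SyncPolicy replicaSync;  // what each replica does on commit
    final AckPolicy replicaAck;    // how many replicas must acknowledge

    DurabilitySketch(SyncPolicy localSync, SyncPolicy replicaSync, AckPolicy replicaAck) {
        this.localSync = localSync;
        this.replicaSync = replicaSync;
        this.replicaAck = replicaAck;
    }

    /** Parses "LOCAL,REPLICA,ACK", e.g. "SYNC,NO_SYNC,SIMPLE_MAJORITY". */
    static DurabilitySketch parse(String text) {
        String[] parts = text.trim().split("\\s*,\\s*");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected three comma-separated fields: " + text);
        }
        // Enum.valueOf throws IllegalArgumentException for unknown policy names.
        return new DurabilitySketch(
                SyncPolicy.valueOf(parts[0].toUpperCase(Locale.ROOT)),
                SyncPolicy.valueOf(parts[1].toUpperCase(Locale.ROOT)),
                AckPolicy.valueOf(parts[2].toUpperCase(Locale.ROOT)));
    }

    /** True when the master syncs to disk on every commit. */
    boolean syncsOnMaster() {
        return localSync == SyncPolicy.SYNC;
    }

    /** True when replicas also sync to disk on every commit. */
    boolean syncsOnReplicas() {
        return replicaSync == SyncPolicy.SYNC;
    }

    public static void main(String[] args) {
        String[] candidates = {"SYNC,SYNC,SIMPLE_MAJORITY", "SYNC,NO_SYNC,SIMPLE_MAJORITY"};
        for (String candidate : candidates) {
            DurabilitySketch d = parse(candidate);
            System.out.println(candidate
                    + " -> master sync=" + d.syncsOnMaster()
                    + ", replica sync=" + d.syncsOnReplicas()
                    + ", ack=" + d.replicaAck);
        }
    }
}
```

Both strings sync on the master and require acknowledgement from a simple 
majority; they differ only in whether the replicas sync to disk as well.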



> [Broker-J] Introduce a switch to disable coalescing committer in BDB HA 
> message store
> -------------------------------------------------------------------------------------
>
>                 Key: QPID-8500
>                 URL: https://issues.apache.org/jira/browse/QPID-8500
>             Project: Qpid
>          Issue Type: Improvement
>          Components: Broker-J
>            Reporter: Alex Rudyy
>            Priority: Major
>             Fix For: qpid-java-broker-8.0.4, qpid-java-broker-7.1.12
>
>
> A BDB JE replication Feeder fails sporadically with errors like the one below
> {noformat}
> Halted log file reading at file 0x7472c8 offset 0x199d07 
> offset(decimal)=1678599 prev=0x199cd5:
> entry=DEL_LN_TXtype=31,version=14)
> prev=0x199cd5
> size=44
> Next entry should be at 0x199d49
> com.sleepycat.je.EnvironmentFailureException: (JE 7.4.5) want to read 52,431,066,320 but reader at 52,431,066,327 UNEXPECTED_STATE: Unexpected internal state, may have side effects.
>         at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:428)
>         at com.sleepycat.je.rep.stream.FeederReader.checkForPassingTarget(FeederReader.java:297)
>         at com.sleepycat.je.rep.stream.FeederReader.isTargetEntry(FeederReader.java:317)
>         at com.sleepycat.je.log.FileReader.readNextEntryAllowExceptions(FileReader.java:332)
>         at com.sleepycat.je.log.FileReader.readNextEntry(FileReader.java:245)
>         at com.sleepycat.je.rep.stream.FeederReader.scanForwards(FeederReader.java:280)
>         at com.sleepycat.je.rep.stream.MasterFeederSource.getWireRecord(MasterFeederSource.java:70)
>         at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.writeAvailableEntries(Feeder.java:1266)
>         at com.sleepycat.je.rep.impl.node.Feeder$OutputThread.run(Feeder.java:1144)
> {noformat}
> Based on the discussion at 
> [https://community.oracle.com/tech/developers/discussion/4300421/master-fails-unexpectedly-due-to-feeder-output-halted-log-file-reading-at-file-0x334f63-offset-0x8ce]
> we need a way to configure the broker without a coalescing committer. The 
> local sync policy would be set as per the user's virtual host settings.
> A context variable can be added to BDB HA to disable the coalescing 
> committer thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)