Got it - but this does mean log reclamation has to be dependent on
knowing that a successful checkpoint flush has occurred.
On 10/28/19 5:22 AM, Murtadha Hubail wrote:
@Mike Carey,
For this specific system checkpoint file, we have a test that ensures even if
the file is corrupted, the node will continue its startup and we will just
force recovery to start from the beginning of the transaction log.
@Chen Luo,
Thanks for checking and highlighting the issue. Looks like forcing a file on
closing its stream is OS dependent. Apparently, we also need
FileDescriptor.sync() for non-local storage to guarantee a file is forced. We
have other metadata files that we need to do this for. If you don't mind, I
will submit a patch to cover all of these and I will include the changes from
your patch [1].
Cheers,
Murtadha
[1] https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/3923
On 10/28/2019, 9:56 AM, "Mike Carey" <[email protected]> wrote:
Sounds right ... Explicit call needed! Better crash testing also needed;
this one should not be impossible to test for. Do we have any crash
testing that we can extend?
On Sun, Oct 27, 2019, 7:15 PM Chen Luo <[email protected]> wrote:
> HI Murtadha,
>
> I don't think closing a file will automatically force it to disk. It only
> forces the buffer from JVM to OS. I did some search online and also
checked
> the source code of JDK 8. When a file channel is closed, only the close
> function (provided by Linux) is called [1]. The only place that calls
fsync
> is inside the FileChannel.force() method [2]. For correctness, I believe
> force should be called explicitly, as we did when an LSM merge is
> completed.
>
> [1]
>
>
https://github.com/frohoff/jdk8u-jdk/blob/da0da73ab82ed714dc5be94acd2f0d00fbdfe2e9/src/solaris/native/sun/nio/ch/FileDispatcherImpl.c#L273
> [2]
>
>
https://github.com/frohoff/jdk8u-jdk/blob/da0da73ab82ed714dc5be94acd2f0d00fbdfe2e9/src/solaris/native/sun/nio/ch/FileDispatcherImpl.c#L145
>
> Best regards,
> Chen Luo
>
> On Sun, Oct 27, 2019 at 5:24 PM Murtadha Hubail <[email protected]>
> wrote:
>
> > Hi Chen,
> >
> > If I'm not mistaken, Files#write ensures that the file is closed after
> > writing the bytes which should flush the file. If not, then we probably
> > should add an explicit flush there.
> >
> > Cheers,
> > Murtadha
> >
> > On 10/27/2019, 8:26 PM, "Chen Luo" <[email protected]> wrote:
> >
> > Hi devs,
> >
> > I noticed the checkpoint file is not forced to disk after
completion,
> > but
> > we still proceed to truncate logs and older checkpoint files [1].
> This
> > seems to be a bug to me. Also, from my understanding, reading the
> > checkpoint file without forcing to disk will still succeed because
> the
> > file
> > can be read from the OS write cache. Is there any other
> considerations
> > for
> > not forcing checkpoint files?
> >
> >
> > [1]
> >
> >
>
https://github.com/apache/asterixdb/blob/2a76a0fe83fc5534c00923cd0f09f8477eac713a/asterixdb/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/service/recovery/AbstractCheckpointManager.java#L176
> >
> >
> >
> >
>