Hi folks,

 I've got a broken journal again (mbsync 1.2.1 from the Debian distribution).

# ls -al
total 300
drwx--x--x 5 berd zp     4096 May 29 12:44 .
drwxrwsr-x 5 root mail   4096 Jan 11  2015 ..
-rw-r--r-- 1 berd zp    12444 May 27 20:06 .mbsyncstate
-rw-r--r-- 1 berd zp       33 May 27 20:07 .mbsyncstate.journal
-rw-r--r-- 1 berd zp    12444 May 27 20:07 .mbsyncstate.new
-rw------- 1 berd zp       18 May 27 20:02 .uidvalidity
drwx--x--x 2 berd zp   118784 May 26 15:19 cur
drwx--x--x 2 berd zp   126976 May 27 20:02 new
drwx--x--x 2 berd zp     4096 May 27 20:02 tmp

# hd .mbsyncstate.journal
00000000  32 0a 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |2...............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000021
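
 The same check can be scripted for anyone who wants to scan several
 maildirs for journals damaged in this way. What follows is only a minimal
 sketch: the helper name describe_journal is hypothetical and not part of
 mbsync, and it assumes that a "damaged" journal means valid text followed
 by a run of NUL bytes, as in the dump above.

```python
def describe_journal(data: bytes) -> dict:
    """Split raw journal bytes into the leading text and the trailing NUL run.

    Hypothetical helper for illustration; it knows nothing about the
    real journal format beyond "text followed by NUL padding".
    """
    stripped = data.rstrip(b"\x00")
    return {
        "size": len(data),
        "text": stripped.decode("ascii", errors="replace"),
        "trailing_nuls": len(data) - len(stripped),
    }


# Sample bytes matching the hexdump: "2", "\n", then 31 NUL bytes.
sample = b"2\n" + b"\x00" * 31
print(describe_journal(sample))
```

 For a real file, pass the result of open(path, "rb").read() instead of
 the sample bytes.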

# stat .mbsyncstate.journal
  File: .mbsyncstate.journal
  Size: 33              Blocks: 8          IO Block: 4096   regular file
Device: 808h/2056d      Inode: 146871      Links: 1
Access: (0644/-rw-r--r--)  Uid: (24826/    berd)   Gid: ( 1307/      zp)
Access: 2017-05-29 12:34:11.859035473 +0300
Modify: 2017-05-27 20:07:12.235838762 +0300
Change: 2017-05-27 20:07:12.235838762 +0300
 Birth: -

 The files .mbsyncstate and .mbsyncstate.new are identical. In this case
 the cause of the crash was a power failure on 27.05 at 20:07, which made
 the host reboot. Thank you for your attention.


On Mon, Dec 26, 2016 at 03:43:37PM +0300, Evgeniy Berdnikov wrote:
> On Mon, Dec 26, 2016 at 01:06:48PM +0100, Yuri D'Elia wrote:
> > On Mon, Dec 26 2016, Oswald Buddenhagen wrote:
> > > even after looking at the code several more times, i plain can't see how
> > > this would be possible. it seems that *something* is pre-allocating space
> > > in the file, but is getting interrupted before writing the actual data.
> > > alternatively, it's actually appending nulls for some bizarre reason.
> > 
> > This looks like an initial open/creat, where a write was interrupted
> > with no flush. The nulls should either be a multiple size of stdbuf (if
> > libc regular IO is used) or the default block alloc of the underlying
> > fs (depending on where the write was interrupted).
> 
>  In my case the broken journal file has 33 bytes: "2", "\n" and 31 NULs.
>  But file allocation should be a multiple of 512 bytes, and I suspect the
>  default libc buffer size is much larger than 31 and a multiple of 2.
> 
> > The nulls are there because the space was allocated, but nobody wrote
> > into it yet.
> 
>  This assumption is quite natural, but it contradicts at least two facts:
>  1. the small data record sizes and 2. the model of file operation (unflushed
>  data can appear in a file only on crash/reboot or a severe kernel bug).
> 
> > > ltrace and strace of the process could help for starters - if something
> > > fishy is done from user space, this should become immediately obvious.
> > 
> > I had to stop logging at some point, as I couldn't reproduce it when
> > running under strace (probably due to the added latencies).
> 
>  No doubt this bug is rare, so it's very hard to reproduce.
> 
>  I got the damage when the internet route to the server was very slow and
>  unstable (high %loss and frequent breaks). The bug is probably triggered
>  while processing a network timeout condition. It may be some write() to
>  the wrong fd and/or with an erroneous pointer to data.
> -- 
>  Eugene Berdnikov
-- 
 Eugene Berdnikov

_______________________________________________
isync-devel mailing list
isync-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/isync-devel