Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-25 Thread O. Hartmann
On Mon, 8 Jan 2018 09:12:16 -0700
Warner Losh wrote:

> On Jan 8, 2018 8:34 AM, "Mark Johnston"  wrote:
> 
> On Thu, Jan 04, 2018 at 09:10:37AM +0100, Michael Tuexen wrote:
> > > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> > >
> > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann wrote:
> > >  
> > >> On most recent CURRENT I face the error shown below on /tmp filesystem
> > >> (UFS2) residing
> > >> on a Samsung 850 Pro SSD:
> > >>
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >>
> > >> I've already formatted the /tmp filesystem, but obviously without any
> > >> success.
> > >>
> > >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> > >> I guess there
> > >> is something fishy ...
> > >
> > >
> > > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > filesystem.
> > Not sure this helps: But we have seen this also after system panics
> > when having soft update journaling enabled. Having soft update journaling
> > disabled, we did not observe this after several panics.
> > Just to be clear: The panics are not related to this issue,
> > but to other network development we do.  
> 
> I saw the same issue this morning on a mirrored root filesystem after my
> workstation came up following a power failure. fsck recovered using the
> journal, and I subsequently saw a number of these checksum failures.
> Upon shutdown, I saw the same handle_workitem_freefile errors as above.
> I then ran a full fsck from single-user mode, which didn't turn up any
> inconsistencies, and after that the checksum failure errors disappeared,
> presumably because fsck fixed them.
> 
> 
> Yes. Fsck automatically fixes issues like that, and it does so silently. I have
> patched it to make it noisy, and in the dozen cases where I saw the errors, fsck
> was silent even with my whiny patches. I can put them up for review if people want...
> 
> Warner


Within the past couple of weeks - or rather since the first occurrence of these
strange reports - I have had mysterious crashes: when installing FreeBSD, even
the proper (recommended) way, the box suddenly crashes out of the blue. The
symptoms are always the same, and so is the result: the box is unusable, the
boot process is stuck at "BTX halted" with a list of dumped CPU registers (I
guess they are the CPU registers), and the filesystem is corrupt. I have had
this strange problem on several hosts with SSDs - I reported those crashes at
the end of November/beginning of December 2017. On one machine I reformatted
the SSD and restored from a 'dump' backup - since then those crashes have gone
away. The box now in question is the last of them that has not been treated
that way. It seems there is a minefield hidden somewhere, and I have no clue
what it could be :-(

I'm going to do the very same soon with the SSD of the remaining box - dump and
restore.

I just wanted to note this for the record.

The crash happened with FreeBSD 12.0-CURRENT #14 r328409: Thu Jan 25 20:40:27
CET amd64.

Kind regards,

Oliver

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes
or for market or opinion research (Section 28(4) BDSG).



Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-12 Thread Warner Losh
On Fri, Jan 12, 2018 at 8:28 AM, Ed Maste  wrote:

> On 4 January 2018 at 03:10, Michael Tuexen  wrote:
> >
> > Not sure this helps: But we have seen this also after system panics
> > when having soft update journaling enabled. Having soft update journaling
> > disabled, we did not observe this after several panics.
> > Just to be clear: The panics are not related to this issue,
> > but to other network development we do.
>
> Both of my new co-op students have encountered this as well: after a
> panic (unrelated to the filesystem), SU+J fsck recovery runs at boot,
> and then many cylinder checksum warnings are emitted by the kernel.
> The students used the default installer configuration; it sounds like
> we should disable SU+J by default in the installer until this issue is
> addressed.
>

I've also posted https://reviews.freebsd.org/D13884, which changes fsck from
silently fixing the cg checksum to being whiny about it.

Warner
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-12 Thread Ed Maste
On 4 January 2018 at 03:10, Michael Tuexen  wrote:
>
> Not sure this helps: But we have seen this also after system panics
> when having soft update journaling enabled. Having soft update journaling
> disabled, we did not observe this after several panics.
> Just to be clear: The panics are not related to this issue,
> but to other network development we do.

Both of my new co-op students have encountered this as well: after a
panic (unrelated to the filesystem), SU+J fsck recovery runs at boot,
and then many cylinder checksum warnings are emitted by the kernel.
The students used the default installer configuration; it sounds like
we should disable SU+J by default in the installer until this issue is
addressed.
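
For anyone who wants to quantify how many of these warnings show up after a
recovery, the following is a minimal Python sketch. It assumes the kernel
messages have been saved to a text file (for example from dmesg) with one
message per line, and that they follow the format quoted in this thread. The
function name count_cg_checksum_failures is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the kernel message quoted in this thread, e.g.:
#   UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
# Assumption: each message sits on a single line in the log.
CG_CKSUM_RE = re.compile(
    r"UFS (?P<dev>\S+) \((?P<mnt>[^)]*)\) cylinder checksum failed: "
    r"cg (?P<cg>\d+), cgp: (?P<cgp>0x[0-9a-fA-F]+) != bp: (?P<bp>0x[0-9a-fA-F]+)"
)


def count_cg_checksum_failures(log_text):
    """Count checksum warnings, keyed by (device, cylinder group number)."""
    counts = Counter()
    for match in CG_CKSUM_RE.finditer(log_text):
        counts[(match.group("dev"), int(match.group("cg")))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: "
        "cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319\n"
        "handle_workitem_freefile: got error 5 while accessing filesystem\n"
    ) * 5
    for (dev, cg), n in sorted(count_cg_checksum_failures(sample).items()):
        print(f"{dev} cg {cg}: {n} warning(s)")
```

Grouping by device and cylinder group makes it easy to see whether the
warnings are confined to one group (as with cg 0 in the original report) or
spread across the filesystem.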


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-08 Thread Warner Losh
On Jan 8, 2018 8:34 AM, "Mark Johnston"  wrote:

On Thu, Jan 04, 2018 at 09:10:37AM +0100, Michael Tuexen wrote:
> > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> >
> > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann wrote:
> >
> >> On most recent CURRENT I face the error shown below on /tmp filesystem
> >> (UFS2) residing
> >> on a Samsung 850 Pro SSD:
> >>
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >>
> >> I've already formatted the /tmp filesystem, but obviously without any
> >> success.
> >>
> >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> >> I guess there
> >> is something fishy ...
> >
> >
> > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > them today) and we've not seen any mismatch when we unmount and fsck the
> > filesystem.
> Not sure this helps: But we have seen this also after system panics
> when having soft update journaling enabled. Having soft update journaling
> disabled, we did not observe this after several panics.
> Just to be clear: The panics are not related to this issue,
> but to other network development we do.

I saw the same issue this morning on a mirrored root filesystem after my
workstation came up following a power failure. fsck recovered using the
journal, and I subsequently saw a number of these checksum failures.
Upon shutdown, I saw the same handle_workitem_freefile errors as above.
I then ran a full fsck from single-user mode, which didn't turn up any
inconsistencies, and after that the checksum failure errors disappeared,
presumably because fsck fixed them.


Yes. Fsck automatically fixes issues like that, and it does so silently. I have
patched it to make it noisy, and in the dozen cases where I saw the errors, fsck
was silent even with my whiny patches. I can put them up for review if people want...

Warner


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-08 Thread Mark Johnston
On Thu, Jan 04, 2018 at 09:10:37AM +0100, Michael Tuexen wrote:
> > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> > 
> > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  wrote:
> > 
> >> On most recent CURRENT I face the error shown below on /tmp filesystem
> >> (UFS2) residing
> >> on a Samsung 850 Pro SSD:
> >> 
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 !=
> >> bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> 
> >> I've already formatted the /tmp filesystem, but obviously without any
> >> success.
> >> 
> >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> >> I guess there
> >> is something fishy ...
> > 
> > 
> > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > them today) and we've not seen any mismatch when we unmount and fsck the
> > filesystem.
> Not sure this helps: But we have seen this also after system panics
> when having soft update journaling enabled. Having soft update journaling
> disabled, we did not observe this after several panics.
> Just to be clear: The panics are not related to this issue,
> but to other network development we do.

I saw the same issue this morning on a mirrored root filesystem after my
workstation came up following a power failure. fsck recovered using the
journal, and I subsequently saw a number of these checksum failures.
Upon shutdown, I saw the same handle_workitem_freefile errors as above.
I then ran a full fsck from single-user mode, which didn't turn up any
inconsistencies, and after that the checksum failure errors disappeared,
presumably because fsck fixed them.


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-07 Thread Chris H

On Sun, 7 Jan 2018 12:31:34 +0100 "O. Hartmann"  said


On Thu, 4 Jan 2018 12:14:47 +0100
"O. Hartmann" wrote:

> On Thu, 4 Jan 2018 09:10:37 +0100
> Michael Tuexen  wrote:
> 
> > > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> > > 
> > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann wrote:
> > > 
> > >> On most recent CURRENT I face the error shown below on /tmp filesystem
> > >> (UFS2) residing
> > >> on a Samsung 850 Pro SSD:
> > >> 
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> 
> > >> I've already formatted the /tmp filesystem, but obviously without any
> > >> success.
> > >> 
> > >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> > >> I guess there
> > >> is something fishy ...
> > > 
> > > 
> > > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > filesystem.
> > Not sure this helps: But we have seen this also after system panics
> > when having soft update journaling enabled. Having soft update journaling
> > disabled, we did not observe this after several panics.
> > Just to be clear: The panics are not related to this issue,
> > but to other network development we do.
> > 
> > You can check using tunefs -p devname if soft update journaling is enabled or
> > not.
> 
> In all cases I reported in earlier and now, softupdates ARE ENABLED on all
> partitions in question (always GPT, in my cases also all on flash based
> devices, SD card and/or SSD).


... and journaling as well!

In the case of the SD card, I produced the layout of the NanoBSD image via
"dd", including the /cfg partition. The problem occurred even after I had
overwritten the SD card with a new image. The problem went away once I
unmounted /cfg and reformatted it via newfs. After that, I did not see any
faults again! I have no explanation for this behaviour, except that the dd
didn't overwrite "faulty" areas or that the obligatory "gpart recover" at the
end of the procedure restored something faulty.

The /tmp filesystem I reported on was also from an earlier date - and I did not
format it as I said - I confused the partition in question with another one.
The partition was created and formatted months ago under CURRENT.

In single-user mode, I reformatted the partition again - with journaling and
soft updates enabled. As with the /cfg partition on NanoBSD with the SD card,
I have not noticed any faults since then.


FWIW, I *also* experience this on gpart/FFS2 partitioned/formatted drives
*with* journaling enabled. As a result, if the system crashes, more often
than not fsck(8) canNOT use the journal and indicates that it must
"fall through" to complete the task. This is on a SATA (ahci) driven
disk. My experience with this suggests that journaling is the cause.
> 
> 
> > 
> > Best regards
> > Michael  
> > > 
> > > Warner

--
O. Hartmann

I object to the use or transfer of my data for advertising purposes
or for market or opinion research (Section 28(4) BDSG).

--Chris




Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-07 Thread O. Hartmann
On Thu, 4 Jan 2018 12:14:47 +0100
"O. Hartmann" wrote:

> On Thu, 4 Jan 2018 09:10:37 +0100
> Michael Tuexen  wrote:
> 
> > > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> > > 
> > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  
> > > wrote:
> > > 
> > >> On most recent CURRENT I face the error shown below on /tmp filesystem
> > >> (UFS2) residing
> > >> on a Samsung 850 Pro SSD:
> > >> 
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 
> > >> !=
> > >> bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > >> != bp: 0xd9fba319
> > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > >> 
> > >> I've already formatted the /tmp filesystem, but obviously without any
> > >> success.
> > >> 
> > >> Since I face such strange errors also on NanoBSD images dd'ed to SD 
> > >> cards,
> > >> I guess there
> > >> is something fishy ...
> > > 
> > > 
> > > It indicates a problem. We've seen these 'corruptions' on data in motion 
> > > at
> > > work, but I hacked fsck to report checksum mismatches (it silently 
> > > corrects
> > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > filesystem.
> > Not sure this helps: But we have seen this also after system panics
> > when having soft update journaling enabled. Having soft update journaling
> > disabled, we did not observe this after several panics.
> > Just to be clear: The panics are not related to this issue,
> > but to other network development we do.
> > 
> > You can check using tunefs -p devname if soft update journaling is enabled 
> > or
> > not.  
> 
> In all cases I reported in earlier and now, softupdates ARE ENABLED on all
> partitions in question (always GPT, in my cases also all on flash based
> devices, SD card and/or SSD).


... and journaling as well!

In the case of the SD card, I produced the layout of the NanoBSD image via
"dd", including the /cfg partition. The problem occurred even after I had
overwritten the SD card with a new image. The problem went away once I
unmounted /cfg and reformatted it via newfs. After that, I did not see any
faults again! I have no explanation for this behaviour, except that the dd
didn't overwrite "faulty" areas or that the obligatory "gpart recover" at the
end of the procedure restored something faulty.

The /tmp filesystem I reported on was also from an earlier date - and I did not
format it as I said - I confused the partition in question with another one.
The partition was created and formatted months ago under CURRENT.

In single-user mode, I reformatted the partition again - with journaling and
soft updates enabled. As with the /cfg partition on NanoBSD with the SD card,
I have not noticed any faults since then.

> 
> 
> > 
> > Best regards
> > Michael  
> > > 
> > > Warner
> 



-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes
or for market or opinion research (Section 28(4) BDSG).




Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-04 Thread Michael Tuexen
> On 4. Jan 2018, at 12:14, O. Hartmann  wrote:
> 
> On Thu, 4 Jan 2018 09:10:37 +0100
> Michael Tuexen  wrote:
> 
>>> On 31. Dec 2017, at 02:45, Warner Losh  wrote:
>>> 
>>> On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  wrote:
>>> 
>>>> On most recent CURRENT I face the error shown below on /tmp filesystem
>>>> (UFS2) residing
>>>> on a Samsung 850 Pro SSD:
>>>> 
>>>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 !=
>>>> bp: 0xd9fba319
>>>> handle_workitem_freefile: got error 5 while accessing filesystem
>>>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>>>> != bp: 0xd9fba319
>>>> handle_workitem_freefile: got error 5 while accessing filesystem
>>>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>>>> != bp: 0xd9fba319
>>>> handle_workitem_freefile: got error 5 while accessing filesystem
>>>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>>>> != bp: 0xd9fba319
>>>> handle_workitem_freefile: got error 5 while accessing filesystem
>>>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>>>> != bp: 0xd9fba319
>>>> handle_workitem_freefile: got error 5 while accessing filesystem
>>>> 
>>>> I've already formatted the /tmp filesystem, but obviously without any
>>>> success.
>>>> 
>>>> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
>>>> I guess there
>>>> is something fishy ...
>>> 
>>> 
>>> It indicates a problem. We've seen these 'corruptions' on data in motion at
>>> work, but I hacked fsck to report checksum mismatches (it silently corrects
>>> them today) and we've not seen any mismatch when we unmount and fsck the
>>> filesystem.  
>> Not sure this helps: But we have seen this also after system panics
>> when having soft update journaling enabled. Having soft update journaling
>> disabled, we did not observe this after several panics.
>> Just to be clear: The panics are not related to this issue,
>> but to other network development we do.
>> 
>> You can check using tunefs -p devname if soft update journaling is enabled or
>> not.
> 
> In all cases I reported in earlier and now, softupdates ARE ENABLED on all
> partitions in question (always GPT, in my cases also all on flash based
> devices, SD card and/or SSD).
OK. That seems to be consistent. Here is the config I'm using on m-SATA SSDs
and I'm NOT experiencing the problem:

tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

This was the config with which I was experiencing the problem:

tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6408
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)

So "soft updates" are enabled on both configs, but "soft update journaling" is
different.

Maybe this helps in nailing down the problem.
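
Comparing two such listings by eye is error-prone, so here is a minimal Python
sketch that parses captured "tunefs -p" output and reports only the settings
that differ. It assumes the line format shown above ("tunefs: <name>: (<flag>)
<value>"). The helper names parse_tunefs and diff_settings are made up for
illustration, and capturing the output from the real command is left out.

```python
def parse_tunefs(output):
    """Map each 'tunefs: <name>: (<flag>) <value>' line to name -> value."""
    settings = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("tunefs: "):
            continue  # ignore anything that is not a tunefs setting line
        body = line[len("tunefs: "):]
        name, sep, rest = body.partition(": (")
        if not sep:
            continue  # no "(<flag>)" part, not a setting line
        _flag, _sep, value = rest.partition(")")
        settings[name] = value.strip()  # value may be empty (e.g. volume label)
    return settings


def diff_settings(a, b):
    """Return {name: (value_in_a, value_in_b)} for settings that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    working = (
        "tunefs: soft updates: (-n) enabled\n"
        "tunefs: soft update journaling: (-j)   disabled\n"
        "tunefs: trim: (-t) enabled\n"
    )
    failing = working.replace("(-j)   disabled", "(-j)   enabled")
    for name, (old, new) in diff_settings(
        parse_tunefs(working), parse_tunefs(failing)
    ).items():
        print(f"{name}: {old} -> {new}")
```

Run on the two listings above, the only difference such a comparison should
report is "soft update journaling", which matches the observation in this
message.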

Best regards
Michael

> 
>> 
>> Best regards
>> Michael
>>> 
>>> Warner

Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-04 Thread O. Hartmann
On Thu, 4 Jan 2018 09:10:37 +0100
Michael Tuexen  wrote:

> > On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> > 
> > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  wrote:
> >   
> >> On most recent CURRENT I face the error shown below on /tmp filesystem
> >> (UFS2) residing
> >> on a Samsung 850 Pro SSD:
> >> 
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 !=
> >> bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> 
> >> I've already formatted the /tmp filesystem, but obviously without any
> >> success.
> >> 
> >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> >> I guess there
> >> is something fishy ...  
> > 
> > 
> > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > them today) and we've not seen any mismatch when we unmount and fsck the
> > filesystem.  
> Not sure this helps: But we have seen this also after system panics
> when having soft update journaling enabled. Having soft update journaling
> disabled, we did not observe this after several panics.
> Just to be clear: The panics are not related to this issue,
> but to other network development we do.
> 
> You can check using tunefs -p devname if soft update journaling is enabled or
> not.

In all cases I reported, earlier and now, soft updates ARE ENABLED on all
partitions in question (always GPT, and in my cases all on flash-based
devices, SD card and/or SSD).


> 
> Best regards
> Michael
> > 
> > Warner

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2018-01-04 Thread Michael Tuexen
> On 31. Dec 2017, at 02:45, Warner Losh  wrote:
> 
> On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  wrote:
> 
>> On most recent CURRENT I face the error shown below on the /tmp filesystem
>> (UFS2) residing on a Samsung 850 Pro SSD:
>> 
>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 !=
>> bp: 0xd9fba319
>> handle_workitem_freefile: got error 5 while accessing filesystem
>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>> != bp: 0xd9fba319
>> handle_workitem_freefile: got error 5 while accessing filesystem
>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>> != bp: 0xd9fba319
>> handle_workitem_freefile: got error 5 while accessing filesystem
>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>> != bp: 0xd9fba319
>> handle_workitem_freefile: got error 5 while accessing filesystem
>> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
>> != bp: 0xd9fba319
>> handle_workitem_freefile: got error 5 while accessing filesystem
>> 
>> I've already formatted the /tmp filesystem, but obviously without any
>> success.
>> 
>> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
>> I guess there
>> is something fishy ...
> 
> 
> It indicates a problem. We've seen these 'corruptions' on data in motion at
> work, but I hacked fsck to report checksum mismatches (it silently corrects
> them today) and we've not seen any mismatch when we unmount and fsck the
> filesystem.
Not sure this helps, but we have also seen this after system panics
with soft update journaling enabled. With soft update journaling
disabled, we did not observe this after several panics.
Just to be clear: The panics are not related to this issue,
but to other network development we do.

You can check using tunefs -p devname if soft update journaling is enabled or
not.
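
A minimal sketch of that check in Python, for anyone who wants to script it across several partitions. It assumes `tunefs -p` prints one setting per line in the form `tunefs: <name>: (-<flag>) <value>`; the sample text, the helper names and the decision to read both stdout and stderr are assumptions for illustration, not taken from this thread.

```python
import re
import subprocess

# Assumed line shape: "tunefs: soft update journaling: (-j)   enabled"
LINE_RE = re.compile(r"^tunefs:\s+(.+?):\s+\(-\w\)\s+(.+?)\s*$")


def parse_tunefs_flags(output):
    """Map each setting name found in `tunefs -p` style output to its value."""
    flags = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def read_tunefs_flags(device):
    """Run `tunefs -p <device>` and parse whatever it prints (hypothetical helper)."""
    result = subprocess.run(
        ["tunefs", "-p", device], capture_output=True, text=True, check=False
    )
    # Look at both streams, since the settings may be reported on stderr.
    return parse_tunefs_flags(result.stdout + result.stderr)


# Illustrative sample only, not output captured from the systems in this thread.
SAMPLE = (
    "tunefs: soft updates: (-n)                                 enabled\n"
    "tunefs: soft update journaling: (-j)                       enabled\n"
)

flags = parse_tunefs_flags(SAMPLE)
print("journaling:", flags.get("soft update journaling", "unknown"))
```

On a real system one would call `read_tunefs_flags("/dev/gpt/tmp")` instead of parsing the sample string.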

Best regards
Michael
> 
> Warner


Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2017-12-30 Thread Warner Losh
On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann  wrote:

> On most recent CURRENT I face the error shown below on the /tmp filesystem
> (UFS2) residing on a Samsung 850 Pro SSD:
>
> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 !=
> bp: 0xd9fba319
>  handle_workitem_freefile: got error 5 while accessing filesystem
>  UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> != bp: 0xd9fba319
>  handle_workitem_freefile: got error 5 while accessing filesystem
>  UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> != bp: 0xd9fba319
>  handle_workitem_freefile: got error 5 while accessing filesystem
>  UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> != bp: 0xd9fba319
>  handle_workitem_freefile: got error 5 while accessing filesystem
>  UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> != bp: 0xd9fba319
>  handle_workitem_freefile: got error 5 while accessing filesystem
>
> I've already formatted the /tmp filesystem, but obviously without any
> success.
>
> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> I guess there
> is something fishy ...


It indicates a problem. We've seen these 'corruptions' on data in motion at
work, but I hacked fsck to report checksum mismatches (it silently corrects
them today) and we've not seen any mismatch when we unmount and fsck the
filesystem.

Warner


r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

2017-12-30 Thread O. Hartmann
On most recent CURRENT I face the error shown below on the /tmp filesystem (UFS2)
residing on a Samsung 850 Pro SSD:

UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
handle_workitem_freefile: got error 5 while accessing filesystem
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
handle_workitem_freefile: got error 5 while accessing filesystem
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
handle_workitem_freefile: got error 5 while accessing filesystem
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
handle_workitem_freefile: got error 5 while accessing filesystem
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
handle_workitem_freefile: got error 5 while accessing filesystem
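
The repeated reports above can be collapsed mechanically. The following is only a sketch, assuming the message layout seen in this log; the regular expression and the `summarize` helper are illustrative names, and the two hex values are treated as opaque strings exactly as the kernel printed them.

```python
import re
from collections import Counter

# Pattern modelled on the messages quoted in this thread (an assumption about
# the general format, derived only from these samples).
REPORT_RE = re.compile(
    r"UFS (?P<dev>\S+) \((?P<mnt>[^)]+)\) cylinder checksum failed: "
    r"cg (?P<cg>\d+), cgp: (?P<cgp>0x[0-9a-fA-F]+) != bp: (?P<bp>0x[0-9a-fA-F]+)"
)


def summarize(log_text):
    """Count identical (device, cg, cgp, bp) reports in a block of log text."""
    # Collapse all whitespace first so reports wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return Counter(
        m.group("dev", "cg", "cgp", "bp") for m in REPORT_RE.finditer(flat)
    )


# Two of the reports from this message, one of them wrapped as in the mail.
LOG = """\
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp:
0xd9fba319
 handle_workitem_freefile: got error 5 while accessing filesystem
UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3 != bp: 0xd9fba319
 handle_workitem_freefile: got error 5 while accessing filesystem
"""

for (dev, cg, cgp, bp), count in summarize(LOG).items():
    print(f"{dev} cg {cg}: cgp {cgp} != bp {bp} ({count} reports)")
```

Grouping this way makes it easy to see that every report names the same cylinder group and the same pair of values.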

I've already formatted the /tmp filesystem, but obviously without any success.

Since I face such strange errors also on NanoBSD images dd'ed to SD cards, I
guess there is something fishy ...

Kind regards,

Oliver

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes
or for market or opinion research (§ 28 Abs. 4 BDSG).

