Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-26 Thread Marcelo Araujo
2017-06-27 4:18 GMT+08:00 Guido Falsi:

> On 06/26/17 16:36, Marcelo Araujo wrote:
>
>> Hi,
>>
>> Could you guys test this patch: https://reviews.freebsd.org/D11365?
>> Would it solve the issue?
>>
>>
> Hi,
>
> I confirm the patch fixes the problem for me.
>
> Thanks!
>
> --
> Guido Falsi 
>

Thanks all for testing, much appreciated!
I just committed it as r320390, with an MFC after 3 days.

Also, thanks to trasz@ for pointing me to this thread, which I was not aware of.


Best,
-- 
Marcelo Araujo
ara...@freebsd.org
http://www.FreeBSD.org
Power To Server.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-26 Thread Guido Falsi

On 06/26/17 16:36, Marcelo Araujo wrote:


> Hi,
>
> Could you guys test this patch: https://reviews.freebsd.org/D11365?
> Would it solve the issue?



Hi,

I confirm the patch fixes the problem for me.

Thanks!

--
Guido Falsi 


Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-26 Thread Peter Blok
Marcelo,

This fix solved the problem for RPI-1B. I’ll do some more testing on other RPI 
and nanobsd variants.

Peter

> On 26 Jun 2017, at 16:36, Marcelo Araujo wrote:
> 
> 
> 
> 2017-06-23 4:02 GMT+08:00 Guido Falsi:
> On 06/22/17 19:06, Guido Falsi wrote:
> On 06/22/17 18:38, Warner Losh wrote:
> 
> I'll follow up as soon as I have an easier use case to reproduce it. I first
> need to revert to an image affected by the problem.
> 
> I have made a few more tests.
> 
> I am able to trigger this bug easily by running gpart.
> 
> I'm testing on a PCEngines APU2 board with SD memory card.
> 
> # gpart set -a active -i 1 mmcsd0
> active set on mmcsd0s1
> # fsck_ffs -n /dev/mmcsd0s1a
> ** /dev/mmcsd0s1a (NO WRITE)
> ** Last Mounted on /mnt
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> Segmentation fault
> # shutdown -r now
> /sbin/shutdown: Device not configured
> 
> Also, if I open another shell, many operations fail there that do not fail
> in the previous root shell:
> 
> > tail /var/log/messages
> /usr/bin/tail: Device not configured.
> 
> 
> BTW, while testing this multiple times I also had the root shell segfault
> while browsing history, so it should be quite easy to reproduce on your side
> too. Running the gpart set command triggers it every time, with slightly
> different but always disruptive symptoms.
> 
> There is a chance it only shows with these embedded systems storage 
> controllers though.
> 
> -- 
> Guido Falsi 
> ___
> freebsd...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscr...@freebsd.org"
> 
> 
> Hi,
> 
> Could you guys test this patch: https://reviews.freebsd.org/D11365?
> Would it solve the issue?
> 
> Best,
> -- 
> Marcelo Araujo
> ara...@freebsd.org
> http://www.FreeBSD.org
> Power To Server.


Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-26 Thread Marcelo Araujo
2017-06-23 4:02 GMT+08:00 Guido Falsi:

> On 06/22/17 19:06, Guido Falsi wrote:
>
>> On 06/22/17 18:38, Warner Losh wrote:
>>
>
>> I'll follow up as soon as I have an easier use case to reproduce it. I first
>> need to revert to an image affected by the problem.
>>
>
> I have made a few more tests.
>
> I am able to trigger this bug easily by running gpart.
>
> I'm testing on a PCEngines APU2 board with SD memory card.
>
> # gpart set -a active -i 1 mmcsd0
> active set on mmcsd0s1
> # fsck_ffs -n /dev/mmcsd0s1a
> ** /dev/mmcsd0s1a (NO WRITE)
> ** Last Mounted on /mnt
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> Segmentation fault
> # shutdown -r now
> /sbin/shutdown: Device not configured
>
> Also, if I open another shell, many operations fail there that do not fail
> in the previous root shell:
>
> > tail /var/log/messages
> /usr/bin/tail: Device not configured.
>
>
> BTW, while testing this multiple times I also had the root shell segfault
> while browsing history, so it should be quite easy to reproduce on your
> side too. Running the gpart set command triggers it every time, with
> slightly different but always disruptive symptoms.
>
> There is a chance it only shows with these embedded systems storage
> controllers though.
>
> --
> Guido Falsi 
>


Hi,

Could you guys test this patch: https://reviews.freebsd.org/D11365?
Would it solve the issue?

Best,
-- 
Marcelo Araujo
ara...@freebsd.org
http://www.FreeBSD.org
Power To Server.


Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-22 Thread Guido Falsi

On 06/22/17 19:06, Guido Falsi wrote:

> On 06/22/17 18:38, Warner Losh wrote:
>
> I'll follow up as soon as I have an easier use case to reproduce it. I first
> need to revert to an image affected by the problem.


I have made a few more tests.

I am able to trigger this bug easily by running gpart.

I'm testing on a PCEngines APU2 board with SD memory card.

# gpart set -a active -i 1 mmcsd0
active set on mmcsd0s1
# fsck_ffs -n /dev/mmcsd0s1a
** /dev/mmcsd0s1a (NO WRITE)
** Last Mounted on /mnt
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
Segmentation fault
# shutdown -r now
/sbin/shutdown: Device not configured

Also, if I open another shell, many operations fail there that do not fail
in the previous root shell:


> tail /var/log/messages
/usr/bin/tail: Device not configured.


BTW, while testing this multiple times I also had the root shell segfault
while browsing history, so it should be quite easy to reproduce on your
side too. Running the gpart set command triggers it every time, with
slightly different but always disruptive symptoms.


There is a chance it only shows with these embedded systems storage 
controllers though.

--
Guido Falsi 


Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-22 Thread Guido Falsi

On 06/22/17 18:38, Warner Losh wrote:



> On Thu, Jun 22, 2017 at 2:26 AM, Guido Falsi wrote:
>
>> On 06/21/17 16:59, Guido Falsi wrote:
>>> On 06/13/17 13:44, Peter Blok wrote:
>>>> Hi,
>>>>
>>>> For a while now, I’m not able to build a RPI1-B image from -stable. I
>>>> have narrowed it down to fix 318394, which adds a refresh option to
>>>> geom_label. If I undo this fix in today’s stable it works ok. If I
>>>> don’t I’m getting continuously:
>>>>
>>>> vm_fault: pager read error, pid 1 (init)
>>>> vnode_pager_generic_getpages_done: I/O read error 5
>>>>
>>>> I have looked at the fix and I can’t figure out why it breaks the code.
>>>>
>>>> And yes I have tried various other SD cards - they all have the same
>>>> issue.
>>>>
>>>
>>> Hi,
>>>
>>> I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
>>> APU2 boards, using compactflash and SD card storage respectively. The
>>> problem has appeared as soon as I started testing 11.1-BETA1 from the
>>> stable branch.
>>>
>>> Problem appears when I update the image, using a slightly modified
>>> version of the standard nanobsd update and updatep[12] scripts. My
>>> changes are not in the dd/gpart commands though, which are the same.
>>> gpart seems the most likely candidate though.
>>>
>>> I have just discovered this thread and I will test reverting r318394
>>> soon. Thanks to Peter for narrowing it down!
>>>
>>> Maybe this is related to having the disks mounted read-only?
>>>
>>
>> I noticed that after the problem appears many commands, including
>> shutdown, start failing with "device not configured" for all mounted
>> FSes. I'm even unable to "ls /dev".
>>
>> Looks like the geom refresh changes devices from below the system in a
>> way which triggers this reaction.
>>
>> I don't know the geom code and have been unable to find an immediate
>> problem in the commit mentioned above. I'd really like some help to know
>> where to look, or what kind of debugging information is needed.
>>
>> This is quite a bad bug for people running NanoBSD and should be fixed
>> before the release.
>
> So can I recreate this with the embedded-type NanoBSD image? If this
> change breaks NanoBSD, it will need to be reverted...




You should be able to reproduce it with a nanobsd image by updating it
with the standard script, which dumps the new image in the "other"
partition and uses gpart to configure the new partition as bootable.


I'm using a slightly modified update script which also mounts the new
partition in /mnt and performs some operations there. Then it unmounts
the partition and launches the "gpart set -a active -i ${_to}
${NANO_DRIVE}" command (which I suspect is exactly where the actual
problem is happening).
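
For anyone following along, the sequence described above can be sketched
roughly as below. This is only a hypothetical dry-run outline, not the
actual NanoBSD update script: NANO_DRIVE and _to are the variable names
from the gpart command above, while DRY_RUN, IMAGE, the run() helper, the
image path and the exact dd arguments are assumptions made for
illustration. With DRY_RUN=1 it only prints the commands and touches no
device.

```sh
#!/bin/sh
# Hypothetical dry-run sketch of the update sequence described above.
# Assumptions: the device name, slice number, image path, and dd arguments.
NANO_DRIVE="mmcsd0"          # drive holding the NanoBSD slices (assumed)
_to=2                        # the "other" slice being updated (assumed)
IMAGE="/tmp/new-image.img"   # hypothetical path of the new image
DRY_RUN="${DRY_RUN:-1}"      # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

update_steps() {
    # 1. Write the new image to the inactive slice.
    run dd if="$IMAGE" of="/dev/${NANO_DRIVE}s${_to}" obs=64k
    # 2. Mount the new partition to perform extra operations on it.
    run mount "/dev/${NANO_DRIVE}s${_to}a" /mnt
    # 3. Unmount it again.
    run umount /mnt
    # 4. Mark the new slice active (the step suspected to trigger the bug).
    run gpart set -a active -i "${_to}" "${NANO_DRIVE}"
}

update_steps
```

On an affected image, running the last step by hand should be enough to
see whether the symptoms appear.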


I also tested reverting the change and can confirm that it makes the 
problem go away.


I'm sure it can be triggered by other gpart operations. I'm trying to 
understand exactly which operations.


I'll follow up as soon as I have an easier use case to reproduce it. I first
need to revert to an image affected by the problem.


Thanks for your feedback!

--
Guido Falsi 

Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-22 Thread Warner Losh
On Thu, Jun 22, 2017 at 2:26 AM, Guido Falsi  wrote:

> On 06/21/17 16:59, Guido Falsi wrote:
> > On 06/13/17 13:44, Peter Blok wrote:
> >> Hi,
> >>
> >> For a while now, I’m not able to build a RPI1-B image from -stable. I
> >> have narrowed it down to fix 318394, which adds a refresh option to
> >> geom_label. If I undo this fix in today’s stable it works ok. If I don’t
> >> I’m getting continuously:
> >>
> >> vm_fault: pager read error, pid 1 (init)
> >> vnode_pager_generic_getpages_done: I/O read error 5
> >>
> >> I have looked at the fix and I can’t figure out why it breaks the code.
> >>
> >> And yes I have tried various other SD cards - they all have the same
> >> issue.
> >>
> >
> > Hi,
> >
> > I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
> > APU2 boards, using compactflash and SD card storage respectively. The
> > problem has appeared as soon as I started testing 11.1-BETA1 from the
> > stable branch.
> >
> > Problem appears when I update the image, using a slightly modified
> > version of the standard nanobsd update and updatep[12] scripts. My
> > changes are not in the dd/gpart commands though, which are the same.
> > gpart seems the most likely candidate though.
> >
> > I have just discovered this thread and I will test reverting r318394
> > soon. Thanks to Peter for narrowing it down!
> >
> > Maybe this is related to having the disks mounted read-only?
> >
>
> I noticed that after the problem appears many commands, including
> shutdown, start failing with "device not configured" for all mounted
> FSes. I'm even unable to "ls /dev".
>
> Looks like the geom refresh changes devices from below the system in a
> way which triggers this reaction.
>
> I don't know the geom code and have been unable to find an immediate
> problem in the commit mentioned above. I'd really like some help to know
> where to look, or what kind of debugging information is needed.
>
> This is quite a bad bug for people running NanoBSD and should be fixed
> before the release.
>

So can I recreate this with the embedded-type NanoBSD image? If this change
breaks NanoBSD, it will need to be reverted...

Warner

Re: vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-22 Thread Guido Falsi
On 06/21/17 16:59, Guido Falsi wrote:
> On 06/13/17 13:44, Peter Blok wrote:
>> Hi,
>>
>> For a while now, I’m not able to build a RPI1-B image from -stable. I have
>> narrowed it down to fix 318394, which adds a refresh option to geom_label. If
>> I undo this fix in today’s stable it works ok. If I don’t I’m getting
>> continuously:
>>
>> vm_fault: pager read error, pid 1 (init)
>> vnode_pager_generic_getpages_done: I/O read error 5
>>
>> I have looked at the fix and I can’t figure out why it breaks the code.
>>
>> And yes I have tried various other SD cards - they all have the same issue.
>>
> 
> Hi,
> 
> I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
> APU2 boards, using compactflash and SD card storage respectively. The
> problem has appeared as soon as I started testing 11.1-BETA1 from the
> stable branch.
> 
> Problem appears when I update the image, using a slightly modified
> version of the standard nanobsd update and updatep[12] scripts. My
> changes are not in the dd/gpart commands though, which are the same.
> gpart seems the most likely candidate though.
> 
> I have just discovered this thread and I will test reverting r318394
> soon. Thanks to Peter for narrowing it down!
> 
> Maybe this is related to having the disks mounted read-only?
> 

I noticed that after the problem appears many commands, including
shutdown, start failing with "device not configured" for all mounted
FSes. I'm even unable to "ls /dev".

Looks like the geom refresh changes devices from below the system in a
way which triggers this reaction.

I don't know the geom code and have been unable to find an immediate
problem in the commit mentioned above. I'd really like some help to know
where to look, or what kind of debugging information is needed.

This is quite a bad bug for people running NanoBSD and should be fixed
before the release.

Thanks in advance!

-- 
Guido Falsi 

vnode_pager_generic_getpages_done: I/O read error 5 caused by r318394 (was Re: FreeBSD 11.1-BETA1 Now Available)

2017-06-21 Thread Guido Falsi
On 06/13/17 13:44, Peter Blok wrote:
> Hi,
> 
> For a while now, I’m not able to build a RPI1-B image from -stable. I have
> narrowed it down to fix 318394, which adds a refresh option to geom_label. If
> I undo this fix in today’s stable it works ok. If I don’t I’m getting
> continuously:
> 
> vm_fault: pager read error, pid 1 (init)
> vnode_pager_generic_getpages_done: I/O read error 5
> 
> I have looked at the fix and I can’t figure out why it breaks the code.
> 
> And yes I have tried various other SD cards - they all have the same issue.
> 

Hi,

I'm seeing similar symptoms with NanoBSD images on PCEngines ALIX and
APU2 boards, using compactflash and SD card storage respectively. The
problem has appeared as soon as I started testing 11.1-BETA1 from the
stable branch.

Problem appears when I update the image, using a slightly modified
version of the standard nanobsd update and updatep[12] scripts. My
changes are not in the dd/gpart commands though, which are the same.
gpart seems the most likely candidate though.

I have just discovered this thread and I will test reverting r318394
soon. Thanks to Peter for narrowing it down!

Maybe this is related to having the disks mounted read-only?

Should I open a bug report to properly track this down?

P.S.  CCing freebsd-fs@

-- 
Guido Falsi 