Re: rpi2 hangup during poudriere build: lots of pfault wmseg status

2017-12-06 Thread Mark Millard
> On 2017-Dec-6, at 5:47 PM, Laurent Cimon  wrote:
> 
>> On Dec 6, 2017, at 20:01, Mark Millard  wrote:
>> 
>> On 2017-Dec-6, at 1:54 PM, Laurent Cimon  wrote:
>> 
>>>> On Dec 6, 2017, at 00:57, Mark Millard wrote:
>>>>
>>>> I tried to build some ports on a rpi2
>>>> (via poudriere) but it hung up:
>>>> Ethernet and normal console use. (Note:
>>>> the root file system is on a USB SSD
>>>> and the swap partition is also on that
>>>> USB SSD.)
>>>>
>>>> But ~^b worked for getting to the db>
>>>> prompt on the console.
>>>>
>>>> From there a ps suggests that it got hung
>>>> up in pfault activity. (Possibly insufficient
>>>> RAM+swap-partition space?) But it is not
>>>> clear to me that it should end up hung up
>>>> vs. killing processes or other such.
>>> 
>>> Hi,
>>> 
>>> From what I know, the Raspberry Pis use the same controller for Ethernet
>>> and the USB hub on which you’re hosting an SSD. It seems like you make
>>> very heavy use of the USB ports, and all of the resources used by
>>> poudriere, except for the CPU and the (very limited) memory that’s not
>>> in swap, are attached to them. If you really didn’t have enough memory
>>> and swap, the linkers would’ve been stopped.
>>>
>>> I think it might just be a swap death. Poudriere compiles and fetches in
>>> parallel a lot, and Ethernet and disk I/O are slow because they’re very
>>> limited, so linking takes longer. You end up linking a few very big
>>> binaries at the same time, and they all fight for memory, trying to get
>>> out of swap through page faults, but there are too many page faults, all
>>> too big, requesting more CPU time than is allowed to them.
>>>
>>> This would explain why you have 3 linkers waiting on a page fault out of
>>> the 4 CPUs poudriere allows builds on, on top of the awk processes. It
>>> would also explain why you had easy access to the debugger: it was
>>> already in memory with the kernel.
>>>
>>> I’d advise you to disable parallel builds and see if it happens again,
>>> but it would make building much slower. Using makejobs would help if you
>>> can afford watching the build. Otherwise be patient: it should resolve
>>> itself eventually, but it will take a while and it will happen again.
>> 
>> My post was more about how FreeBSD handled the
>> heavy-use context and less about getting the
>> builds to finish: it managed to get to a
>> state of no-progress for processes and a loss
>> of normal control as far as I could tell.
>> 
>> I did a "c" to ddb and left it until just before
>> this note then did ~ ^B again. Things looked the
>> same. [I've finally rebooted the rpi2.]
>> 
>> PARALLEL_JOBS=1 was already in use but
>> ALLOW_MAKE_JOBS=yes was also in use.
>> USE_TMPFS=no was already in use.
>> 
>> While an ssh session was monitoring the
>> build, Ethernet was not in heavy use.
>> (No nfs mounts to its disks, for example.)
>> 
>> I may try without ALLOW_MAKE_JOBS=yes and
>> with ALLOW_MAKE_JOBS_PACKAGES empty/undefined
>> to see if it can complete for such a context
>> without having the same sort of problem.
>> 
>> Ultimately I can cross-build and install from
>> those materials when I really want updates. I
>> have the context for such. This was more about
>> seeing how well the rpi2 did for self-hosted.
>> Classically I've used a BPI-M3 with 2 GiBytes
>> of RAM and a proportionally bigger swap partition
>> instead (approximately).
>> 
>> 
>> FYI (rpi2 after rebooting):
>> 
>> # swapinfo
>> Device               1K-blocks     Used    Avail Capacity
>> /dev/label/RPI2swap    1572860        0  1572860       0%
>> 
>> # df -m
>> Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
>> /dev/ufs/RPI2rootfs     195378 30791 148957      17%  /
>> devfs                        0     0      0     100%  /dev
>> /dev/label/RPI2Aboot        49    12     37      25%  /boot/msdos
>> 
>> 
>> An rpi3 (aarch64) with the same amount of RAM,
>> same type of USB SSD, etc., but well more swap
>> completed building basically the same set of
>> ports for the same poudriere settings just
>> fine.
>> 
>> Interestingly for the default kern.maxswzone:
>> (Just to show the reported recommended maximum
>> figures for swap.)
>> 
>> rpi2: . . . exceeds maximum recommended amount (411488 pages).
>> rpi3: . . . exceeds maximum recommended amount (925680 pages).
>> 
>> (I was running with somewhat under those maximums for
>> the tests.)
>> 
>> # swapinfo
>> Device               1K-blocks     Used    Avail Capacity
>> /dev/gpt/RPI3swap      3702784        0  3702784       0%
>> 
>> # df -m
>> Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
>> /dev/ufs/RPI3rootfs     195378 14937 164811       8%  /
>> devfs                        0     0      0     100%  /dev
>> /dev/label/RPI3Aboot        49     7     42      15%  /boot/efi
>> 
>> If I restricted the rpi3 to somewhat under what the
>> rpi2 allows for swap, I do not know if it would also
>> hang up vs. not.
>> 
>> If having more swap makes the differe

Re: rpi2 hangup during poudriere build: lots of pfault wmseg status

2017-12-06 Thread Laurent Cimon
> On Dec 6, 2017, at 20:01, Mark Millard  wrote:
> 
> On 2017-Dec-6, at 1:54 PM, Laurent Cimon  wrote:
> 
>>> On Dec 6, 2017, at 00:57, Mark Millard  wrote:
>>> 
>>> I tried to build some ports on a rpi2
>>> (via poudriere) but it hung up:
>>> Ethernet and normal console use. (Note:
>>> the root file system is on a USB SSD
>>> and the swap partition is also on that
>>> USB SSD.)
>>> 
>>> But ~^b worked for getting to the db>
>>> prompt on the console.
>>> 
>>> From there a ps suggests that it got hung
>>> up in pfault activity. (Possibly insufficient
>>> RAM+swap-partition space?) But it is not
>>> clear to me that it should end up hung up
>>> vs. killing processes or other such.
>> 
>> Hi,
>> 
>> From what I know, the Raspberry Pis use the same controller for Ethernet
>> and the USB hub on which you’re hosting an SSD. It seems like you make
>> very heavy use of the USB ports, and all of the resources used by
>> poudriere, except for the CPU and the (very limited) memory that’s not
>> in swap, are attached to them. If you really didn’t have enough memory
>> and swap, the linkers would’ve been stopped.
>>
>> I think it might just be a swap death. Poudriere compiles and fetches in
>> parallel a lot, and Ethernet and disk I/O are slow because they’re very
>> limited, so linking takes longer. You end up linking a few very big
>> binaries at the same time, and they all fight for memory, trying to get
>> out of swap through page faults, but there are too many page faults, all
>> too big, requesting more CPU time than is allowed to them.
>>
>> This would explain why you have 3 linkers waiting on a page fault out of
>> the 4 CPUs poudriere allows builds on, on top of the awk processes. It
>> would also explain why you had easy access to the debugger: it was
>> already in memory with the kernel.
>>
>> I’d advise you to disable parallel builds and see if it happens again,
>> but it would make building much slower. Using makejobs would help if you
>> can afford watching the build. Otherwise be patient: it should resolve
>> itself eventually, but it will take a while and it will happen again.
> 
> My post was more about how FreeBSD handled the
> heavy-use context and less about getting the
> builds to finish: it managed to get to a
> state of no-progress for processes and a loss
> of normal control as far as I could tell.
> 
> I did a "c" to ddb and left it until just before
> this note then did ~ ^B again. Things looked the
> same. [I've finally rebooted the rpi2.]
> 
> PARALLEL_JOBS=1 was already in use but
> ALLOW_MAKE_JOBS=yes was also in use.
> USE_TMPFS=no was already in use.
> 
> While an ssh session was monitoring the
> build, Ethernet was not in heavy use.
> (No nfs mounts to its disks, for example.)
> 
> I may try without ALLOW_MAKE_JOBS=yes and
> with ALLOW_MAKE_JOBS_PACKAGES empty/undefined
> to see if it can complete for such a context
> without having the same sort of problem.
> 
> Ultimately I can cross-build and install from
> those materials when I really want updates. I
> have the context for such. This was more about
> seeing how well the rpi2 did for self-hosted.
> Classically I've used a BPI-M3 with 2 GiBytes
> of RAM and a proportionally bigger swap partition
> instead (approximately).
> 
> 
> FYI (rpi2 after rebooting):
> 
> # swapinfo
> Device               1K-blocks     Used    Avail Capacity
> /dev/label/RPI2swap    1572860        0  1572860       0%
> 
> # df -m
> Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
> /dev/ufs/RPI2rootfs     195378 30791 148957      17%  /
> devfs                        0     0      0     100%  /dev
> /dev/label/RPI2Aboot        49    12     37      25%  /boot/msdos
> 
> 
> An rpi3 (aarch64) with the same amount of RAM,
> same type of USB SSD, etc., but well more swap
> completed building basically the same set of
> ports for the same poudriere settings just
> fine.
> 
> Interestingly for the default kern.maxswzone:
> (Just to show the reported recommended maximum
> figures for swap.)
> 
> rpi2: . . . exceeds maximum recommended amount (411488 pages).
> rpi3: . . . exceeds maximum recommended amount (925680 pages).
> 
> (I was running with somewhat under those maximums for
> the tests.)
> 
> # swapinfo
> Device               1K-blocks     Used    Avail Capacity
> /dev/gpt/RPI3swap      3702784        0  3702784       0%
> 
> # df -m
> Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
> /dev/ufs/RPI3rootfs     195378 14937 164811       8%  /
> devfs                        0     0      0     100%  /dev
> /dev/label/RPI3Aboot        49     7     42      15%  /boot/efi
> 
> If I restricted the rpi3 to somewhat under what the
> rpi2 allows for swap, I do not know if it would also
> hang up vs. not.
> 
> If having more swap makes the difference, then it
> would not seem to be being I/O-bound that would
> explain the hangup.
> 
> 
> ===
> Mark Millard
> markmi at dsl-only.net

There are a few factors that could have prevented this on your ra

Re: rpi2 hangup during poudriere build: lots of pfault wmseg status

2017-12-06 Thread Mark Millard
On 2017-Dec-6, at 1:54 PM, Laurent Cimon  wrote:

>> On Dec 6, 2017, at 00:57, Mark Millard  wrote:
>> 
>> I tried to build some ports on a rpi2
>> (via poudriere) but it hung up:
>> Ethernet and normal console use. (Note:
>> the root file system is on a USB SSD
>> and the swap partition is also on that
>> USB SSD.)
>> 
>> But ~^b worked for getting to the db>
>> prompt on the console.
>> 
>> From there a ps suggests that it got hung
>> up in pfault activity. (Possibly insufficient
>> RAM+swap-partition space?) But it is not
>> clear to me that it should end up hung up
>> vs. killing processes or other such.
> 
> Hi,
> 
> From what I know, the Raspberry Pis use the same controller for Ethernet
> and the USB hub on which you’re hosting an SSD. It seems like you make
> very heavy use of the USB ports, and all of the resources used by
> poudriere, except for the CPU and the (very limited) memory that’s not
> in swap, are attached to them. If you really didn’t have enough memory
> and swap, the linkers would’ve been stopped.
>
> I think it might just be a swap death. Poudriere compiles and fetches in
> parallel a lot, and Ethernet and disk I/O are slow because they’re very
> limited, so linking takes longer. You end up linking a few very big
> binaries at the same time, and they all fight for memory, trying to get
> out of swap through page faults, but there are too many page faults, all
> too big, requesting more CPU time than is allowed to them.
>
> This would explain why you have 3 linkers waiting on a page fault out of
> the 4 CPUs poudriere allows builds on, on top of the awk processes. It
> would also explain why you had easy access to the debugger: it was
> already in memory with the kernel.
>
> I’d advise you to disable parallel builds and see if it happens again,
> but it would make building much slower. Using makejobs would help if you
> can afford watching the build. Otherwise be patient: it should resolve
> itself eventually, but it will take a while and it will happen again.

My post was more about how FreeBSD handled the
heavy-use context and less about getting the
builds to finish: it managed to get to a
state of no-progress for processes and a loss
of normal control as far as I could tell.

I did a "c" to ddb and left it until just before
this note then did ~ ^B again. Things looked the
same. [I've finally rebooted the rpi2.]

PARALLEL_JOBS=1 was already in use but
ALLOW_MAKE_JOBS=yes was also in use.
USE_TMPFS=no was already in use.

While an ssh session was monitoring the
build, Ethernet was not in heavy use.
(No nfs mounts to its disks, for example.)

I may try without ALLOW_MAKE_JOBS=yes and
with ALLOW_MAKE_JOBS_PACKAGES empty/undefined
to see if it can complete for such a context
without having the same sort of problem.
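
For reference, the settings mentioned above could be sketched as the
following poudriere.conf lines. This is only an illustration built from
the values reported in this message; the file location (typically
/usr/local/etc/poudriere.conf) and the comments are assumptions, not a
copy of the actual configuration in use.

```shell
# Sketch of the relevant poudriere.conf lines (the file uses shell
# variable-assignment syntax). Values are those reported in this message.
PARALLEL_JOBS=1        # build one port at a time
USE_TMPFS=no           # keep build work areas off RAM-backed tmpfs
ALLOW_MAKE_JOBS=yes    # ports may still run parallel make jobs

# Variant under consideration: comment out ALLOW_MAKE_JOBS above and
# leave ALLOW_MAKE_JOBS_PACKAGES unset, so no port is allowed
# parallel make jobs.
```

With the variant, the linkers and compilers for a single port would no
longer compete with each other for RAM, at the cost of longer builds.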

Ultimately I can cross-build and install from
those materials when I really want updates. I
have the context for such. This was more about
seeing how well the rpi2 did for self-hosted.
Classically I've used a BPI-M3 with 2 GiBytes
of RAM and a proportionally bigger swap partition
instead (approximately).


FYI (rpi2 after rebooting):

# swapinfo
Device               1K-blocks     Used    Avail Capacity
/dev/label/RPI2swap    1572860        0  1572860       0%

# df -m
Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
/dev/ufs/RPI2rootfs     195378 30791 148957      17%  /
devfs                        0     0      0     100%  /dev
/dev/label/RPI2Aboot        49    12     37      25%  /boot/msdos


An rpi3 (aarch64) with the same amount of RAM,
same type of USB SSD, etc., but well more swap
completed building basically the same set of
ports for the same poudriere settings just
fine.

Interestingly for the default kern.maxswzone:
(Just to show the reported recommended maximum
figures for swap.)

rpi2: . . . exceeds maximum recommended amount (411488 pages).
rpi3: . . . exceeds maximum recommended amount (925680 pages).

(I was running with somewhat under those maximums for
the tests.)
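
To put those page counts in the same units that swapinfo reports, a
small conversion sketch follows. It assumes 4 KiB pages, which is an
assumption here rather than something stated in this thread (checking
`sysctl hw.pagesize` on the machine would confirm it); the helper name
pages_to_kib is made up for the illustration.

```shell
# Convert a page count to KiB, assuming 4 KiB pages (an assumption;
# confirm with `sysctl hw.pagesize` on the actual machine).
PAGE_KIB=4

# Hypothetical helper: print the KiB equivalent of a page count.
pages_to_kib() {
    echo $(( $1 * PAGE_KIB ))
}

echo "rpi2: $(pages_to_kib 411488) KiB"   # 1645952 KiB
echo "rpi3: $(pages_to_kib 925680) KiB"   # 3702720 KiB
```

The results are in 1K-blocks, so they can be read side by side with the
swapinfo listings in this message.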

# swapinfo
Device               1K-blocks     Used    Avail Capacity
/dev/gpt/RPI3swap      3702784        0  3702784       0%

# df -m
Filesystem           1M-blocks  Used  Avail Capacity  Mounted on
/dev/ufs/RPI3rootfs     195378 14937 164811       8%  /
devfs                        0     0      0     100%  /dev
/dev/label/RPI3Aboot        49     7     42      15%  /boot/efi

If I restricted the rpi3 to somewhat under what the
rpi2 allows for swap, I do not know if it would also
hang up vs. not.

If having more swap makes the difference, then it
would not seem to be being I/O-bound that would
explain the hangup.


===
Mark Millard
markmi at dsl-only.net

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: rpi2 hangup during poudriere build: lots of pfault wmseg status

2017-12-06 Thread Laurent Cimon
> On Dec 6, 2017, at 00:57, Mark Millard  wrote:
> 
> I tried to build some ports on a rpi2
> (via poudriere) but it hung up:
> Ethernet and normal console use. (Note:
> the root file system is on a USB SSD
> and the swap partition is also on that
> USB SSD.)
> 
> But ~^b worked for getting to the db>
> prompt on the console.
> 
> From there a ps suggests that it got hung
> up in pfault activity. (Possibly insufficient
> RAM+swap-partition space?) But it is not
> clear to me that it should end up hung up
> vs. killing processes or other such.

Hi,

From what I know, the Raspberry Pis use the same controller for Ethernet
and the USB hub on which you’re hosting an SSD. It seems like you make
very heavy use of the USB ports, and all of the resources used by
poudriere, except for the CPU and the (very limited) memory that’s not
in swap, are attached to them. If you really didn’t have enough memory
and swap, the linkers would’ve been stopped.

I think it might just be a swap death. Poudriere compiles and fetches in
parallel a lot, and Ethernet and disk I/O are slow because they’re very
limited, so linking takes longer. You end up linking a few very big
binaries at the same time, and they all fight for memory, trying to get
out of swap through page faults, but there are too many page faults, all
too big, requesting more CPU time than is allowed to them.

This would explain why you have 3 linkers waiting on a page fault out of
the 4 CPUs poudriere allows builds on, on top of the awk processes. It
would also explain why you had easy access to the debugger: it was
already in memory with the kernel.

I’d advise you to disable parallel builds and see if it happens again,
but it would make building much slower. Using makejobs would help if you
can afford watching the build. Otherwise be patient: it should resolve
itself eventually, but it will take a while and it will happen again.

Good luck,

Laurent