Replying to myself again: I doubled kern.bio_transient_maxcnt once more (default 180, failed doubling 360, new value 720), and the machine was able to successfully "for i in `jot 10`; do make -j4 buildkernel; done" ...
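
For anyone who wants to repeat the test, here is a minimal sketch of that loop as a plain sh function that stops at the first failing run and reports which run it was. The name repeat_until_fail is hypothetical, and the helper is only an illustration of the procedure, not the exact command line used on the affected machine.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and stop at the first failure.
# Usage: repeat_until_fail COUNT COMMAND [ARGS...]
repeat_until_fail() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "run $i of $count: $*"
        if ! "$@"; then
            # Report the failing iteration on stderr and give up.
            echo "failed on run $i" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count runs succeeded"
}

# On the machine under test (from /usr/src), the stress run would be:
# repeat_until_fail 10 make -j4 buildkernel
```

Since kern.bio_transient_maxcnt is a loader tunable, the value under test would go into /boot/loader.conf (for example kern.bio_transient_maxcnt="720") and take effect after a reboot.
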
But doesn't this mean that we still have a resource exhaustion to worry
about? Isn't this just another race waiting for the right set of
conditions?

On Tue, Sep 3, 2013 at 11:06 AM, Zaphod Beeblebrox <[email protected]> wrote:

> Since there weren't any more ideas here, I tried turning off
> hyper-threading. This is an old pentium-D type CPU --- that is: one core
> with HT. I'm wondering if the HT nature is helping this resource
> exhaustion, so I turned off HT (basically making this a single-threaded
> CPU) and it seems to have made the problem go away.
>
> That is not to say that the problem is fixed: it simply means that
> replication may be tied to multiple CPUs and/or the allocation of resources
> by an HT CPU core.
>
>
> On Mon, Sep 2, 2013 at 3:53 AM, Zaphod Beeblebrox <[email protected]> wrote:
>
>> The first one (kern.geom.transient_map_retries) causes the system to
>> wedge.
>>
>> The second one (default is 180, I doubled to 360) causes the system to
>> crash but not dump.
>>
>> So... neither fixes the problem.
>>
>>
>> On Sat, Aug 31, 2013 at 5:27 AM, Edward Tomasz Napierała <
>> [email protected]> wrote:
>>
>>> Message written by Zaphod Beeblebrox <[email protected]> on
>>> 31 Aug 2013, at 00:49:
>>> > Because someone said that there would be no logging of underlying ATA
>>> > errors without verbose, I rebooted with verbose and tried the same
>>> > make -j4 again... and here is the relatively similar core.txt.5
>>> >
>>> > https://uk.eicat.ca/owncloud/public.php?service=files&t=d99648ef5876b91c5957148445e60c87
>>> >
>>> > Looking at it, gmirror is dropping the same error and the underlying
>>> > hardware is not causing the error...
>>>
>>> Let me quote Konstantin:
>>>
>>> > It is either an exhaustion of the transient map, or a deadlock.
>>> > For the first, setting kern.geom.transient_map_retries to 0 could help.
>>> > For the second, the count of the transient buffers must be increased,
>>> > by kern.bio_transient_maxcnt loader tunable.
>>>
>>> Could you try both and tell which one of them fixed the problem? Thanks!

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
