Re: librt symbol versioning breakage (was Re: Build failure, undefined reference to __mq_oshandle)

2015-12-04 Thread Konstantin Belousov
On Thu, Dec 03, 2015 at 10:34:10PM -0500, Daniel Eischen wrote:
> On Mon, 30 Nov 2015, Konstantin Belousov wrote:
> 
> > On Sun, Nov 29, 2015 at 12:27:40PM -0500, Daniel Eischen wrote:
> >> On Sun, 29 Nov 2015, Konstantin Belousov wrote:
> >>
> >>> On Sun, Nov 29, 2015 at 01:23:04AM -0500, Daniel Eischen wrote:
> 
> >>>> So I found out that sometime in the last year or so, symbol versioning
> >>>> for librt was broken and leaking symbols that shouldn't have been
> >>>> leaked.  I've just committed a fix for this.
> >>>>
> >>>> Do a 'readelf -sw /usr/lib/librt.so.1 | grep GLOBAL | grep -v UND'
> >>>> and see the non FBSD_foo symbols that shouldn't be there.
> >>>
> >>> I did the following on the librt from the HEAD of about month ago:
> >>>
> >>> pooma% ls -l netboot/sandy/usr/lib/librt.so.1
> >>> -r--r--r--  1 root  wheel  23704 Oct 24 23:35 netboot/sandy/usr/lib/librt.so.1
> >>>
> >>> pooma% readelf -sw netboot/sandy/usr/lib/librt.so.1 | grep GLOBAL | grep -v UND | grep -v FBSDpriv | grep FBSD
> >>>97:  0 OBJECT  GLOBAL DEFAULT  ABS FBSD_1.0
> >>>
> >>> But I think that your commit is the good change.
> >>
> >>
> >> Can you check librt.so.1 in your buildworld temporary build
> >> environment?
> >>
> >> $ readelf -sW /usr/obj/usr/FreeBSD/svn/src/tmp/usr/lib/librt.so.1 \
> >>     | grep -v UND | grep GLOBAL
> >>
> >> There has to be a reason that tests/sys/mqueue/mqtest[3-4].c build
> >> without error when they reference __mq_oshandle.  That symbol is
> >> not exported from librt.
> >>
> >> Hmm, looks like libc and libthr are also the same (leaky) in the
> >> temporary build environment (TBE).  So something broke when building
> >> the TBE libraries.
> >>
> >> For r277320 on my system Jan 19, 2015, the TBE libraries must have
> >> been built correctly because mqtests[3-4] failed with unresolved
> >> references to __mq_oshandle.
> >
> > In fact, my command was wrong.  I see that there are indeed a lot of symbols
> > exported which are not versioned (this is where my command was wrong, the
> > grep FBSD part), even for the installed librt.
> 
> Arrgh, this bug also seems to have slipped into 10-stable.
> 
> $ uname -a
> FreeBSD  10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0: Tue Apr  7 
> 01:09:46 UTC 2015 
> r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
> $ readelf -sW /usr/lib/librt.so.1 | grep -v UND | grep GLOBAL \
>     | grep -v FBSD
>     71: 3ac0   432 FUNC    GLOBAL DEFAULT   12 __sigev_alloc
>     ...
> $ readelf -sW /usr/lib/librt.so.1 | grep -v UND | grep GLOBAL \
>     | grep -v FBSD | wc -l
>       22

Yes, I did my check on stable/10 FWIW.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: panic "ffs_checkblk: bad block" on recent -head kernels

2015-12-04 Thread Rick Macklem
Mateusz Guzik wrote:
> On Thu, Dec 03, 2015 at 03:07:48PM -0800, Kirk McKusick wrote:
> > > Date: Thu, 3 Dec 2015 23:47:52 +0100
> > > From: Mateusz Guzik 
> > > To: Rick Macklem 
> > > Cc: FreeBSD Current 
> > > Subject: Re: panic "ffs_checkblk: bad block" on recent -head kernels
> > > 
> > > On Thu, Dec 03, 2015 at 05:08:27PM -0500, Rick Macklem wrote:
> > >> Hi,
> > >> 
> > >> I get a fairly reproducible panic when doing a full kernel build
> > >> on a 256Mbyte single core i386 when running recent kernels from -head.
> > >> 
> > >> The panic is "ffs_checkblk: bad block ..". I don't actually have the
> > >> block # (although I think it's just 0xfff, given the
> > >> backtrace),
> > >> because it runs off the screen. (I looked up the message via the
> > >> debugger
> > >> from the first arg. to panic.)
> > >> 
> > >> Here's the backtrace without all the numbers:
> > >> panic(c14f4b55, , , 0, 64,...)
> > >> ffs_checkblk(, 8000, fff9c, , c4a02454,...)
> > >> ffs_reallocblks
> > >> VOP_REALLOCBLKS_APV
> > >> cluster_write
> > >> ffs_write
> > >> VOP_WRITE_APV
> > >> vn_write
> > >> vn_io_fault_doio
> > >> vn_io_fault1
> > >> vn_io_fault
> > >> dofilewrite
> > >> kern_writev
> > >> sys_write
> > >> syscall
> > >> 
> > >> It doesn't happen on a kernel dated Sep. 30, but does happen on a Nov.
> > >> 30 one.
> > >> (I was away from home, so I didn't upgrade kernels for 2 months.)
> > >> 
> > >> I am slowly doing a binary search for the first kernel rev. where it
> > >> occurs,
> > >> but since each build takes hours, it's going to take a while;-).
> > >> 
> > >> At this point, it doesn't appear to happen on r289278 (just before
> > >> jeff@'s buffer
> > >> cache patch).
> > >> With kernels between r289279-->r290480, I get into the "R" state that
> > >> was fixed by r290481 before I get a crash.
> > >> I tried reverting r289405 and r290047 from a recent kernel and the
> > >> crashes still
> > >> occurred, so it doesn't appear to be these commits.
> > >> 
> > >> I am currently testing r290481 to see if the crash occurs for this rev.
> > >> 
> > >> If anyone has some insight into which commit might cause this,
> > >> please let me know.
> > > 
> > > Well, did it crash with r291460 or later?
> > > 
> > > If so, try the kernel just before that and if that helps, try:
> > > 
> > > diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> > > index ff37de8..0ad6ef7 100644
> > > --- a/sys/kern/vfs_subr.c
> > > +++ b/sys/kern/vfs_subr.c
> > > @@ -2783,6 +2783,7 @@ _vdrop(struct vnode *vp, bool locked)
> > > vp->v_op = NULL;
> > >  #endif
> > > bzero(&vp->v_un, sizeof(vp->v_un));
> > > +   vp->v_lasta = vp->v_clen = vp->v_cstart = vp->v_lastw = 0;
> > > vp->v_iflag = 0;
> > > vp->v_vflag = 0;
> > > bo->bo_flag = 0;
> > > 
> > > --
> > > Mateusz Guzik 
> > 
> > I concur with trying this suggestion. starting with r291460 these
> > fields were no longer zero'ed when allocating the vnode. So you may
> > have some residual values in there that are causing trouble.
> 
Good work. This does seem to fix the problem. Without this patch, the crash
would occur on pretty well every full kernel build attempt.
With the patch, I've done 3 full kernel builds without a crash.

I see the fix has already hit head.

Thanks guys, rick

> I reviewed the rest of the structure, looks like this is the rest of the
> fallout.
> 
> --
> Mateusz Guzik 


Re: panic in arptimer in r289937

2015-12-04 Thread Hans Petter Selasky

On 12/04/15 20:34, Hans Petter Selasky wrote:

Hi Adrian,

On 10/31/15 16:01, Alexander V. Chernikov wrote:



31.10.2015, 16:46, "Adrian Chadd" :

On 31 October 2015 at 09:34, Alexander V. Chernikov
 wrote:

  31.10.2015, 05:32, "Adrian Chadd" :

  Hiya,

  Here's a panic from arptimer:

  Hi Adrian,

  As far as I see, line 205 in if_ether.c is IF_AFDATA_LOCK(ifp)
which happens after LLE_WUNLOCK().
  So, it looks like (pre-cached) ifp had been freed before locking
ifdata.
  Do you have any more details on that? (e.g. was some interface
detached at that moment, is it reproducible, etc..)

  From a quick glance, potential use-after-free has been possible
for quite a long time, but I wonder why it hasn't been observed before.
  Probably lltable_free() changes might have triggered that.

  I'll take a deeper look on that and reply.




Observed on an idle box with projects/hps_head too:


panic: bogus refcnt 0 on lle 0xf8016508ca00
cpuid = 7
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe03e4e8c7e0
vpanic() at vpanic+0x182/frame 0xfe03e4e8c860
kassert_panic() at kassert_panic+0x126/frame 0xfe03e4e8c8d0
llentry_free() at llentry_free+0x136/frame 0xfe03e4e8c900
arptimer() at arptimer+0x20e/frame 0xfe03e4e8c950
softclock_call_cc() at softclock_call_cc+0x170/frame 0xfe03e4e8c9c0
softclock() at softclock+0x47/frame 0xfe03e4e8c9e0
intr_event_execute_handlers() at
intr_event_execute_handlers+0x96/frame 0xfe03e4e8ca20
ithread_loop() at ithread_loop+0xa6/frame 0xfe03e4e8ca70
fork_exit() at fork_exit+0x84/frame 0xfe03e4e8cab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe03e4e8cab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---


Looks like callout_reset() must be examined too, and was missed by:

https://svnweb.freebsd.org/changeset/base/290805

Can you try the attached patch?

Randall: Can you fix this ASAP?

--HPS



Hi,

Randall:

I see for 11-current, callout_reset() doesn't return -1 when the callout 
is stopped like with callout_stop(). Is this a bug or a feature? Why 
can't the callout_reset() and callout_stop() functions use the same 
return values?


In nd6_llinfo_settimer_locked() the return value of both callout_reset() 
and callout_stop() is checked for positive values, but not in the other 
places mentioned by my patch.


--HPS


Re: FYI: SVN to GIT converter currently broken, github is falling behind

2015-12-04 Thread Bryan Drewery
On 12/4/2015 10:49 AM, Ulrich Spörlein wrote:
> 2015-11-08 12:06 GMT+01:00 Ulrich Spörlein :
>> 2015-11-08 11:32 GMT+01:00 Ulrich Spörlein :
>>> 2015-11-08 2:51 GMT+01:00 Alfred Perlstein :
>
>>>> Uli,
>>>>
>>>> One of the biggest concerns I've heard from folks using FreeBSD's git mirror
>>>> is that the hashes can change.
>>>>
>>>> I have a question about this.   Is it possible to keep track of what the
>>>> "official" git mirror (on github) is doing and keep that as a log.  Then
>>>> that log can be used to replay commits when there is a divergence problem.
>>>>
>>>> What I'm basically saying is that let's take this small example:
>>>>
>>>> importer is working fine @rev 1
>>>> imports 1
>>>> imports 10001
>>>> imports 10002
>>>> something happens to importer to give indeterminate shas.
>>>> imports 10003 - sha is "unstable" sha3
>>>> imports 10004 - sha is "unstable" sha4
>>>> imports 10005 - sha is "unstable" sha5
>>>> imports 10006 - sha is "unstable" sha6
>>>> importer is fixed
>>>>
>>>>
>>>> At this point normally we'd rewind the importer to 10002 and then force
>>>> update the affected branches.
>>>>
>>>> My question is... can the imports of 10003, 10004, 10005 and 10006 be put
>>>> into the importer such that any "mirror site" that re-does the import using
>>>> the most up to date importer will get the same shas.
>>>>
>>>> That would allow to proceed with 10007, etc without force pushing.
>>>>
>>>> This should be possible based on querying "git" for the meta data associated
>>>> with sha3..sha6 and then forcing those commits to have the same meta data.
>>>>
>>>> This would eliminate the concern about shas in the mirror changing that I've
>>>> heard.
>>>
>>> The goal of the conversion is that everyone can re-do the conversion
>>> in their basement and come up with the same history and checksums.
>>> This was not the case when I first started, as there was some
>>> non-deterministic hash structure being used in svn2git. This was fixed
>>> in the code and then all converter runs produced the very same
>>> results.
>>>
>>> The scenario that we have right now, is that one of the merge commits
>>> done about two weeks ago is being handled different by svn2git w/ svn
>>> v1.8 vs. svn v1.9 and I haven't investigated yet how the API's
>>> behavior changed to cause this. I'm afraid I also swapped out all my
>>> knowledge about svn2git internals and will have to redo this all from
>>> scratch :/
>>>
>>> Your suggestion could only work, if we hard-code this svn revision
>>> special handling into svn2git, either in the code or by providing more
>>> mappings and rules to the process. svn2git should run hermetic and not
>>> poke at github's commits to see how things were handled in the past.
>>> It has to be self-sufficient and must not depend on github.
>>>
>>> This would also only work, if the "breakage" window was very small,
>>> but it is already about two weeks long and will surely increase till I
>>> find the proper fix.
>>>
>>> So, to take a stand here: this sort of kludge is unlikely to ever
>>> happen. Git commit hashes *might* change in the future. I really don't
>>> see how this is a big deal anyway.  It happened once and I'm trying to
>>> have it never happen again. But why are people afraid of this
>>> happening? Every "official" git commit is tagged with a SVN revision
>>> and the contents of those revisions are obviously correct (just not
>>> the ancestry and the commit objects, possibly). So it would be easy to
>>> write a script that replays VendorA's git history and swaps out the
>>> new official commits for the old official commits. There would be no
>>> merge conflicts.
>>>
>>> I can see how this would be annoying if you have 100 developers and
>>> dozens of branches that are far from mainline FreeBSD. But I'm sure
>>> these companies that depend on git will come forward and donate some
>>> of their developer manpower to help me with keeping the converter
>>> stable/deterministic. Right? Right? :) :)
>>>
>>> Cheers,
>>> Uli
>>
>> Quick update: doc is so far unaffected by svn 1.9, but for ports, the
>> drift happened as of Jul 18, so you'd need to special case a lot of
>> commits.
>>
>> Here's the same commit, and the difference between 1.8 and 1.9:
>>
>> % git cat-file commit 803795d
>> tree 7fc83aba022834da5c218114b09ad4640735bcc0
>> parent c96fb0418e545a569b5975b4d878a30a948c29d5
>> author olgeni  1437203525 +
>> committer olgeni  1437203525 +
>>
>> Upgrade to version 0.4.1.
>> % git cat-file commit 61ca43b
>> tree 7fc83aba022834da5c218114b09ad4640735bcc0
>> parent c96fb0418e545a569b5975b4d878a30a948c29d5
>> author olgeni  1437203529 +
>> committer olgeni  1437203529 +
>>
>> Upgrade to version 0.4.1.
>>
>>
>> In case you don't see it, there's a 4s difference in the timestamps
>> for authoring and committing. 

Re: FYI: SVN to GIT converter currently broken, github is falling behind

2015-12-04 Thread Ulrich Spörlein
2015-11-08 12:06 GMT+01:00 Ulrich Spörlein :
> 2015-11-08 11:32 GMT+01:00 Ulrich Spörlein :
>> 2015-11-08 2:51 GMT+01:00 Alfred Perlstein :

>>> Uli,
>>>
>>> One of the biggest concerns I've heard from folks using FreeBSD's git mirror
>>> is that the hashes can change.
>>>
>>> I have a question about this.   Is it possible to keep track of what the
>>> "official" git mirror (on github) is doing and keep that as a log.  Then
>>> that log can be used to replay commits when there is a divergence problem.
>>>
>>> What I'm basically saying is that let's take this small example:
>>>
>>> importer is working fine @rev 1
>>> imports 1
>>> imports 10001
>>> imports 10002
>>> something happens to importer to give indeterminate shas.
>>> imports 10003 - sha is "unstable" sha3
>>> imports 10004 - sha is "unstable" sha4
>>> imports 10005 - sha is "unstable" sha5
>>> imports 10006 - sha is "unstable" sha6
>>> importer is fixed
>>>
>>>
>>> At this point normally we'd rewind the importer to 10002 and then force
>>> update the affected branches.
>>>
>>> My question is... can the imports of 10003, 10004, 10005 and 10006 be put
>>> into the importer such that any "mirror site" that re-does the import using
>>> the most up to date importer will get the same shas.
>>>
>>> That would allow to proceed with 10007, etc without force pushing.
>>>
>>> This should be possible based on querying "git" for the meta data associated
>>> with sha3..sha6 and then forcing those commits to have the same meta data.
>>>
>>> This would eliminate the concern about shas in the mirror changing that I've
>>> heard.
>>
>> The goal of the conversion is that everyone can re-do the conversion
>> in their basement and come up with the same history and checksums.
>> This was not the case when I first started, as there was some
>> non-deterministic hash structure being used in svn2git. This was fixed
>> in the code and then all converter runs produced the very same
>> results.
>>
>> The scenario that we have right now, is that one of the merge commits
>> done about two weeks ago is being handled different by svn2git w/ svn
>> v1.8 vs. svn v1.9 and I haven't investigated yet how the API's
>> behavior changed to cause this. I'm afraid I also swapped out all my
>> knowledge about svn2git internals and will have to redo this all from
>> scratch :/
>>
>> Your suggestion could only work, if we hard-code this svn revision
>> special handling into svn2git, either in the code or by providing more
>> mappings and rules to the process. svn2git should run hermetic and not
>> poke at github's commits to see how things were handled in the past.
>> It has to be self-sufficient and must not depend on github.
>>
>> This would also only work, if the "breakage" window was very small,
>> but it is already about two weeks long and will surely increase till I
>> find the proper fix.
>>
>> So, to take a stand here: this sort of kludge is unlikely to ever
>> happen. Git commit hashes *might* change in the future. I really don't
>> see how this is a big deal anyway.  It happened once and I'm trying to
>> have it never happen again. But why are people afraid of this
>> happening? Every "official" git commit is tagged with a SVN revision
>> and the contents of those revisions are obviously correct (just not
>> the ancestry and the commit objects, possibly). So it would be easy to
>> write a script that replays VendorA's git history and swaps out the
>> new official commits for the old official commits. There would be no
>> merge conflicts.
>>
>> I can see how this would be annoying if you have 100 developers and
>> dozens of branches that are far from mainline FreeBSD. But I'm sure
>> these companies that depend on git will come forward and donate some
>> of their developer manpower to help me with keeping the converter
>> stable/deterministic. Right? Right? :) :)
>>
>> Cheers,
>> Uli
>
> Quick update: doc is so far unaffected by svn 1.9, but for ports, the
> drift happened as of Jul 18, so you'd need to special case a lot of
> commits.
>
> Here's the same commit, and the difference between 1.8 and 1.9:
>
> % git cat-file commit 803795d
> tree 7fc83aba022834da5c218114b09ad4640735bcc0
> parent c96fb0418e545a569b5975b4d878a30a948c29d5
> author olgeni  1437203525 +
> committer olgeni  1437203525 +
>
> Upgrade to version 0.4.1.
> % git cat-file commit 61ca43b
> tree 7fc83aba022834da5c218114b09ad4640735bcc0
> parent c96fb0418e545a569b5975b4d878a30a948c29d5
> author olgeni  1437203529 +
> committer olgeni  1437203529 +
>
> Upgrade to version 0.4.1.
>
>
> In case you don't see it, there's a 4s difference in the timestamps
> for authoring and committing. Here's the original:
>
> % svn log -vc392405 svn://svn.freebsd.org/ports
> 
> r392405 | olgeni | 

Re: panic in arptimer in r289937

2015-12-04 Thread Hans Petter Selasky

Hi Adrian,

On 10/31/15 16:01, Alexander V. Chernikov wrote:



31.10.2015, 16:46, "Adrian Chadd" :

On 31 October 2015 at 09:34, Alexander V. Chernikov
 wrote:

  31.10.2015, 05:32, "Adrian Chadd" :

  Hiya,

  Here's a panic from arptimer:

  Hi Adrian,

  As far as I see, line 205 in if_ether.c is IF_AFDATA_LOCK(ifp) which happens 
after LLE_WUNLOCK().
  So, it looks like (pre-cached) ifp had been freed before locking ifdata.
  Do you have any more details on that? (e.g. was some interface detached at 
that moment, is it reproducible, etc..)

  From a quick glance, potential use-after-free has been possible for quite a 
long time, but I wonder why it hasn't been observed before.
  Probably lltable_free() changes might have triggered that.

  I'll take a deeper look on that and reply.




Observed on an idle box with projects/hps_head too:


panic: bogus refcnt 0 on lle 0xf8016508ca00
cpuid = 7
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe03e4e8c7e0
vpanic() at vpanic+0x182/frame 0xfe03e4e8c860
kassert_panic() at kassert_panic+0x126/frame 0xfe03e4e8c8d0
llentry_free() at llentry_free+0x136/frame 0xfe03e4e8c900
arptimer() at arptimer+0x20e/frame 0xfe03e4e8c950
softclock_call_cc() at softclock_call_cc+0x170/frame 0xfe03e4e8c9c0
softclock() at softclock+0x47/frame 0xfe03e4e8c9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 
0xfe03e4e8ca20
ithread_loop() at ithread_loop+0xa6/frame 0xfe03e4e8ca70
fork_exit() at fork_exit+0x84/frame 0xfe03e4e8cab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe03e4e8cab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---


Looks like callout_reset() must be examined too, and was missed by:

https://svnweb.freebsd.org/changeset/base/290805

Can you try the attached patch?

Randall: Can you fix this ASAP?

--HPS
diff --git a/sys/dev/oce/oce_if.c b/sys/dev/oce/oce_if.c
index 826cd3c..1cca876 100644
--- a/sys/dev/oce/oce_if.c
+++ b/sys/dev/oce/oce_if.c
@@ -343,7 +343,7 @@ oce_attach(device_t dev)
 
 	callout_init(&sc->timer, 1);
 	rc = callout_reset(&sc->timer, 2 * hz, oce_local_timer, sc);
-	if (rc)
+	if (rc > 0)
 		goto stats_free;
 
 	return 0;
diff --git a/sys/netinet/if_ether.c b/sys/netinet/if_ether.c
index dfba0b2..0aec1c4 100644
--- a/sys/netinet/if_ether.c
+++ b/sys/netinet/if_ether.c
@@ -420,7 +420,7 @@ arpresolve_full(struct ifnet *ifp, int is_gw, int create, struct mbuf *m,
 		la->la_expire = time_uptime;
 		canceled = callout_reset(&la->lle_timer, hz * V_arpt_down,
 		arptimer, la);
-		if (canceled)
+		if (canceled > 0)
 			LLE_REMREF(la);
 		la->la_asked++;
 		LLE_WUNLOCK(la);
@@ -1084,7 +1084,7 @@ arp_mark_lle_reachable(struct llentry *la)
 		la->la_expire = time_uptime + V_arpt_keep;
 		canceled = callout_reset(&la->lle_timer,
 		hz * V_arpt_keep, arptimer, la);
-		if (canceled)
+		if (canceled > 0)
 			LLE_REMREF(la);
 	}
 	la->la_asked = 0;