Re: [CURRENT]: weird memory/linker problem?

2014-07-17 Thread O. Hartmann
Am Tue, 01 Jul 2014 17:23:14 +0200
Willem Jan Withagen  schrieb:

> On 2014-07-01 16:48, Rang, Anton wrote:
> > DOT => DOD
> >
> > 444F54 => 444F44
> >
> > That's a single-bit flip.  Bad memory, perhaps?
> 
> Very likely, especially if the system does not have ECC.
> It just happens on rare occasions that an alpha particle, a power cycle, or 
> anything else disruptive damages a memory cell. And it could be that 
> it requires a special pattern of accesses to actually exhibit the error.
> 
> In the past (the 1990s) 'make buildworld' used to be a rather good memory 
> tester. But nowadays look at
>   http://www.memtest.org/
> 
> This tool has found all of the bad memory in all the systems I have used or 
> built for others...
> Note that it might take a few runs and some more heat to actually 
> trigger the faulty cell, but memtest86 will usually find it.
> 
> Note that on big systems with lots of memory it can take a long 
> time to run just one full test set to completion.
> 
> --WjW
> 
> 
> >
> > Anton
> >
> > -Original Message-
> > From: owner-freebsd-curr...@freebsd.org 
> > [mailto:owner-freebsd-curr...@freebsd.org] On
> > Behalf Of O. Hartmann Sent: Tuesday, July 01, 2014 8:08 AM
> > To: Dimitry Andric
> > Cc: Adrian Chadd; FreeBSD CURRENT
> > Subject: Re: [CURRENT]: weird memory/linker problem?
> >
> > Am Mon, 23 Jun 2014 17:22:25 +0200
> > Dimitry Andric  schrieb:
> >
> >> On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> >>> Am Sun, 22 Jun 2014 10:10:04 -0700
> >>> Adrian Chadd  schrieb:
> >>>> When they segfault, where do they segfault?
> >> ...
> >>> GIMP, LaTeX work, nothing special, but a bit memory consuming
> >>> regarding GIMP) I tried updating the ports tree and surprisingly the
> >>> tree was left in an unclean condition when /usr/bin/svn segfaulted
> >>> (on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> >>>
> >>> /usr/local/bin/svn, which is from the devel/subversion port,
> >>> works well, while the svn in FreeBSD 11's base system dies as
> >>> described. It did not a few hours ago!
> >>
> >> I think what Adrian meant was: can you run svn (or another crashing
> >> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
> >> where it dies?
> >>
> >> Alternatively, put a core dump and the executable (with debug info) in
> >> a tarball, and upload it somewhere, so somebody else can analyze it.
> >>
> >> -Dimitry
> >>
> >
> > It's me again, with the same weird story.
> >
> > After a couple of days of silence, the mysterious entity in my computer is
> > back. This time it is again a weird compiler error (while trying to
> > buildworld):
> >
> > [...]
> > c++  -O2 -pipe -O3 -O3
> > c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
> > -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
> > -D__STDC_CONSTANT_MACROS
> > -fno-strict-aliasing 
> > -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
> > -DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
> > -Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11
> > -fno-exceptions -fno-rtti -Wno-c++11-extensions
> > -c 
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp
> >  -o
> > Host.o --- GraphWriter.o --- In file included
> > from 
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
> >  
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10:
> > error: use of undeclared identifier 'DOD'; did you mean 'DOT'? O <<
> > DOD::EscapeString(Label); ^~~
> > DOT 
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11:
> > note: 'DOT' declared here namespace DOT {  // Private functions... ^ 1 error
> > generated. *** [GraphWriter.o] Error code 1
> >
> >
> > Well, in the past I saw many of those messages, especially not found labels 
> > of
> > routines in s

Re: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread O. Hartmann
Am Tue, 01 Jul 2014 17:57:26 +0200
Willem Jan Withagen  schrieb:

> On 2014-07-01 17:33, O. Hartmann wrote:
> > Am Tue, 01 Jul 2014 17:23:14 +0200
> > Willem Jan Withagen  schrieb:
> >
> >> On 2014-07-01 16:48, Rang, Anton wrote:
> >>> DOT => DOD
> >>>
> >>> 444F54 => 444F44
> >>>
> >>> That's a single-bit flip.  Bad memory, perhaps?
> >>
> >> Very likely, especially if the system does not have ECC.
> >> It just happens on rare occasions that an alpha particle, a power cycle, or
> >> anything else disruptive damages a memory cell. And it could be that
> >> it requires a special pattern of accesses to actually exhibit the error.
> >>
> >> In the past (the 1990s) 'make buildworld' used to be a rather good memory
> >> tester. But nowadays look at
> >>   http://www.memtest.org/
> >>
> >> This tool has found all of the bad memory in all the systems I have used or
> >> built for others...
> >> Note that it might take a few runs and some more heat to actually
> >> trigger the faulty cell, but memtest86 will usually find it.
> >>
> >> Note that on big systems with lots of memory it can take a long
> >> time to run just one full test set to completion.
> >>
> >> --WjW
> >
> > I already tested with memtest86+ (I had to download the Linux image, since
> > the FreeBSD port is broken on CURRENT). It didn't find anything strange so
> > far.
> >
> > I will do another test.
> >
> > I realised that on that specific box, the chipset temperature is 81
> > degrees Celsius. The chipset is an Eaglelake P45, which on that old
> > platform contains the memory controller. dmidecode gives:
> >
> >  Manufacturer: ASUSTeK Computer INC.
> >  Product Name: P5Q-WS
> >  Version: Rev 1.xx
>


Hello Willem,

 
> Hi Oliver,
> 
> I've built several (5+) systems with these boards (from memory they date 
> from around 2009??). And if I recall right, one of them is still functional. 
> The first one broke down within a couple of weeks, and the others did not 
> survive either.
> 
> The auxiliary chips on that board do run hot, but I never realized they ran 
> this hot. Is 81C the CPU temp from sysctl, or did you measure the heat 
> sink on the motherboard? In the latter case it is probably just too hot.
> But even if it is the temp on the chip itself, I've rarely seen temps 
> go up this high.

The temperature is shown in the BIOS and by one of the health daemons found in
ports (I forgot its name).
As far as I know, there is no sysctl MIB showing the chipset temperature on
that board.

> 
> You may need to run memtest86 for more than 6-10 complete runs with 
> all the tests.

Last time I ran memtest86+ it took ~ 1 1/2 days to finish.

> 
> If the memtests do not reveal anything broken, then you get into even 
> more wizardry stuff, like bad power etc... Especially since it only 
> occurs on occasion, it is going to be a nightmare to find the root cause 
> of this. Other than replacing hardware piece by piece, which won't be 
> easy given the age of the board and parts.
> 
> You could go into the BIOS and try to configure RAM access at a slower 
> speed and see if the problem goes away. If it does, it could be that you are 
> running on the edge of the spec with regard to RAM timing.
> 
> But like I said, it is all lots of funky details that can interact in 
> strange and unexpected ways.
> 
> --WjW

I will check the memory again in the next few days.

Regards,
Oliver





Re: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread Willem Jan Withagen

On 2014-07-01 17:33, O. Hartmann wrote:

> Am Tue, 01 Jul 2014 17:23:14 +0200
> Willem Jan Withagen  schrieb:
>
> > On 2014-07-01 16:48, Rang, Anton wrote:
> >> DOT => DOD
> >>
> >> 444F54 => 444F44
> >>
> >> That's a single-bit flip.  Bad memory, perhaps?
> >
> > Very likely, especially if the system does not have ECC.
> > It just happens on rare occasions that an alpha particle, a power cycle, or
> > anything else disruptive damages a memory cell. And it could be that
> > it requires a special pattern of accesses to actually exhibit the error.
> >
> > In the past (the 1990s) 'make buildworld' used to be a rather good memory
> > tester. But nowadays look at
> >   http://www.memtest.org/
> >
> > This tool has found all of the bad memory in all the systems I have used or
> > built for others...
> > Note that it might take a few runs and some more heat to actually
> > trigger the faulty cell, but memtest86 will usually find it.
> >
> > Note that on big systems with lots of memory it can take a long
> > time to run just one full test set to completion.
> >
> > --WjW
>
> I already tested with memtest86+ (I had to download the Linux image, since
> the FreeBSD port is broken on CURRENT). It didn't find anything strange so
> far.
>
> I will do another test.
>
> I realised that on that specific box, the chipset temperature is 81
> degrees Celsius. The chipset is an Eaglelake P45, which on that old
> platform contains the memory controller. dmidecode gives:
>
>  Manufacturer: ASUSTeK Computer INC.
>  Product Name: P5Q-WS
>  Version: Rev 1.xx


Hi Oliver,

I've built several (5+) systems with these boards (from memory they date 
from around 2009??). And if I recall right, one of them is still functional. 
The first one broke down within a couple of weeks, and the others did not 
survive either.

The auxiliary chips on that board do run hot, but I never realized they ran 
this hot. Is 81C the CPU temp from sysctl, or did you measure the heat 
sink on the motherboard? In the latter case it is probably just too hot.
But even if it is the temp on the chip itself, I've rarely seen temps 
go up this high.

You may need to run memtest86 for more than 6-10 complete runs with 
all the tests.

If the memtests do not reveal anything broken, then you get into even 
more wizardry stuff, like bad power etc... Especially since it only 
occurs on occasion, it is going to be a nightmare to find the root cause 
of this. Other than replacing hardware piece by piece, which won't be 
easy given the age of the board and parts.

You could go into the BIOS and try to configure RAM access at a slower 
speed and see if the problem goes away. If it does, it could be that you are 
running on the edge of the spec with regard to RAM timing.


But like I said, it is all lots of funky details that can interact in 
strange and unexpected ways.


--WjW

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread O. Hartmann
Am Tue, 01 Jul 2014 17:23:14 +0200
Willem Jan Withagen  schrieb:

> On 2014-07-01 16:48, Rang, Anton wrote:
> > DOT => DOD
> >
> > 444F54 => 444F44
> >
> > That's a single-bit flip.  Bad memory, perhaps?
> 
> Very likely, especially if the system does not have ECC.
> It just happens on rare occasions that an alpha particle, a power cycle, or 
> anything else disruptive damages a memory cell. And it could be that 
> it requires a special pattern of accesses to actually exhibit the error.
> 
> In the past (the 1990s) 'make buildworld' used to be a rather good memory 
> tester. But nowadays look at
>   http://www.memtest.org/
> 
> This tool has found all of the bad memory in all the systems I have used or 
> built for others...
> Note that it might take a few runs and some more heat to actually 
> trigger the faulty cell, but memtest86 will usually find it.
> 
> Note that on big systems with lots of memory it can take a long 
> time to run just one full test set to completion.
> 
> --WjW

I already tested with memtest86+ (I had to download the Linux image, since the
FreeBSD port is broken on CURRENT). It didn't find anything strange so far.

I will do another test.

I realised that on that specific box, the chipset temperature is 81 degrees
Celsius. The chipset is an Eaglelake P45, which on that old platform contains
the memory controller. dmidecode gives:

Manufacturer: ASUSTeK Computer INC.
Product Name: P5Q-WS
Version: Rev 1.xx

> 
> 
> >
> > Anton
> >
> > -Original Message-
> > From: owner-freebsd-curr...@freebsd.org 
> > [mailto:owner-freebsd-curr...@freebsd.org] On
> > Behalf Of O. Hartmann Sent: Tuesday, July 01, 2014 8:08 AM
> > To: Dimitry Andric
> > Cc: Adrian Chadd; FreeBSD CURRENT
> > Subject: Re: [CURRENT]: weird memory/linker problem?
> >
> > Am Mon, 23 Jun 2014 17:22:25 +0200
> > Dimitry Andric  schrieb:
> >
> >> On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> >>> Am Sun, 22 Jun 2014 10:10:04 -0700
> >>> Adrian Chadd  schrieb:
> >>>> When they segfault, where do they segfault?
> >> ...
> >>> GIMP, LaTeX work, nothing special, but a bit memory consuming
> >>> regarding GIMP) I tried updating the ports tree and surprisingly the
> >>> tree was left in an unclean condition when /usr/bin/svn segfaulted
> >>> (on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> >>>
> >>> /usr/local/bin/svn, which is from the devel/subversion port,
> >>> works well, while the svn in FreeBSD 11's base system dies as
> >>> described. It did not a few hours ago!
> >>
> >> I think what Adrian meant was: can you run svn (or another crashing
> >> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
> >> where it dies?
> >>
> >> Alternatively, put a core dump and the executable (with debug info) in
> >> a tarball, and upload it somewhere, so somebody else can analyze it.
> >>
> >> -Dimitry
> >>
> >
> > It's me again, with the same weird story.
> >
> > After a couple of days of silence, the mysterious entity in my computer is
> > back. This time it is again a weird compiler error (while trying to
> > buildworld):
> >
> > [...]
> > c++  -O2 -pipe -O3 -O3
> > c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
> > -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS 
> > -D__STDC_CONSTANT_MACROS
> > -fno-strict-aliasing 
> > -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
> > -DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
> > -Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11
> > -fno-exceptions -fno-rtti -Wno-c++11-extensions
> > -c 
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp
> >  -o
> > Host.o --- GraphWriter.o --- In file included
> > from 
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
> >  
> > /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10:
> > error: use of undeclared identifier 'DOD'; did you mean '

Re: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread Willem Jan Withagen

On 2014-07-01 16:48, Rang, Anton wrote:

DOT => DOD

444F54 => 444F44

That's a single-bit flip.  Bad memory, perhaps?


Very likely, especially if the system does not have ECC.
It just happens on rare occasions that an alpha particle, a power cycle, or 
anything else disruptive damages a memory cell. And it could be that 
it requires a special pattern of accesses to actually exhibit the error.

In the past (the 1990s) 'make buildworld' used to be a rather good memory 
tester. But nowadays look at

  http://www.memtest.org/

This tool has found all of the bad memory in all the systems I have used or 
built for others...
Note that it might take a few runs and some more heat to actually 
trigger the faulty cell, but memtest86 will usually find it.

Note that on big systems with lots of memory it can take a long 
time to run just one full test set to completion.


--WjW




Anton

-Original Message-
From: owner-freebsd-curr...@freebsd.org 
[mailto:owner-freebsd-curr...@freebsd.org] On Behalf Of O. Hartmann
Sent: Tuesday, July 01, 2014 8:08 AM
To: Dimitry Andric
Cc: Adrian Chadd; FreeBSD CURRENT
Subject: Re: [CURRENT]: weird memory/linker problem?

Am Mon, 23 Jun 2014 17:22:25 +0200
Dimitry Andric  schrieb:


On 23 Jun 2014, at 16:31, O. Hartmann  wrote:

Am Sun, 22 Jun 2014 10:10:04 -0700
Adrian Chadd  schrieb:

When they segfault, where do they segfault?

...

GIMP, LaTeX work, nothing special, but a bit memory consuming
regarding GIMP) I tried updating the ports tree and surprisingly the
tree was left in an unclean condition when /usr/bin/svn segfaulted
(on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).

/usr/local/bin/svn, which is from the devel/subversion port,
works well, while the svn in FreeBSD 11's base system dies as described.
It did not a few hours ago!


I think what Adrian meant was: can you run svn (or another crashing
program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
where it dies?

Alternatively, put a core dump and the executable (with debug info) in
a tarball, and upload it somewhere, so somebody else can analyze it.

-Dimitry



It's me again, with the same weird story.

After a couple of days of silence, the mysterious entity in my computer is back. 
This time it is again a weird compiler error (while trying to buildworld):

[...]
c++  -O2 -pipe -O3 -O3
c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS 
-fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
-Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11 
-fno-exceptions -fno-rtti -Wno-c++11-extensions -c 
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp -o 
Host.o
--- GraphWriter.o ---
In file included from /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10: error: use of undeclared identifier 'DOD'; did you mean 'DOT'?
    O << DOD::EscapeString(Label);
         ^~~
         DOT
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11: note: 'DOT' declared here
namespace DOT {  // Private functions...
          ^
1 error generated.
*** [GraphWriter.o] Error code 1


In the past I have seen many such messages, mostly about labels of routines 
not being found in shared objects/libraries, or "funny" misspellings like the 
one shown above.

I cannot reproduce them after a reboot, but once the error has occurred it 
persists for as long as the system keeps running. So in order to compile the 
OS successfully, I reboot.

Does anyone have an idea what this could be? Since at the moment it affects 
only one machine (the other CoreDuo has been retired in the meantime), it 
feels a bit like a miscompilation on a certain type of CPU.

Thanks for your patience,

Oliver


RE: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread Rang, Anton
DOT => DOD

444F54 => 444F44

That's a single-bit flip.  Bad memory, perhaps?

Anton
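
The arithmetic behind this observation can be checked with a few lines of POSIX shell. This is only an illustrative sketch; the function name bit_diff is made up here and is not from the thread. It XORs the character codes of two letters and counts the set bits in the result:

```shell
#!/bin/sh
# bit_diff A B: hypothetical helper. Prints the XOR of the character codes
# of two single characters and the number of bits in which they differ.
bit_diff() {
    a=$(printf '%d' "'$1")   # numeric code of the first character
    b=$(printf '%d' "'$2")   # numeric code of the second character
    x=$(( a ^ b ))           # bits that differ are set in the XOR
    n=0
    v=$x
    while [ "$v" -gt 0 ]; do # count the set bits one at a time
        n=$(( n + (v & 1) ))
        v=$(( v >> 1 ))
    done
    printf '0x%02X ^ 0x%02X = 0x%02X (%d bit(s) differ)\n' "$a" "$b" "$x" "$n"
}

bit_diff T D
```

For T and D this prints "0x54 ^ 0x44 = 0x10 (1 bit(s) differ)": the last letters of DOT and DOD differ only in bit 4.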

-Original Message-
From: owner-freebsd-curr...@freebsd.org 
[mailto:owner-freebsd-curr...@freebsd.org] On Behalf Of O. Hartmann
Sent: Tuesday, July 01, 2014 8:08 AM
To: Dimitry Andric
Cc: Adrian Chadd; FreeBSD CURRENT
Subject: Re: [CURRENT]: weird memory/linker problem?

Am Mon, 23 Jun 2014 17:22:25 +0200
Dimitry Andric  schrieb:

> On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> > Am Sun, 22 Jun 2014 10:10:04 -0700
> > Adrian Chadd  schrieb:
> >> When they segfault, where do they segfault?
> ...
> > GIMP, LaTeX work, nothing special, but a bit memory consuming 
> > regarding GIMP) I tried updating the ports tree and surprisingly the 
> > tree was left in an unclean condition when /usr/bin/svn segfaulted 
> > (on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> > 
> > /usr/local/bin/svn, which is from the devel/subversion port, 
> > works well, while the svn in FreeBSD 11's base system dies as described. 
> > It did not a few hours ago!
> 
> I think what Adrian meant was: can you run svn (or another crashing
> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see 
> where it dies?
> 
> Alternatively, put a core dump and the executable (with debug info) in 
> a tarball, and upload it somewhere, so somebody else can analyze it.
> 
> -Dimitry
> 

It's me again, with the same weird story.

After a couple of days of silence, the mysterious entity in my computer is back. 
This time it is again a weird compiler error (while trying to buildworld):

[...]
c++  -O2 -pipe -O3 -O3 
c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS 
-fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
-Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11 
-fno-exceptions -fno-rtti -Wno-c++11-extensions -c 
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp -o 
Host.o
--- GraphWriter.o ---
In file included from /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10: error: use of undeclared identifier 'DOD'; did you mean 'DOT'?
    O << DOD::EscapeString(Label);
         ^~~
         DOT
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11: note: 'DOT' declared here
namespace DOT {  // Private functions...
          ^
1 error generated.
*** [GraphWriter.o] Error code 1


In the past I have seen many such messages, mostly about labels of routines 
not being found in shared objects/libraries, or "funny" misspellings like the 
one shown above.

I cannot reproduce them after a reboot, but once the error has occurred it 
persists for as long as the system keeps running. So in order to compile the 
OS successfully, I reboot.

Does anyone have an idea what this could be? Since at the moment it affects 
only one machine (the other CoreDuo has been retired in the meantime), it 
feels a bit like a miscompilation on a certain type of CPU.

Thanks for your patience,

Oliver


Re: [CURRENT]: weird memory/linker problem?

2014-07-01 Thread O. Hartmann
Am Mon, 23 Jun 2014 17:22:25 +0200
Dimitry Andric  schrieb:

> On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> > Am Sun, 22 Jun 2014 10:10:04 -0700
> > Adrian Chadd  schrieb:
> >> When they segfault, where do they segfault?
> ...
> > GIMP, LaTeX work, nothing special, but a bit memory consuming regarding
> > GIMP) I tried updating the ports tree and surprisingly the tree was left
> > in an unclean condition when /usr/bin/svn segfaulted (on console:
> > pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> > 
> > /usr/local/bin/svn, which is from the devel/subversion port, works well,
> > while the svn in FreeBSD 11's base system dies as described. It did not a
> > few hours ago!
> 
> I think what Adrian meant was: can you run svn (or another crashing
> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
> where it dies?
> 
> Alternatively, put a core dump and the executable (with debug info) in a
> tarball, and upload it somewhere, so somebody else can analyze it.
> 
> -Dimitry
> 

It's me again, with the same weird story.

After a couple of days of silence, the mysterious entity in my computer is back.
This time it is again a weird compiler error (while trying to buildworld):

[...]
c++  -O2 -pipe -O3 -O3 
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
-I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS
-fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
-Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11 
-fno-exceptions
-fno-rtti -Wno-c++11-extensions
-c /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp 
-o Host.o
--- GraphWriter.o ---
In file included from /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10: error: use of undeclared identifier 'DOD'; did you mean 'DOT'?
    O << DOD::EscapeString(Label);
         ^~~
         DOT
/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11: note: 'DOT' declared here
namespace DOT {  // Private functions...
          ^
1 error generated.
*** [GraphWriter.o] Error code 1


In the past I have seen many such messages, mostly about labels of routines
not being found in shared objects/libraries, or "funny" misspellings like the
one shown above.

I cannot reproduce them after a reboot, but once the error has occurred it
persists for as long as the system keeps running. So in order to compile the
OS successfully, I reboot.

Does anyone have an idea what this could be? Since at the moment it affects
only one machine (the other CoreDuo has been retired in the meantime), it
feels a bit like a miscompilation on a certain type of CPU.

Thanks for your patience,

Oliver




Re: [CURRENT]: weird memory/linker problem?

2014-06-25 Thread O. Hartmann
Am Mon, 23 Jun 2014 17:22:25 +0200
Dimitry Andric  schrieb:

> On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> > Am Sun, 22 Jun 2014 10:10:04 -0700
> > Adrian Chadd  schrieb:
> >> When they segfault, where do they segfault?
> ...
> > GIMP, LaTeX work, nothing special, but a bit memory consuming regarding
> > GIMP) I tried updating the ports tree and surprisingly the tree was left
> > in an unclean condition when /usr/bin/svn segfaulted (on console:
> > pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> > 
> > /usr/local/bin/svn, which is from the devel/subversion port, works well,
> > while the svn in FreeBSD 11's base system dies as described. It did not a
> > few hours ago!
> 
> I think what Adrian meant was: can you run svn (or another crashing
> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
> where it dies?
> 
> Alternatively, put a core dump and the executable (with debug info) in a
> tarball, and upload it somewhere, so somebody else can analyze it.
> 
> -Dimitry
> 

Here I am again.

Here is a report of what I have done so far. Regarding the svn issue, I
recompiled svn with "make -C usr.bin/svn clean depend obj all install" after
setting "-O0 -g -DDEBUG" in /etc/make.conf and /etc/src.conf (disabling all
the -O flags I usually use). gdb complained about missing symbols. After the
recompilation the base system "svn" didn't crash anymore, so the strange
story seems to continue.
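
For reference, the backtrace and ktrace output asked for earlier in the thread could be collected roughly as follows. This is a minimal sketch under stated assumptions: the function names, the trace file path and the core file name svn.core are made up for illustration, and the binary needs debug symbols for the backtrace to be useful.

```shell
#!/bin/sh
# Hypothetical helpers, not from the thread.

# collect_backtrace PROG CORE: print a backtrace from a core dump using
# gdb in batch mode with a one-line command file.
collect_backtrace() {
    cmds=$(mktemp) || return 1
    echo 'bt' > "$cmds"
    gdb -batch -x "$cmds" "$1" "$2"
    rc=$?
    rm -f "$cmds"
    return "$rc"
}

# trace_syscalls OUT PROG [ARGS...]: run PROG under ktrace, writing the
# trace to OUT, then show the last lines of the decoded trace.
trace_syscalls() {
    out=$1
    shift
    ktrace -f "$out" "$@"
    kdump -f "$out" | tail -n 40
}

# Example invocations (all paths are assumptions):
#   collect_backtrace /usr/bin/svn svn.core
#   trace_syscalls /tmp/svn.ktrace /usr/bin/svn update /usr/ports
```

The tail of the kdump output usually shows the last system calls made before the signal was delivered, which helps narrow down where the program died.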

Firefox also crashed yesterday, out of the blue, with a bus error (signal 10).
Rebooting solved the problem. So far I have not recompiled the system or any
client with DEBUG flags set. So, sorry, this issue is still open, and it is
no less weird.


Today I tried recompiling world. The build fails on the box in question with
my well-known friend, the "relocation truncated to fit: R_X86_64_PC32 against
symbol" error. See below.

I'm about to reboot the box and restart building world without starting any
memory-consuming applications before the build.

Since the problems seem to occur randomly, I ask myself whether this is
somehow related to the ASLR work mentioned earlier on the list. I will also
disable -O3 again for the next build to make sure that clang isn't
miscompiling something.

As mentioned on the list before, I tried to find some CPU-burning and
memory-eating applications/tests, but since math/mprime is i386 only and
sysutils/cpuburn only covers "ancient" CPUs, I feel a bit lost in that task
and am left with memtest86 (which earlier indicated no memory problems with
the box).
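
One crude consistency check that needs nothing beyond the base system is to checksum the same large file several times and compare the results. The sketch below is an assumption-laden illustration, not a replacement for memtest86+ or mprime: the function name repeat_cksum is made up, and repeated reads are often served from the buffer cache, so the check mostly exercises RAM and CPU rather than the disk.

```shell
#!/bin/sh
# repeat_cksum FILE N: hypothetical helper. Checksums FILE N times and
# returns 1 at the first result that differs from the first run
# (2 if the file cannot be read).
repeat_cksum() {
    file=$1
    runs=$2
    first=$(cksum < "$file") || return 2
    i=1
    while [ "$i" -lt "$runs" ]; do
        cur=$(cksum < "$file") || return 2
        if [ "$cur" != "$first" ]; then
            echo "mismatch on run $(( i + 1 )): $cur != $first"
            return 1
        fi
        i=$(( i + 1 ))
    done
    echo "$runs runs, checksum stable: $first"
}

# Example (the file name is an assumption):
#   repeat_cksum /path/to/some/large/file 100
```

A mismatch on an unchanged file means the data was corrupted somewhere between storage and CPU; a clean run proves little, since an intermittent fault may simply not have triggered.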

By the way, I am also facing serious I/O performance issues on CURRENT these
days: when clang is compiling world/kernel in the background, it takes a long
time until portmaster has stepped through the ports that are about to be
updated. This phenomenon has grown worse since earlier this year (~ February).

Source at revision 267867. FreeBSD 11.0-CURRENT #0 r267816: Tue Jun 24
14:02:22 CEST 2014 amd64.

[...]
c++ -O2 -pipe -O3 -O3 
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/include
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/tools/clang/include
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/utils/TableGen -I.
-I/usr/src/usr.bin/clang/tblgen/../../../contrib/llvm/../../lib/clang/include
-DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS
-fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
-DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
-Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11 
-fno-exceptions
-fno-rtti -Wno-c++11-extensions  -static -L/usr/obj/usr/src/tmp/legacy/usr/lib 
-o tblgen
AsmMatcherEmitter.o AsmWriterEmitter.o AsmWriterInst.o CTagsEmitter.o
CallingConvEmitter.o CodeEmitterGen.o CodeGenDAGPatterns.o CodeGenInstruction.o
CodeGenMapTable.o CodeGenRegisters.o CodeGenSchedule.o CodeGenTarget.o 
DAGISelEmitter.o
DAGISelMatcher.o DAGISelMatcherEmitter.o DAGISelMatcherGen.o DAGISelMatcherOpt.o
DFAPacketizerEmitter.o DisassemblerEmitter.o FastISelEmitter.o 
FixedLenDecoderEmitter.o
InstrInfoEmitter.o IntrinsicEmitter.o OptParserEmitter.o PseudoLoweringEmitter.o
RegisterInfoEmitter.o SetTheory.o SubtargetEmitter.o TGValueTypes.o TableGen.o
X86DisassemblerTables.o X86ModRMFilters.o
X86RecognizableInstr.o 
/usr/obj/usr/src/tmp/usr/src/usr.bin/clang/tblgen/../../../lib/clang/libllvmtablegen/libllvmtablegen.a
 
/usr/obj/usr/src/tmp/usr/src/usr.bin/clang/tblgen/../../../lib/clang/libllvmsupport/libllvmsupport.a
-lncurses -legacy
/usr/lib/libc.a(jemalloc_jemalloc.o): In function `imemalign':
jemalloc_jemalloc.c:(.text+0x2605): relocation truncated to fit: R_X86_64_PC32 against symbol `__je_arena_malloc_large' defined in .text section in /usr/lib/libc.a(jemalloc_arena.o)
c++: error: linker command failed with exit code 1 (use -v to see invocation)
*** [tblgen] Error code 1

make[3]: stopped in /usr/src/usr.bin/clang/
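
For readers unfamiliar with the linker message above: an R_X86_64_PC32 relocation stores the value S + A - P (symbol address plus addend minus the address of the field being patched) in a signed 32-bit field, and the linker reports "relocation truncated to fit" when that value is out of range. The sketch below only illustrates that range check. The function name fits_pc32 and all addresses are hypothetical, and it does not claim that a flipped bit is what actually produced the error in this build.

```shell
#!/bin/sh
# fits_pc32 S A P: hypothetical helper. Succeeds if S + A - P fits into a
# signed 32-bit field, which is what an R_X86_64_PC32 relocation requires.
fits_pc32() {
    v=$(( $1 + $2 - $3 ))
    [ "$v" -ge -2147483648 ] && [ "$v" -le 2147483647 ]
}

# Hypothetical symbol address, addend and patch location.
sym=0x405000
place=0x402605

# A nearby symbol is comfortably within range.
fits_pc32 "$sym" -4 "$place" && echo "in range"

# The same address with one high bit (bit 40) flipped is far out of range.
flipped=$(( sym ^ (1 << 40) ))
fits_pc32 "$flipped" -4 "$place" || echo "out of range"
```

In a small static binary such as tblgen, all code normally lies within 2 GiB, so this error is unexpected in a healthy build; a corrupted address is one conceivable way to reach it, but the thread does not establish the cause.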

Re: [CURRENT]: weird memory/linker problem?

2014-06-23 Thread O. Hartmann
On Mon, 23 Jun 2014 09:27:46 -0600
Ian Lepore wrote:

> On Mon, 2014-06-23 at 16:31 +0200, O. Hartmann wrote:
> > 
> > I'm out of ideas. Is there a way to stress test the CPU and memory system to check
> > whether the RAM, the CPU itself or, as an additional possibility, the disk I/O
> > controller (Intel ICH10) is at fault?
> > 
> > Thanks for your patience,
> 
> A really good tool for stress-testing a system is ports/math/mprime.  It
> will find memory and cpu errors that memtest86 and other tools
> completely overlook.  Run one copy per cpu, something like this:
> 
> for i in $(jot $(sysctl -n hw.ncpu) 0) ; do
>     sleep $((i * 2)) && mprime -t -a$i >/tmp/mprime$i.log &
> done
> 
> Many overclockers use this to ensure the system is stable with the OC
> settings.  If your system can run a copy of mprime per cpu continuously
> for 24 hours the hardware is fine.
> 
> -- Ian

A great idea, but regrettably I get this error while trying to install that neat port:

mprime-0.0.24.14 is only for i386, while you are running amd64.
*** Error code 1

Is there a 64-bit counterpart?

Oliver





Re: [CURRENT]: weird memory/linker problem?

2014-06-23 Thread Ian Lepore
On Mon, 2014-06-23 at 16:31 +0200, O. Hartmann wrote:
> 
> I'm out of ideas. Is there a way to stress test the CPU and memory system to check
> whether the RAM, the CPU itself or, as an additional possibility, the disk I/O
> controller (Intel ICH10) is at fault?
> 
> Thanks for your patience,

A really good tool for stress-testing a system is ports/math/mprime.  It
will find memory and cpu errors that memtest86 and other tools
completely overlook.  Run one copy per cpu, something like this:

for i in $(jot $(sysctl -n hw.ncpu) 0) ; do
    sleep $((i * 2)) && mprime -t -a$i >/tmp/mprime$i.log &
done

Many overclockers use this to ensure the system is stable with the OC
settings.  If your system can run a copy of mprime per cpu continuously
for 24 hours the hardware is fine.

-- Ian


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [CURRENT]: weird memory/linker problem?

2014-06-23 Thread Dimitry Andric
On 23 Jun 2014, at 16:31, O. Hartmann  wrote:
> On Sun, 22 Jun 2014 10:10:04 -0700
> Adrian Chadd wrote:
>> When they segfault, where do they segfault?
...
> GIMP, LaTeX work, nothing special, but a bit memory consuming regarding GIMP) I tried
> updating the ports tree and surprisingly the tree is left over in an unclean condition
> while /usr/bin/svn segfaults (on console: pid 18013 (svn), uid 0: exited on signal 11
> (core dumped)).
> 
> Using /usr/local/bin/svn, which is from the devel/subversion port, performs well, while
> FreeBSD 11's svn contribution dies as described. It did not hours ago!

I think what Adrian meant was: can you run svn (or another crashing
program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
where it dies?

Alternatively, put a core dump and the executable (with debug info) in a
tarball, and upload it somewhere, so somebody else can analyze it.

-Dimitry
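
The two suggestions above (a gdb backtrace from the core dump, or a ktrace of the
crashing run) might look roughly like the sketch below. It is not from the thread: the
helper names run, backtrace and syscall_trace, the core file name svn.core and the trace
file name svn.ktrace are assumptions, and it assumes a gdb new enough to accept -ex
(older versions may need a command file passed with -x instead). Setting DRY_RUN=1
only prints the commands.

```shell
#!/bin/sh
# Sketch: collect a backtrace and a system call trace for a crashing program.
# All helper names here are hypothetical.

# run CMD [ARGS...]: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backtrace PROG CORE: print a backtrace from an existing core dump.
backtrace() {
    run gdb -batch -ex bt "$1" "$2"
}

# syscall_trace OUT PROG [ARGS...]: record the system calls of one run into OUT,
# then decode the trace so the last calls before the crash can be inspected.
syscall_trace() {
    out=$1
    shift
    run ktrace -f "$out" "$@"
    run kdump -f "$out"
}
```

Possible usage, with the file names being assumptions as noted:
"backtrace /usr/bin/svn svn.core" and
"syscall_trace svn.ktrace /usr/bin/svn update /usr/ports". The end of the kdump output
is usually the interesting part.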





Re: [CURRENT]: weird memory/linker problem?

2014-06-23 Thread O. Hartmann
On Sun, 22 Jun 2014 10:10:04 -0700
Adrian Chadd wrote:

> When they segfault, where do they segfault?
> 
> 
> 
> -a

Now I get more fun.

After a buildworld and reboot, the box in question is at CURRENT:

FreeBSD 11.0-CURRENT #0 r267782: Mon Jun 23 13:12:56 CEST 2014 amd64

After the reboot, everything was all right. I then updated the ports tree (I do this
regularly); /etc/make.conf is configured so that the ports tree update is performed
with /usr/bin/svn. Now, after about three hours of regular work (KDevelop, some GIMP,
LaTeX work, nothing special, but a bit memory consuming regarding GIMP), I tried
updating the ports tree again and, surprisingly, the tree was left in an unclean
condition when /usr/bin/svn segfaulted (on console: pid 18013 (svn), uid 0: exited on
signal 11 (core dumped)).

Using /usr/local/bin/svn, which is from the devel/subversion port, performs well, while
FreeBSD 11's svn contribution dies as described. It did not hours ago!

Well, this drives me nuts. Is it a bug in FreeBSD (maybe relocating libs, the memory map
or something else), or is it in fact the agony of my computer system? As reported in my
original message, memory checks via memtest did not show any kind of faulty memory.

I'm out of ideas. Is there a way to stress test the CPU and memory system to check
whether the RAM, the CPU itself or, as an additional possibility, the disk I/O
controller (Intel ICH10) is at fault?

Thanks for your patience,

Oliver
 

Re: [CURRENT]: weird memory/linker problem?

2014-06-23 Thread O. Hartmann
On Sun, 22 Jun 2014 10:10:04 -0700
Adrian Chadd wrote:

> When they segfault, where do they segfault?
> 
> 
> 
> -a
> 
> 

I have not investigated this so far, since I was convinced from the start that it is
triggered by a defective memory system. So I rebooted immediately, glad to have found a
"solution".

I will check the next time it happens.

oh






Re: [CURRENT]: weird memory/linker problem?

2014-06-22 Thread Allan Jude
On 2014-06-22 10:56, O. Hartmann wrote:
> 
> Hello.
> 
> I face a strange problem on a set of CURRENT-driven boxes. The systems in question all
> run the same version of CURRENT (more or less, a week or so discrepancy).
> 
> The boxes affected have 8 GB of RAM and are old-style Core2Duo systems.
> 
> The phenomenon:
> 
> Starting up the box shows the operating system working. But sometimes it is impossible
> to start certain applications, like Firefox - they segfault. More disturbing is the
> failure of the linker when building world. Sometimes I get strange messages like
> 
> relocation truncated to fit: R_X86_64_PC32 against symbol `__error' defined in .text
> 
> when compiling/linking. The funny thing is: rebooting the box and doing exactly the
> same very often leaves the system operable again - starting applications works,
> compiling works!
> 
> At first I thought this could be an indication of a dying system, so I checked the
> memory for two days non-stop without any indication of anything wrong. The boxes do not
> have ECC RAM - it's Intel.
> 
> I see this problem relatively often on two C2D-based boxes (one a dual-core E8400, the
> other a quad-core Q6600; both systems have 8 GB RAM). This phenomenon also occurred two
> or three months ago on another machine with 32 GB RAM and a Core-i7 3930K, but it went
> away (it was the very same error as shown above).
> 
> Another system, an i3-3220 with 16 GB RAM, never showed the problem, although that
> system builds world on a regular basis, as frequently as the C2D systems do.
> 
> Well, I feel a bit confused. At first sight, the problem looks weird and indicates a
> kind of memory problem - but testing the memory didn't show anything wrong.
> 
> Today "windowmaker" stopped starting due to a malformed command in one of windowmaker's
> libraries. I rebooted the box and everything was all right. Then, also today, I tried
> compiling world and got a weird error message about a misspelled "Int__xxx" (I cannot
> remember the exact text); I rebooted and everything was all right again.
> 
> Those errors are frequent on the 8 GB, C2D-based systems and at the moment no longer
> present on more modern systems with more memory, as described above. This could be a
> coincidence, but it is strange anyway.
> 
> I do not exclude dying hardware, but I'd like to ask whether there is something strange
> going on with FreeBSD's memory management at the moment and whether those problems could
> also be triggered by some nasty bug. I never see a crash (which would also indicate
> faulty hardware); I mostly notice this strange behaviour either after a fresh boot or
> after I ran some memory- and disk-I/O-intensive jobs, like updating the ports tree.
> 
> By the way, FreeBSD CURRENT suffers from a tremendous performance cut these days when
> compiling world, updating the ports tree and running portmaster. On one box, on which
> ports reside on a UFS partition, it takes more than 8 minutes to get through
> portmaster -da, which is quick when not compiling world. On another system, on which
> /usr/ports resides on ZFS (the i3-3220 box with 16 GB RAM!), it sometimes takes 30(!)
> minutes to perform a "svn update" while compiling world; it takes 6 - 15 minutes when
> the box is relaxed and updating the ports tree the first time (every subsequent update
> is much faster).
> 
> Well, I know these reports of mine are a bit weird since I have no exact log of the
> problems, but I think that if there is an issue that is not with the hardware, I should
> report it.
> 
> Regards,
> 
> oh
> 

To get a better benchmark for 'svn update' on the ports tree: if you
'zfs unmount pool/usr/ports', it will flush all ARC entries for that
dataset; then 'zfs mount pool/usr/ports' and run the test again.
This should give you more reproducible results.
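
As a rough sketch of that procedure (the helper names run and cold_svn_update are made
up here, the dataset name pool/usr/ports is taken from the text above, and /usr/ports as
the working copy path is an assumption), the remount and the timed update could be
wrapped like this. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch: time "svn update" with the dataset's cached data dropped beforehand.
# All helper names here are hypothetical.

# run CMD [ARGS...]: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# cold_svn_update DATASET PATH: remount DATASET, then time an update of PATH.
# Stops at the first failing step, e.g. if the dataset is busy and cannot be
# unmounted.
cold_svn_update() {
    run zfs unmount "$1" &&
    run zfs mount "$1" &&
    run /usr/bin/time svn update "$2"
}
```

Possible usage: "cold_svn_update pool/usr/ports /usr/ports", run as root and repeated a
few times so the timings can be compared.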

-- 
Allan Jude





Re: [CURRENT]: weird memory/linker problem?

2014-06-22 Thread Adrian Chadd
When they segfault, where do they segfault?



-a

