Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Etaoin Shrdlu
On Thursday 30 November 2006 07:56, Vladimir G. Ivanovic wrote:

 Let's take a poll.

 1. Have you seen this error message in an emerge?

Yes, several times.

 2. Have you subsequently identified a hardware problem, fixed the
 hardware problem, and have not seen the message since?

Yes. 99% of the time it was bad RAM (verified with memtest86).
Of course, for trivial emerges a subsequent emerge completed fine, but 
the first failure put me on alert.

 3. Have you re-run the emerge and not seen the message in a while
 (please indicate how long a while is.)

For me, a while is since fixing the hardware problem.

 BTW, do you know how portage/emerge/make/whatever knows that the problem
 is not reproducible?

If, all other things being equal, a subsequent attempt at the same 
operation does not exhibit the problem, or fails differently, there's a 
good chance that the problem is not reproducible.
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Richard Fish

On 11/29/06, Vladimir G. Ivanovic wrote:

1. Have you seen this error message in an emerge?


Yes.


2. Have you subsequently identified a hardware problem, fixed the
hardware problem, and have not seen the message since?


Yes.  The problem was memory timings...or more specifically the RAM
didn't really work as fast as its manufacturer claimed.  Dropping the
memory timings, and later replacing the RAM, fixed the problem.


3. Have you re-run the emerge and not seen the message in a while
(please indicate how long a while is.)


No.



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Mick
On Thursday 30 November 2006 09:02, Etaoin Shrdlu wrote:
 On Thursday 30 November 2006 07:56, Vladimir G. Ivanovic wrote:
  Let's take a poll.
 
  1. Have you seen this error message in an emerge?

 Yes, several times.

Ditto.

  2. Have you subsequently identified a hardware problem, fixed the
  hardware problem, and have not seen the message since?

 Yes. 99% of the time it was bad RAM (verified with memtest86).
 Of course, for trivial emerges a subsequent emerge completed fine, but
 the first failure put me on the alert.

Yes, it was a bad/incompatible RAM module for the particular memory 
controller.  memtest86 did not identify it; I found it by trial and error!  
The symptom was random crashes (mostly) during emerge which would complete 
fine after a hard reboot.  The crashes would invariably happen either when 
the memory controller was switching onto the next memory module, or when both 
modules were used up and it started using swap.

  3. Have you re-run the emerge and not seen the message in a while
  (please indicate how long a while is.)

 For me, a while is since fixing the hardware problem.

Ditto, i.e. about 18 months so far.

HTH.
-- 
Regards,
Mick




Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Vladimir G. Ivanovic

Etaoin Shrdlu wrote:

On Thursday 30 November 2006 07:56, Vladimir G. Ivanovic wrote:


Let's take a poll.

1. Have you seen this error message in an emerge?


Yes, several times.


2. Have you subsequently identified a hardware problem, fixed the
hardware problem, and have not seen the message since?


Yes. 99% of the time it was bad RAM (verified with memtest86).
Of course, for trivial emerges a subsequent emerge completed fine, but 
the first failure put me on the alert.



3. Have you re-run the emerge and not seen the message in a while
(please indicate how long a while is.)


For me, a while is since fixing the hardware problem.


BTW, do you know how portage/emerge/make/whatever knows that the problem
is not reproducible?


If, all other things being equal, a subsequent attempt at the same 
operation does not exhibit the problem, or fails differently, there's a 
good chance that the problem is not reproducible.


Interesting responses from 3 people. But ...

I have done nothing to my hardware and I've seen this error, oh, 
half a dozen times, the last time 3 months (?) ago. I ran memtest when 
I installed new memory, and it did not report problems even when run 
for hours. And I do not get random segfaults with other programs. 
Finally, I don't think my hardware fixed itself.


Given all of this, my suspicion is that these errors are software 
bugs, not hardware problems.


The other thing that I don't really believe is the part about this 
bug not being reproducible, as reported by portage/emerge/make/gcc. I 
don't recall any evidence that the emerge actually tried the 
compilation again and /succeeded/. (Why then error out rather than 
print a warning message like "Compilation retry succeeded on 
subsequent attempt; hardware problem suspected"?) So my suspicion is 
that the commentary is bogus, but I believe the part about "internal 
compiler error: Segfault".


--- Vladimir





Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Richard Fish

On 11/30/06, Vladimir G. Ivanovic wrote:

I have done nothing to my hardware and I've seen this error, oh, a
half a dozen times, the last time 3 months (?) ago. I ran memtest when
I installed new memory, and it did not report problems even when run
for hours.


memtest is basically useless these days.  It can only tell you if you
have a bad memory cell, which almost never happens today.  Most memory
problems are the result of timing issues between the processor(s) and
DMA controllers.

This script [1] seems to be a much better memory test for modern
systems, although you may have to make some tweaks to run it on
Gentoo.


And I do not get random segfaults with other programs.


Yes, compiling is unusual in this regard.  The memory access
pattern of a compiler, reading and writing to locations on different
rows, or even different modules, under high CPU load and using lots of
memory, with some IO thrown in for good measure, tends to reveal
hardware problems quite nicely.
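
As a rough illustration of the kind of access pattern described above, here is a minimal sketch that writes seeded pseudorandom bytes at scattered offsets in a buffer and then verifies them. The function name and parameters are made up for this example. It is not the script referenced as [1], and a small Python buffer will not stress memory timings the way a real compile does; it only shows the write-then-verify idea.

```python
import random


def scattered_write_verify(size: int = 1 << 20, writes: int = 50_000, seed: int = 1) -> int:
    """Write pseudorandom bytes at scattered offsets, then read them back.

    Returns the number of mismatches. On healthy hardware this should be 0.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    buf = bytearray(size)
    expected = {}
    for _ in range(writes):
        offset = rng.randrange(size)
        value = rng.randrange(256)
        buf[offset] = value
        # If an offset is written more than once, the last write wins.
        expected[offset] = value
    # Verify every offset that was written against its last recorded value.
    return sum(1 for offset, value in expected.items() if buf[offset] != value)


if __name__ == "__main__":
    print("mismatches:", scattered_write_verify())
```

A nonzero count from a check like this would be a reason to look at the hardware; a zero count proves very little.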


Finally, I don't think my hardware fixed itself.

Given all of this, my suspicion is that these errors are software
bugs, not hardware problems.


If we were talking about a driver, or an event-based GUI program, I
might agree.  But a compiler is going to take the exact same actions
given the same input and options.  The compiler isn't going to do
something different between 2 different executions over the _exact_
same sources because it feels like it.
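
The determinism argument above can be turned into a simple check: run the same deterministic job several times and compare digests of the output. This is only a sketch. `sample_job` is a hypothetical stand-in for a compile, and the helper names are assumptions made for this example.

```python
import hashlib
from typing import Callable, List


def output_digests(job: Callable[[], bytes], runs: int = 3) -> List[str]:
    """Run the same job several times and return a SHA-256 digest per run."""
    return [hashlib.sha256(job()).hexdigest() for _ in range(runs)]


def is_deterministic(digests: List[str]) -> bool:
    """True if every run produced byte-identical output."""
    return len(set(digests)) == 1


def sample_job() -> bytes:
    # Hypothetical stand-in for a compile: same input, same output every time.
    data = b"".join(i.to_bytes(4, "little") for i in range(100_000))
    return hashlib.sha256(data).digest()


if __name__ == "__main__":
    digests = output_digests(sample_job)
    print("deterministic" if is_deterministic(digests) else "outputs differ")
```

If a job that should be deterministic produces differing digests on the same input, the difference has to come from somewhere outside the program logic.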



The other thing that I don't really believe is the part about this
bug not being reproducible as reported by portage/emerge/make/gcc.


Then you should read the gcc sources.  One of the patches applied by
Gentoo adds a retry loop when the compiler is about to exit with an
internal compiler error (ICE).  It retries the compile twice, and if
either of those succeeds, you get the "The bug is not reproducible"
message.  It doesn't output anything during the retries because that
would possibly obscure the original error.
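
The retry behaviour described above can be sketched roughly as follows. This is not the actual Gentoo patch (which lives in the gcc sources); it is a minimal model of the logic as described in this message. `compile_once` is a hypothetical callable that returns True on success and False on an internal compiler error.

```python
from typing import Callable, Optional, Tuple

NOT_REPRODUCIBLE = (
    "The bug is not reproducible, so it is likely a hardware or OS problem."
)


def compile_with_retry(
    compile_once: Callable[[], bool], retries: int = 2
) -> Tuple[bool, Optional[str]]:
    """Model of the retry-on-ICE logic described in the thread.

    Returns (succeeded, extra_message). The overall result is still a
    failure when only a retry succeeds, matching the behaviour discussed here.
    """
    if compile_once():
        return True, None
    for _ in range(retries):
        if compile_once():
            # A retry succeeded, so the original failure was not reproducible.
            return False, NOT_REPRODUCIBLE
    # Every attempt failed: the error is reproducible, no extra message.
    return False, None


if __name__ == "__main__":
    outcomes = iter([False, True])  # first attempt fails, first retry succeeds
    print(compile_with_retry(lambda: next(outcomes)))
```

Note that in this model the first failure is still reported as a failure even when a retry succeeds; the extra message is the only sign that the retry happened.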

The gentoo devs probably added this loop to avoid more duplicates of [2].

-Richard

[1] http://people.redhat.com/dledford/memtest.html
[2] http://bugs.gentoo.org/show_bug.cgi?id=20600



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-30 Thread Vladimir G. Ivanovic

Richard Fish wrote:

On 11/30/06, Vladimir G. Ivanovic wrote:

I have done nothing to my hardware and I've seen this error, oh, a
half a dozen times, the last time 3 months (?) ago. I ran memtest when
I installed new memory, and it did not report problems even when run
for hours.


memtest is basically useless these days.  It can only tell you if you
have a bad memory cell, which almost never happens today.  Most memory
problems are the result of timing issues between the processor(s) and
DMA controllers.

This script [1] seems to be a much better memory test for modern
systems, although you may have to make some tweaks to run it on
Gentoo.


Just for kicks I'll run the script and see what happens.




And I do not get random segfaults with other programs.


Yes, compiling is very unique in this regard.  The memory access
pattern of a compiler, reading and writing to locations on different
rows, or even different modules, under high CPU load and using lots of
memory, with some IO thrown in for good measure, tends to reveal
hardware problems quite nicely.


Finally, I don't think my hardware fixed itself.

Given all of this, my suspicion is that these errors are software
bugs, not hardware problems.


For grins, here is part of comment #174:

Random segfaults during compilation. ... in general a sign of
hardware problems.

// No, this is in general a sign of GCC 4.1 - problem ;-)


If we were talking about a driver, or an event-based GUI program, I
might agree.  But a compiler is going to take the exact same actions
given the same input and options.  The compiler isn't going to do
something different between 2 different executions over the _exact_
same sources because it feels like it.


You're right at the logical level, but not at the physical level. 
Cache effects and different disk accesses are two physical differences 
that spring to mind. Temporary files will be in different physical 
sectors, or in the buffer cache or not; directories may or may not be 
in the directory cache. Depending on what else is running, the pattern 
of cache misses will be different.


I emerge with -j2. Plus I'm doing work while the emerges happen. The 
likelihood of the memory access pattern of two compiles being the same 
is precisely zero.






The other thing that I don't really believe is the part about this
bug not being reproducible as reported by portage/emerge/make/gcc.


Then you should read the gcc sources.  One of the patches applied by
Gentoo adds a retry loop when the compiler is about to exit with an
internal compiler error (ICE).  It retries the compile twice, and if
either of those succeeds, you get the "The bug is not reproducible"
message.


Interesting. I did not know that. But why does gcc exit with an error 
when the second (or third) try succeeds? Why not just print a 
warning, perhaps at the end so it is noticeable? Most people will 
restart the entire emerge, which seems like a gargantuan amount of 
wasted effort since the re-compilation has succeeded.



It doesn't output anything because that would possibly
obscure the original error.

The gentoo devs probably added this loop to avoid more duplicates of [2].

-Richard

[1] http://people.redhat.com/dledford/memtest.html
[2] http://bugs.gentoo.org/show_bug.cgi?id=20600





Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-29 Thread Vladimir G. Ivanovic

Etaoin Shrdlu wrote:

On Thursday 23 November 2006 14:39, Leandro Melo de Sales wrote:

The bug is not reproducible, so it is likely a hardware or OS problem.

   ^

As the message says, you might have a hardware problem (usually bad RAM 
or CPU). 


My experience has been that it NEVER is a hardware problem. The next 
emerge of the same package always completes successfully.


--- Vladimir



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-29 Thread Raymond Lewis Rebbeck
On Thursday, 30 November 2006 16:16, Vladimir G. Ivanovic wrote:
 Etaoin Shrdlu wrote:
  On Thursday 23 November 2006 14:39, Leandro Melo de Sales wrote:
  The bug is not reproducible, so it is likely a hardware or OS problem.
 
 ^
 
  As the message says, you might have a hardware problem (usually bad RAM
  or CPU).

 My experience has been that it NEVER is a hardware problem. The next
 emerge of the same package always completes successfully.


That behaviour usually indicates a hardware problem: random, unexplainable 
segfaults that you can't reproduce.

-- 
Raymond Lewis Rebbeck



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-29 Thread Vladimir G. Ivanovic

Raymond Lewis Rebbeck wrote:

On Thursday, 30 November 2006 16:16, Vladimir G. Ivanovic wrote:

Etaoin Shrdlu wrote:

On Thursday 23 November 2006 14:39, Leandro Melo de Sales wrote:

The bug is not reproducible, so it is likely a hardware or OS problem.

   ^

As the message says, you might have a hardware problem (usually bad RAM
or CPU).

My experience has been that it NEVER is a hardware problem. The next
emerge of the same package always completes successfully.



That behaviour usually indicates a hardware problem: random, unexplainable 
segfaults that you can't reproduce.


Let's take a poll.

1. Have you seen this error message in an emerge?
2. Have you subsequently identified a hardware problem, fixed the
   hardware problem, and have not seen the message since?
3. Have you re-run the emerge and not seen the message in a while
   (please indicate how long a while is.)

For me, the answers are:
1. Yes
2. No
3. Yes (~months)

BTW, do you know how portage/emerge/make/whatever knows that the problem 
is not reproducible?


--- Vladimir



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-23 Thread Etaoin Shrdlu
On Thursday 23 November 2006 14:39, Leandro Melo de Sales wrote:

 Hi,

  I'm trying to compile GCC 4.1.1-r1 but I got the following error
 message:
[cut]
 'main': /var/tmp/portage/gcc-4.1.1-r1/work/gcc-4.1.1/gcc/gcc.c:8043:
 internal compiler error: Segmentation fault
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See <URL:http://bugs.gentoo.org/> for instructions.
 The bug is not reproducible, so it is likely a hardware or OS problem.
   ^

As the message says, you might have a hardware problem (usually bad RAM 
or CPU). With this kind of error, before submitting any bug reports, 
try re-emerging gcc: if it works, or breaks again with the same error 
but at a different point, it's very likely that the problem lies in your 
hardware. There is a good article by Daniel Robbins which explains how 
to troubleshoot such errors.

http://www.gentoo.org/doc/en/articles/hardware-stability-p1.xml
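
The rule of thumb above (it works on retry, or it breaks at a different point) can be written down as a small classifier. This is only a sketch of the heuristic as stated in this message. The function name, the labels it returns, and the use of a string such as "gcc.c:8043" to identify a failure point are assumptions for illustration.

```python
from typing import Optional, Sequence


def classify_failures(failure_points: Sequence[Optional[str]]) -> str:
    """Classify repeated attempts at the same build.

    Each entry is None for a successful attempt, or a string naming where
    the attempt failed (for example "gcc.c:8043").
    """
    if len(failure_points) < 2:
        return "need more attempts"
    failures = [p for p in failure_points if p is not None]
    if not failures:
        return "no failure observed"
    if len(failures) < len(failure_points):
        # Same input, sometimes fails and sometimes works: not reproducible.
        return "suspect hardware"
    if len(set(failures)) > 1:
        # Fails every time, but at a different point each time.
        return "suspect hardware"
    # Fails every time at the same point: looks like a real software bug.
    return "reproducible: suspect software"


if __name__ == "__main__":
    print(classify_failures(["gcc.c:8043", None]))
    print(classify_failures(["gcc.c:8043", "gcc.c:8043"]))
```

The heuristic gives a hint, not a diagnosis; a failure that repeats at the same point is worth a bug report, anything else is worth a hardware check first.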



Re: [gentoo-user] SegFault while compiling gcc 4.1.1

2006-11-23 Thread Leandro Melo de Sales

Thank you, I'll read it.

[]s
Leandro

2006/11/23, Etaoin Shrdlu:

On Thursday 23 November 2006 14:39, Leandro Melo de Sales wrote:

 Hi,

  I'm trying to compile GCC 4.1.1-r1 but I got the following error
 message:
[cut]
 'main': /var/tmp/portage/gcc-4.1.1-r1/work/gcc-4.1.1/gcc/gcc.c:8043:
 internal compiler error: Segmentation fault
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See <URL:http://bugs.gentoo.org/> for instructions.
 The bug is not reproducible, so it is likely a hardware or OS problem.
   ^

As the message says, you might have a hardware problem (usually bad RAM
or CPU). With this kind of error, before submitting any bug reports,
try re-emerging gcc: if it works, or breaks again with the same error
but at a different point, it's very likely that the problem lies in your
hardware. There is a good article by Daniel Robbins which explains how
to troubleshoot such errors.

http://www.gentoo.org/doc/en/articles/hardware-stability-p1.xml