On Thu, 29 Jan 2009, Rainer Gerhards wrote:

> On Thu, 2009-01-29 at 00:36 -0800, [email protected] wrote:
>> On Wed, 28 Jan 2009, Rainer Gerhards wrote:
>>
>>> Hi all,
>>>
>>> thanks to Lorenzo's help, we made good progress. It is too much to post
>>> inside a mail, please have a look at my analysis of the bug:
>>>
>>> http://blog.gerhards.net/2009/01/rsyslog-data-race-analysis.html
>>>
>>> The short story is that we have at least improved the situation very
>>> much and I hope to have fixes for all branches within the next couple of
>>> days.
>>
>> I just finished reading through this excellent write-up
>>
>> one small thing.
>>
>> you quote the spec
>>
>> Accesses to cacheable memory that are split across bus widths, cache
>> lines, and page boundaries are not guaranteed to be atomic
>>
>> and then conclude that
>>
>> So aligned word-access does not guarantee (not even enhance the chance) of
>> atomicity.
>>
>> I read that to mean that the alignment requirements are more complicated,
>> not that alignment is useless.
>
> I should probably have quoted more of Intel's manual. But in essence you
> need to read at least the first full two pages to get the in-depth idea.
> The issue is not alignment requirements. As hardware gets more and more
> parallel, and caches get to more and more levels, and on-chip cores
> coexist with those from other sockets ... keeping memory coherent is a
> costly job.
>
> In early CPUs, Intel made memory access atomic if some alignment
> requirements were met. That was cheap. In new CPUs that atomicity is
> expensive. On the other hand, most data accesses do not need atomicity.
> So why incur the cost for many operations when only a few need it? As a
> result, Intel has removed guaranteed atomicity from those memory
> accesses. In order to get atomicity, the program must tell the CPU
> *explicitly* that it wants that feature. To do so, a "LOCK" prefix
> (opcode) must be placed before the actual opcode (note that this is only
> supported for some operations). So you get the best of both worlds: fast
> execution time for the majority of code and atomicity where you need it
> (but it then incurs the cost).
>
> The bottom line is that what was an atomic operation on an old CPU is no
> longer an atomic operation on a new CPU. If you need that, you need to
> include that extra "LOCK" opcode.
>
> As I briefly said in the blogpost, I have not checked old Intel manuals.
> So I do not know if they formerly guaranteed, as part of the instruction
> set architecture, that these operations were atomic. I guess they did
> not. If so, I as a programmer made some assumptions about the
> micro-architecture that no longer hold true. My fault... But even if it
> is Intel's fault, the C programming language does not guarantee
> atomicity nor does the compiler guarantee a specific translation to
> machine code. So I, working on the C level, used assumptions that were
> not valid (and as I said I knew it was dangerous, but it worked too well
> for too long... ;))

the new C0x standard will add atomic ops and guarantees (some of which are
not necessarily provided by the chip, but have to be provided by the
compiler/library instead), so watch for it, but test their performance
before you trust them
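
As a rough illustration of what such standard atomic ops look like, here is a minimal sketch using the <stdatomic.h> interface as it was eventually standardized in C11. The names worker, run_atomic_counter, N_THREADS and N_INCREMENTS are made up for this example and have nothing to do with the rsyslog code. On x86 a compiler will typically implement atomic_fetch_add with a LOCK-prefixed instruction, but that is a property of the implementation, not something the standard promises.

```c
#include <pthread.h>
#include <stdatomic.h>

#define N_THREADS    4       /* hypothetical values, chosen for illustration */
#define N_INCREMENTS 100000

/* Shared counter; atomic_long makes every access to it an atomic operation. */
static atomic_long counter;

/* Each thread bumps the shared counter N_INCREMENTS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_INCREMENTS; i++)
        atomic_fetch_add(&counter, 1);   /* atomic read-modify-write */
    return NULL;
}

/* Starts the threads, waits for them, and returns the final counter value. */
long run_atomic_counter(void)
{
    pthread_t threads[N_THREADS];
    int started = 0;

    atomic_store(&counter, 0);
    for (int i = 0; i < N_THREADS; i++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) == 0)
            started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&counter);
}
```

With a plain "long counter" and "counter++" the final value could come out lower, because concurrent increments may overwrite each other. Building this needs a C11 compiler and thread support, e.g. something like "gcc -std=c11 -pthread". Timing the atomic version against a non-atomic one is a simple way to measure the cost before relying on it.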

>>
>> you should also look at the code that's generated by -Os. With the heavily
>> cached systems that we have nowadays, it's common that the code being
>> smaller (and therefore more of the code fitting into the L1 cache) is more
>> of an advantage than the optimizations that -O3 provides.
>
> That's a good reminder. I've just checked the gcc docs. There are some
> things that I do not like about -Os, especially as it disables proper
> alignment of many structures, including code. That can lead to
> sub-optimal cache performance.

I know the linux kernel has many things where the alignment is critical 
for proper functioning, but they are still able to support -Os, so there 
is some way to specify alignment even for -Os
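
One way to do this with gcc is the "aligned" type attribute, which requests a data alignment explicitly rather than leaving it to the defaults of the optimization level. The sketch below is only an illustration under assumptions: CACHE_LINE_SIZE is assumed to be 64 bytes (it varies by CPU), and struct aligned_counter and the helper functions are hypothetical names, not taken from rsyslog or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* assumption: typical x86 cache line size */

/* The attribute forces both the alignment and the padded size of the type,
 * so each array element starts on its own cache line. */
struct aligned_counter {
    long value;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static struct aligned_counter counters[2];

/* Returns 1 if both array elements start on a cache-line boundary. */
int counters_are_cache_aligned(void)
{
    return ((uintptr_t)&counters[0] % CACHE_LINE_SIZE) == 0 &&
           ((uintptr_t)&counters[1] % CACHE_LINE_SIZE) == 0;
}

/* Distance in bytes between consecutive elements of the array. */
size_t counter_stride(void)
{
    return sizeof(struct aligned_counter);
}
```

This only covers data. The alignment of functions, loops and jump targets is controlled by the separate -falign-* options, which can be given on the command line next to -Os if needed.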

> On the other hand -O3 does things like loop unrolling, which definitely
> is a bad idea with modern cache systems.
>
> My preliminary conclusion is that -O2 is probably best, and may be
> tuned by turning on and off specific optimizations via their specific
> compiler switches.

this has been the prevailing wisdom for many years, but I've seen myself 
many cases where -Os has ended up being faster in the real world, in spite 
of the various things that -O2 does 'better'

is it the case that -Os would break things? or just that you think its
alignment may not be as good?

David Lang

>> congratulations on tracking down a nasty and subtle issue.
>
> Thanks - but let's first see if this was the only issue and if things
> run smoothly everywhere. But it looks very promising.
>
> Rainer
>>
>> David Lang
>>
>>
>>> Rainer
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:rsyslog-
>>>> [email protected]] On Behalf Of Rainer Gerhards
>>>> Sent: Friday, January 16, 2009 3:22 PM
>>>> To: rsyslog-users
>>>> Subject: Re: [rsyslog] rsyslog still crashes
>>>>
>>>> Lorenzo,
>>>>
>>>> I have created a new branch "raceDebug" and done a first commit to it.
>>>> The change is very lightweight. Please pull, compile as usual and give
>>>> it a try. It spits out some info to stdout from time to time
>>>> (hopefully). I am not sure if it aborts, depending on the output it
>>>> may or may not. Even if we get messages, they are probably not enough
>>>> to pinpoint the bug, but I wanted to do something very light to see if
>>>> the bug stays.
>>>>
>>>> Feedback appreciated.
>>>>
>>>> Rainer
>>> _______________________________________________
>>> rsyslog mailing list
>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>> http://www.rsyslog.com
>>>