Re: [PATCH] Atom: Scheduler improvements for better imul placement

Richard Guenther Thu, 12 Apr 2012 06:55:16 -0700

2012/4/12 Andrey Belevantsev <a...@ispras.ru>:
> On 12.04.2012 16:38, Richard Guenther wrote:
>>
>> On Thu, Apr 12, 2012 at 2:36 PM, Igor Zamyatin<izamya...@gmail.com>
>>  wrote:
>>>
>>> On Thu, Apr 12, 2012 at 4:24 PM, Richard Guenther
>>> <richard.guent...@gmail.com>  wrote:
>>>>
>>>> On Thu, Apr 12, 2012 at 2:00 PM, Alexander Monakov<amona...@ispras.ru>
>>>>  wrote:
>>>>>
>>>>>
>>>>>> Can Atom execute two IMUL in parallel?  Or what exactly is the
>>>>>> pipeline behavior?
>>>>>
>>>>>
>>>>> As I understand from Intel's optimization reference manual, the
>>>>> behavior is as follows: if the instruction immediately following IMUL
>>>>> has shorter latency, execution is stalled for 4 cycles (which is
>>>>> IMUL's latency); otherwise, a 4-or-more cycles latency instruction can
>>>>> be issued after IMUL without a stall.  In other words, IMUL is
>>>>> pipelined with respect to other long-latency instructions, but not to
>>>>> short-latency instructions.
>>>>
>>>>
>>>> It seems to be modeled in the pipeline description though:
>>>>
>>>> ;;; imul insn has 5 cycles latency
>>>> (define_reservation "atom-imul-32"
>>>>                    "atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4,
>>>>                     atom-port-0")
>>>>
>>>> ;;; imul instruction excludes other non-FP instructions.
>>>> (exclusion_set "atom-eu-0, atom-eu-1"
>>>>               "atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4")
>>>>
>>>
>>> The main idea is quite simple:
>>>
>>> If we are about to schedule an IMUL instruction (it is at the top of
>>> the ready list), we look in the ready list for a producer of another
>>> (independent) IMUL instruction. The goal is to schedule that producer
>>> first, so that the second IMUL becomes ready and the two IMUL
>>> instructions get scheduled back to back.
>>
>>
>> Why does that not happen without your patch?  Does it never happen without
>> your patch or does it merely not happen for one EEMBC benchmark (can
>> you provide a testcase?)?
>
>
> It does not happen because the scheduler by itself does not do such specific
> reordering.  That said, it is easy to imagine the cases where this patch
> will make things worse rather than better.


That surprises me.  What is so specific about this reordering?

> Igor, why not try a different, subtler mechanism like adjust_priority,
> which gets called when an insn is added to the ready list?  E.g. increase
> the producer's priority.
>
> The patch as is misses checks for NONDEBUG_INSN_P.  Also, why bail out when
> you have more than one imul in the ready list?  Don't you want to bump the
> priority of the other imul found?
>
> Andrey
>
>
>>
>>> And the MD only allows us to prefer scheduling successive IMUL
>>> instructions, i.e. if an IMUL was just scheduled and the ready list
>>> contains another IMUL instruction, then it will be chosen as the best
>>> candidate for scheduling.
>>>
>>>
>>>> at least from my very limited guessing of what the above does.  So, did
>>>> you analyze why the scheduler does not properly schedule things for you?
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> From reading the patch, I could not understand the link between
>>>>> pipeline behavior and what the patch appears to do.
>>>>>
>>>>> Hope that helps.
>>>>>
>>>>> Alexander
>
>

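For anyone following the thread, the stall rule Alexander describes can be
written down as a toy cost model, which makes it easier to see why getting
two IMULs back to back might help.  This is only a sketch under stated
assumptions: the names (insn_kind, insn_latency, stall_cycles) are
hypothetical, the 1-cycle ADD latency is chosen for illustration, and the
rule itself is taken from the description in this thread, not from
measurements or from the GCC scheduler.

```c
#include <stddef.h>

/* Toy instruction kinds (hypothetical, for illustration only). */
enum insn_kind { INSN_ADD, INSN_IMUL };

/* IMUL latency as stated in the thread; an assumption of this model. */
#define IMUL_LATENCY 4

/* Latency of each toy instruction; ADD = 1 cycle is an assumption. */
int insn_latency(enum insn_kind k)
{
    return k == INSN_IMUL ? IMUL_LATENCY : 1;
}

/*
 * Count stall cycles for an instruction sequence under the rule quoted
 * above: if the instruction immediately following an IMUL has a shorter
 * latency than IMUL, execution stalls for IMUL_LATENCY cycles; a
 * following instruction with latency >= IMUL_LATENCY causes no stall.
 * An IMUL at the very end of the sequence has no follower and is not
 * charged.
 */
int stall_cycles(const enum insn_kind *seq, size_t n)
{
    int stalls = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (seq[i] == INSN_IMUL
            && insn_latency(seq[i + 1]) < IMUL_LATENCY)
            stalls += IMUL_LATENCY;
    }
    return stalls;
}
```

Under this model the order IMUL, ADD, IMUL, ADD is charged two stalls,
while IMUL, IMUL, ADD, ADD is charged only one, because the second IMUL
follows the first without a penalty.  Whether real code benefits depends
on dependencies and on what else is in the ready list, which is the
concern Andrey raises above.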