On Tue, Sep 15, 2015 at 7:00 AM, He Junyan <junyan...@inbox.com> wrote:
> On Tue, Sep 15, 2015 at 06:00:57AM -0700, Matt Turner wrote:
>> Date: Tue, 15 Sep 2015 06:00:57 -0700
>> From: Matt Turner <matts...@gmail.com>
>> To: "junyan.he" <junyan...@inbox.com>
>> Cc: "beignet@lists.freedesktop.org" <beignet@lists.freedesktop.org>
>> Subject: Re: [Beignet] [PATCH 6/8] Backend: Implement FDIV64 on BDW.
>>
>> On Tue, Sep 15, 2015 at 4:15 AM, <junyan...@inbox.com> wrote:
>> > From: Junyan He <junyan...@linux.intel.com>
>> >
>> > According to the document, we use a set of instructions
>> > to implement double type division.
>> >
>> > Signed-off-by: Junyan He <junyan...@linux.intel.com>
>> > ---
>> >  backend/src/backend/gen8_context.cpp | 68 ++++++++++++++++++++++++++++++++++++
>> >  backend/src/backend/gen8_context.hpp |  2 ++
>> >  2 files changed, 70 insertions(+)
>> >
>> > diff --git a/backend/src/backend/gen8_context.cpp b/backend/src/backend/gen8_context.cpp
>> > index b497ee5..f465832 100644
>> > --- a/backend/src/backend/gen8_context.cpp
>> > +++ b/backend/src/backend/gen8_context.cpp
>> > @@ -924,6 +924,74 @@ namespace gbe
>> >      this->unpackLongVec(src, dst, p->curr.execWidth);
>> >    }
>> >
>> > +  void Gen8Context::emitF64DIVInstruction(const SelectionInstruction &insn) {
>> > +    /* Macro for Double Precision IEEE Compliant fdiv
>> > +
>> > +       Set Rounding Mode in CR to RNE
>> > +       GRF are initialized: r0 = 0, r6 = a, r7 = b, r1 = 1
>> > +       The default data type for the macro is :df
>> > +
>> > +       math.eo.f0.0 (4) r8.acc2 r6.noacc r7.noacc 0xE
>> > +       (-f0.0) if
>> > +       madm (4) r9.acc3  r0.noacc  r6.noacc  r8.acc2    // Step(1), q0=a*y0
>> > +       madm (4) r10.acc4 r1.noacc  -r7.noacc r8.acc2    // Step(2), e0=(1-b*y0)
>> > +       madm (4) r11.acc5 r6.noacc  -r7.noacc r9.acc3    // Step(3), r0=a-b*q0
>> > +       madm (4) r12.acc6 r8.acc2   r10.acc4  r8.acc2    // Step(4), y1=y0+e0*y0
>> > +       madm (4) r13.acc7 r1.noacc  -r7.noacc r12.acc6   // Step(5), e1=(1-b*y1)
>> > +       madm (4) r8.acc8  r8.acc2   r10.acc4  r12.acc6   // Step(6), y2=y0+e0*y1
>> > +       madm (4) r9.acc9  r9.acc3   r11.acc5  r12.acc6   // Step(7), q1=q0+r0*y1
>> > +       madm (4) r12.acc2 r12.acc6  r8.acc8   r13.acc7   // Step(8), y3=y1+e1*y2
>> > +       madm (4) r11.acc3 r6.noacc  -r7.noacc r9.acc9    // Step(9), r1=a-b*q1
>> > +
>> > +       Change Rounding Mode in CR if required
>> > +       Implicit Accumulator for destination is NULL
>> > +
>> > +       madm (4) r8.noacc r9.acc9   r11.acc3  r12.acc2   // Step(10), q=q1+r1*y3
>> > +       endif */
>>
>> I don't see an IF or an ENDIF instruction emitted in the code below.
>> Is that intentional, or am I misreading the code?
>>
> Here, we use f0.1 as the predication for all the instructions, like:
> (-f0.1) madm (4) r9.acc3 r0.noacc r6.noacc r8.acc2
> (-f0.1) madm (4) r10.acc4 r1.noacc -r7.noacc r8.acc2
> .....
> I avoid using IF-Endif here, because we need to calculate the instruction
> number within IF clause, and it is not convenient.
Ah, I see. While that works, I think it does not take advantage of the
"early out" capability of the INVM math instruction. As I understand it,
for some input values INVM can calculate the full double-precision result
by itself, without any of the MADM sequence. With IF/ENDIF the EU can
jump over all of the MADM instructions in that case. If you only
predicate the instructions, the EU cannot jump over them; it must still
send each one down the pipeline.

Just something to consider. I don't know whether the difficulties of
using IF/ENDIF are great enough to justify avoiding them.
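To make the trade-off concrete, here is a rough host-side C++ sketch of the
macro, not Beignet code. It models each madm as std::fma(src1, src2, src0)
and counts how many madm steps get issued on the branching path versus the
predicated path. All names (invm, madm, refine, fdivBranch, fdivPredicated,
g_madmIssued) are hypothetical, and the early-out condition is an assumption
made for illustration: the sketch simply treats zero, infinite, and NaN
operands as early-out cases, which may not match what the hardware does.

```cpp
#include <cmath>

// Hypothetical stand-in for math.eo (INVM): a seed y0 plus an early-out flag.
// ASSUMPTION: the flag is modeled as firing for zero, infinite, or NaN
// operands; the real hardware conditions are not reproduced here.
struct InvmResult {
  double value;   // seed y0, or the final quotient when earlyOut is true
  bool earlyOut;  // plays the role of the flag written by math.eo
};

InvmResult invm(double a, double b) {
  if (!std::isfinite(a) || !std::isfinite(b) || a == 0.0 || b == 0.0)
    return {a / b, true};
  // Single-precision reciprocal as the seed; assumes b fits in float range.
  return {static_cast<double>(1.0f / static_cast<float>(b)), false};
}

int g_madmIssued = 0;  // number of madm steps sent down the simulated pipeline

// madm dst, src0, src1, src2 is modeled as dst = src0 + src1 * src2.
double madm(double src0, double src1, double src2) {
  ++g_madmIssued;
  return std::fma(src1, src2, src0);
}

// Steps 1 to 10 from the macro comment in the patch.
double refine(double a, double b, double y0) {
  double q0 = madm(0.0, a, y0);   // Step 1:  q0 = a*y0
  double e0 = madm(1.0, -b, y0);  // Step 2:  e0 = 1 - b*y0
  double r0 = madm(a, -b, q0);    // Step 3:  r0 = a - b*q0
  double y1 = madm(y0, e0, y0);   // Step 4:  y1 = y0 + e0*y0
  double e1 = madm(1.0, -b, y1);  // Step 5:  e1 = 1 - b*y1
  double y2 = madm(y0, e0, y1);   // Step 6:  y2 = y0 + e0*y1
  double q1 = madm(q0, r0, y1);   // Step 7:  q1 = q0 + r0*y1
  double y3 = madm(y1, e1, y2);   // Step 8:  y3 = y1 + e1*y2
  double r1 = madm(a, -b, q1);    // Step 9:  r1 = a - b*q1
  return madm(q1, r1, y3);        // Step 10: q  = q1 + r1*y3
}

// IF/ENDIF style: on early out, the whole madm sequence is skipped.
double fdivBranch(double a, double b) {
  InvmResult r = invm(a, b);
  if (r.earlyOut) return r.value;
  return refine(a, b, r.value);
}

// Predicated style: all ten steps are issued; the result is discarded
// when the early-out flag is set.
double fdivPredicated(double a, double b) {
  InvmResult r = invm(a, b);
  double q = refine(a, b, r.value);
  return r.earlyOut ? r.value : q;
}
```

In this model both variants return the same quotient, but for an early-out
input fdivBranch issues no madm steps while fdivPredicated still issues all
ten. Whether that saving outweighs the cost of computing the jump distance
for IF/ENDIF is the open question above.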