I have fixed according to your comment. And will send out new version very soon. But for the ConstantInt and UndefValue issue, I would like leave it as it is now. I will improve it later. Just as we talked, it is better if we can process Instruction and ConstantExpr in a unified way.
> -----Original Message----- > From: Zhigang Gong [mailto:[email protected]] > Sent: Saturday, October 11, 2014 9:18 AM > To: Song, Ruiling > Cc: [email protected] > Subject: Re: [Beignet] [PATCH 2/3] GBE: add legalize pass to handle wide > integers > > Thanks for the patch. It solves a long standing issue. A few comments as > below: > > On Fri, Oct 10, 2014 at 03:01:26PM +0800, Ruiling Song wrote: > > This legalize pass will break wider integers like i128/i256/... into shorter > ones. > > The problem is how to choose the shorter type? From my observation, > > wide integer type always comes from shorter ones through 'zext' on > > small type or 'bitcast' on vectors, so we simply choose the type where it > comes from. > > Then we can split wide integer operations into operations on shorter > interger. > > > > Signed-off-by: Ruiling Song <[email protected]> > > --- > > + > > + void splitAPInt(APInt &data, SmallVectorImpl<APInt> &result, int > totalBits, int subBits) { > > + APInt lo = data.getLoBits(totalBits/2).trunc(totalBits/2); > > + APInt hi = data.getHiBits(totalBits/2).trunc(totalBits/2); > > + > > + if (totalBits/2 <= subBits) { > > + result.push_back(lo); > > + result.push_back(hi); > > + return; > > + } > > + splitAPInt(lo, result, totalBits/2, subBits); > > + splitAPInt(hi, result, totalBits/2, subBits); } > > + > > + void Legalize::splitLargeInteger(APInt data, Type *splitTy, > SmallVector<APInt, 16> &split) { > > + unsigned opSz = data.getBitWidth(); > > + unsigned subSz = splitTy->getPrimitiveSizeInBits(); > > + splitAPInt(data, split, opSz, subSz); } > The above algorithm assumes the opSz is an power of 2 bits, right? maybe > put an assert here to make sure this assumption is better. > > + > > + void Legalize::legalizeICmp(IRBuilder<> &Builder, Instruction *p) { > > + ICmpInst *IC = dyn_cast<ICmpInst>(p); > > + ICmpInst::Predicate pred = IC->getPredicate(); > > + // I could not figure out why llvm could generate some > > + // compare instruction on large integers. so here only support > > + equality check > Agree with your thought here. An assert here is good enough. > > > + GBE_ASSERT(IC->isEquality()); > > + Value *op0 = p->getOperand(0); > > + Value *op1 = p->getOperand(1); > > + > > + if (isa<ConstantInt>(op0)) { > > + op0 = p->getOperand(1); > > + op1 = p->getOperand(0); > > + } > Is it possible that both operands are constant? If it is the case, we just > need > to ignore it here. > > > + void Legalize::legalizeAnd(IRBuilder<> &Builder, Instruction *p) { > > + Value *op0 = p->getOperand(0); > > + Value *op1 = p->getOperand(1); > > + > > + if ((isa<UndefValue>(op0) || isa<UndefValue>(op1))) { > > + // I meet some special case as below: > > + // %82 = zext i32 %81 to i512 > > + // %mask148 = and i512 undef, -4294967296 > > + // %ins149 = or i512 %mask148, %82 > > + // I don't how to split this kind of i512 instruction in a good > > + way, > miss a know here? > And yes, I also found some similar cases, and some of the root causes are a > buggy kernel which use uninitialized value. But we still need to compile those > buggy kernel. My question here is we may need to add this logic to most of > the other operands handlers? This special case may not be And only case. > > > + > > + bool Legalize::legalizeFunction(Function &F) { > > + bool changed = false; > > + for (Function::iterator bb = F.begin(), bbE = F.end(); bb != bbE; > > ++bb) { > > + IRBuilder<> Builder(bb); > > + for (BasicBlock::iterator it = bb->begin(), itE = bb->end(); it != > > itE; > ++it) { > > + Instruction *insn = it; > > + Type *ty = insn->getType(); > > + if(ty->isIntegerTy() && ty->getIntegerBitWidth() > 64) { > > + // result is large integer, push back itself and its users > > + changed = true; > > + > > + processed.insert(insn); > > + > > + for(Value::use_iterator iter = insn->use_begin(); iter != > insn->use_end(); ++iter) { > > + // After LLVM 3.5, use_iterator points to 'Use' instead of > 'User', which is more straightforward. > > + #if (LLVM_VERSION_MAJOR == 3) && > (LLVM_VERSION_MINOR < 5) > > + User *theUser = *iter; > > + #else > > + User *theUser = iter->getUser(); > > + #endif > > + processed.insert(theUser); > > + } > > + } > > + > > + if(processed.empty() || processed.find(insn) == processed.end()) > > + continue; > > + > The above code has an implict assumption that all the wide values should be > generated before all the uses. Right? > This assumption may not be correct. Please check the following example: > > ... (there is no definition of %7) > br L2 > > L1: > %10 = add i128 %7, %6 > ... > ret > > L2: > %8 = load <4 x i32> addrspace(1)* %3, align 4, !tbaa !1 > %7 = bitcast <4 x i32> %8 to i128 > ... > br L1 > > Thanks, > Zhigang Gong. _______________________________________________ Beignet mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/beignet
