jaykang10 added a comment.
In https://reviews.llvm.org/D30810#718176, @Anastasia wrote:
> LGTM! Thanks!
Thanks Anastasia! Merge Done!
Repository:
rL LLVM
https://reviews.llvm.org/D30810
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
This revision was automatically updated to reflect the committed changes.
Closed by commit rL299445: Preserve vec3 type. (authored by jaykang10).
Changed prior to commit:
https://reviews.llvm.org/D30810?vs=92829=94088#toc
Repository:
rL LLVM
https://reviews.llvm.org/D30810
Files:
Anastasia accepted this revision.
Anastasia added a comment.
This revision is now accepted and ready to land.
LGTM! Thanks!
https://reviews.llvm.org/D30810
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
Anastasia added a comment.
In https://reviews.llvm.org/D30810#710069, @jaykang10 wrote:
> > I believe the argument lacks numbers (or at least you have them but didn't
> > mention). I didn't hear about performance results, or validation that this
> > was actually tested for correctness. Small
jaykang10 added a comment.
> I believe the argument lacks numbers (or at least you have them but didn't
> mention). I didn't hear about performance results, or validation that this
> was actually tested for correctness. Small test cases prove a point, but
> can't be considered general.
>
>
bruno added a comment.
> I would like to see this patch committed. I see clear evidence of it
> improving existing GPU targets in the master repo as well as outside of the
> main tree implementations. Bruno, do you have any more concerns?
I believe the argument lacks numbers (or at least you
jaykang10 updated this revision to Diff 92829.
jaykang10 added a comment.
Preserved vec3 type on __builtin_astype.
https://reviews.llvm.org/D30810
Files:
include/clang/Driver/CC1Options.td
include/clang/Frontend/CodeGenOptions.def
lib/CodeGen/CGExpr.cpp
lib/CodeGen/CGExprScalar.cpp
jaykang10 added a comment.
> Yes. This would make sense. I am guessing that in vec3->vec4, we will have 3
> loads and 4 stores and in vec4->vec3 we will have 4 loads and 3 stores?
It depends on implementation. If you scalarize all vector operations on LLVM IR
level before entering llvm's
Anastasia added a comment.
In https://reviews.llvm.org/D30810#707176, @jaykang10 wrote:
> In https://reviews.llvm.org/D30810#706677, @Anastasia wrote:
>
> > In https://reviews.llvm.org/D30810#706676, @Anastasia wrote:
> >
> > > In https://reviews.llvm.org/D30810#699428, @Anastasia wrote:
> > >
>
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#706677, @Anastasia wrote:
> In https://reviews.llvm.org/D30810#706676, @Anastasia wrote:
>
> > In https://reviews.llvm.org/D30810#699428, @Anastasia wrote:
> >
> > > Would you be able to update ScalarExprEmitter::VisitAsTypeExpr
> >
Anastasia added a comment.
In https://reviews.llvm.org/D30810#706676, @Anastasia wrote:
> In https://reviews.llvm.org/D30810#699428, @Anastasia wrote:
>
> > Would you be able to update ScalarExprEmitter::VisitAsTypeExpr
> > implementation accordingly to make things consistent?
>
>
> Not sure it
Anastasia added a comment.
In https://reviews.llvm.org/D30810#699428, @Anastasia wrote:
> Would you be able to update ScalarExprEmitter::VisitAsTypeExpr implementation
> accordingly to make things consistent?
Not sure it got lost somewhere... do you think you could address this too?
Anastasia added a comment.
> Hi Anastasia, Bruno,
>
> Do you still have other opinion? or Can we go ahead and commit this patch?
I would like to see this patch committed. I see clear evidence of it improving
existing GPU targets in the master repo as well as outside of the main tree
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#702637, @jaykang10 wrote:
> In https://reviews.llvm.org/D30810#702614, @Anastasia wrote:
>
> > In https://reviews.llvm.org/D30810#702443, @bruno wrote:
> >
> > > > As a result, I think it would be good for clang to have both of
> > >
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#702614, @Anastasia wrote:
> In https://reviews.llvm.org/D30810#702443, @bruno wrote:
>
> > > As a result, I think it would be good for clang to have both of features
> > > and I would like to stick to the option "-fpresereve-vec3' to
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#702548, @ahatanak wrote:
> In https://reviews.llvm.org/D30810#701141, @jaykang10 wrote:
>
> > In https://reviews.llvm.org/D30810#701132, @ahatanak wrote:
> >
> > > Actually, it's not a mis-compile. The record layout shows that there
Anastasia added a comment.
In https://reviews.llvm.org/D30810#702443, @bruno wrote:
> > As a result, I think it would be good for clang to have both of features
> > and I would like to stick to the option "-fpresereve-vec3' to change the
> > behavior easily.
>
> The motivation doesn't seem
ahatanak added a comment.
In https://reviews.llvm.org/D30810#701141, @jaykang10 wrote:
> In https://reviews.llvm.org/D30810#701132, @ahatanak wrote:
>
> > Actually, it's not a mis-compile. The record layout shows that there is a
> > padding before field f2 and f2 starts at byte 16. So using
jaykang10 added a comment.
> The motivation doesn't seem solid to me, who else is going to benefit from
> this flag? You also didn't explain why doing this transformation yourself
> (looking through the shuffle) on your downstream pass isn't enough for you.
> We generally try to avoid adding
bruno added a comment.
> As you can see, the type legalizer handle vec3 load/store properly. It does
> not write 4th element. The vec3 load/store generates more instructions but it
> has correct behavior. I am not 100% sure the vec3 --> vec4 load/store is
> correct or not because no one has
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#701861, @Anastasia wrote:
> I don't think there is anything wrong with the generation of vec3->vec4 in
> Clang. I believe the motivation for this was the OpenCL spec treating vec3 as
> vec4 aligned type (see section 6.1.5:
>
Anastasia added a comment.
I don't think there is anything wrong with the generation of vec3->vec4 in
Clang. I believe the motivation for this was the OpenCL spec treating vec3 as
vec4 aligned type (see section 6.1.5:
https://www.khronos.org/registry/OpenCL/specs/opencl-2.0-openclc.pdf#12). So
jaykang10 added a comment.
> I believe the assumption is more practical: most part of upstream llvm
> targets only support vectors with even sized number of lanes. And in those
> cases you would have to expand to a 4x vector and leave the 4th element as
> undef anyway, so it was done in the
bruno added a comment.
> Thanks for your comment. We have a pass to undo the vec4 to vec3. I wondered
> why clang generates the vec4 for vec3 load/store. As you can see the comment
> on clang's code, they are generated for better performance. I had a questions
> at this point. clang should
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#701132, @ahatanak wrote:
> Actually, it's not a mis-compile. The record layout shows that there is a
> padding before field f2 and f2 starts at byte 16. So using "store <4 x
> float>" doesn't overwrite the field.
It depends on
ahatanak added a comment.
Actually, it's not a mis-compile. The record layout shows that there is a
padding before field f2 and f2 starts at byte 16. So using "store <4 x float>"
doesn't overwrite the field.
https://reviews.llvm.org/D30810
___
ahatanak added a comment.
It looks like converting vec3 to vec4 is incorrect in some cases. In the
following program, IRGen emits "store <4 x float>" to store g1 to *a, which
will overwrite s1.f2.
typedef __attribute__((__ext_vector_type__(3))) float float3;
struct S1 {
float3 f1;
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#700476, @Anastasia wrote:
> In https://reviews.llvm.org/D30810#699827, @jaykang10 wrote:
>
> > In https://reviews.llvm.org/D30810#699695, @bruno wrote:
> >
> > > Hi JinGu,
> > >
> > > I just read the discussion on cfe-dev, some
Anastasia added a comment.
In https://reviews.llvm.org/D30810#699827, @jaykang10 wrote:
> In https://reviews.llvm.org/D30810#699695, @bruno wrote:
>
> > Hi JinGu,
> >
> > I just read the discussion on cfe-dev, some comments:
> >
> > > My colleague and I are implementing a transformation pass
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#699695, @bruno wrote:
> Hi JinGu,
>
> I just read the discussion on cfe-dev, some comments:
>
> > My colleague and I are implementing a transformation pass between LLVM IR
> > and another IR and we want to keep the 3-component vector
bruno added a reviewer: bruno.
bruno added a comment.
Hi JinGu,
I just read the discussion on cfe-dev, some comments:
> My colleague and I are implementing a transformation pass between LLVM IR and
> another IR and we want to keep the 3-component vector types in our target IR
Why can't you go
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#699428, @Anastasia wrote:
> Would you be able to update ScalarExprEmitter::VisitAsTypeExpr implementation
> accordingly to make things consistent?
Probably, No... the load/store with vec3 generates vec4 temporarily to be
aligned
Anastasia added a comment.
Would you be able to update ScalarExprEmitter::VisitAsTypeExpr implementation
accordingly to make things consistent?
https://reviews.llvm.org/D30810
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
jaykang10 marked an inline comment as done.
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#699006, @ahatanak wrote:
> Could you elaborate on the motivation for this change?
>
> I was wondering why clang (CodeGen) needed the help of a command line option
> to decide whether vec3
ahatanak added a comment.
Could you elaborate on the motivation for this change?
I was wondering why clang (CodeGen) needed the help of a command line option to
decide whether vec3 should be converted to vec4. Can it just preserve vec3 when
the architecture is spir?
jaykang10 added a comment.
In https://reviews.llvm.org/D30810#697760, @Anastasia wrote:
> Could you please add your test here (probably goes to test/CodeGenOpenCL)?
Oops!
I am so sorry. I missed it. I have updated it.
https://reviews.llvm.org/D30810
jaykang10 updated this revision to Diff 91355.
jaykang10 added a comment.
Changed help text for option and Added test file.
https://reviews.llvm.org/D30810
Files:
include/clang/Driver/CC1Options.td
include/clang/Frontend/CodeGenOptions.def
lib/CodeGen/CGExpr.cpp
Anastasia added a comment.
Could you please add your test here (probably goes to test/CodeGenOpenCL)?
Comment at: include/clang/Driver/CC1Options.td:661
+def fpreserve_vec3_type : Flag<["-"], "fpreserve-vec3-type">,
+ HelpText<"Preserve 3-component vector type operations">;
+
jaykang10 updated this revision to Diff 91307.
jaykang10 added a comment.
Fixed typo.
https://reviews.llvm.org/D30810
Files:
include/clang/Driver/CC1Options.td
include/clang/Frontend/CodeGenOptions.def
lib/CodeGen/CGExpr.cpp
lib/Frontend/CompilerInvocation.cpp
Index:
jaykang10 updated this revision to Diff 91306.
jaykang10 added a comment.
Added -f prefix to option name.
https://reviews.llvm.org/D30810
Files:
include/clang/Driver/CC1Options.td
include/clang/Frontend/CodeGenOptions.def
lib/CodeGen/CGExpr.cpp
lib/Frontend/CompilerInvocation.cpp
jaykang10 created this revision.
Preserve vec3 type with CodeGen option.
https://reviews.llvm.org/D30810
Files:
include/clang/Driver/CC1Options.td
include/clang/Frontend/CodeGenOptions.def
lib/CodeGen/CGExpr.cpp
lib/Frontend/CompilerInvocation.cpp
test/CodeGenOpenCL/preserve_vec3.cl
41 matches
Mail list logo