Hey Zhengrong,
Thanks for getting started on this! I've also cc'd Miquel at BSC who
has implemented many of the x86 vector instructions. Miquel, it would
be great to get your input here!
As far as getting input from AMD folks... I think this is going to be
a tough thing for them to weigh in on due to IP issues. This is
getting a bit too close to their products :). They can correct me if
I'm wrong!
To answer your questions:
> - Design of the vector register file. My implementation directly follows
>   the SSE instructions to minimize the work. Is there any better way to do
>   this?
I agree with Gabe. This is the best approach for now.
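For anyone following along, here is a minimal sketch of the layout being discussed: one 512-bit register stored as eight 64-bit chunks, with a packed single-precision add done over all of them in one step. This is only an illustration and not gem5 code. The names `pack_ps`, `unpack_ps`, and `vaddps` are made up for the example, and little-endian lane order is an assumption.

```python
import struct

CHUNKS = 8   # eight 64-bit chunks make up one 512-bit register
LANES = 16   # sixteen 32-bit single-precision lanes


def pack_ps(values):
    """Pack 16 floats into eight 64-bit integer chunks (little-endian lanes)."""
    if len(values) != LANES:
        raise ValueError("expected 16 single-precision lanes")
    raw = struct.pack("<16f", *values)   # rounds each value to single precision
    return list(struct.unpack("<8Q", raw))


def unpack_ps(chunks):
    """Reinterpret eight 64-bit chunks as 16 single-precision lanes."""
    if len(chunks) != CHUNKS:
        raise ValueError("expected eight 64-bit chunks")
    raw = struct.pack("<8Q", *chunks)
    return list(struct.unpack("<16f", raw))


def vaddps(a_chunks, b_chunks):
    """Lane-wise add across all eight chunks, modeled as a single operation.

    Sketch only: a sum that overflows single precision raises OverflowError
    in struct.pack instead of producing infinity as real hardware would.
    """
    a = unpack_ps(a_chunks)
    b = unpack_ps(b_chunks)
    return pack_ps([x + y for x, y in zip(a, b)])
```

Keeping the storage as 64-bit chunks stays consistent with how the existing SSE code treats registers, while the add itself is still modeled as one step rather than eight.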
> - If I am going to merge my code, what is a good submission plan? I am
>   thinking about first committing the skeleton code with a simple 'vaddps'
>   instruction, and then the other instructions.
That sounds good to me. If you think the whole set of changes should
be reviewed together or there's no way to split things apart and
still be understandable, we can create a feature branch for you. That
said, since this is mostly just adding one instruction and doesn't
touch too much outside of the ISA implementation, just breaking it up
that way will probably work.
> - Testing: This is probably the most important one. Currently, I manually
>   test my code by simulating small programs. What is the best way to write
>   tests for new instructions? Should I try unit testing or binary testing?
If you could submit your programs to the gem5-resources repo, we can
build the binaries and then distribute them for anyone to use for
testing.
I think that works well.
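To make the "small program plus golden reference" idea from Gabe's mail a bit more concrete, here is a rough sketch of how the expected output for a 'vaddps' test could be generated and compared. It is not gem5's test framework. The output format (one line per lane, printed as raw hex bits) and the function names are assumptions made up for the example.

```python
import struct


def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def golden_vaddps(a, b):
    """Expected output lines for a hypothetical test program that prints
    each result lane of a packed single-precision add as hex bits."""
    lines = []
    for lane, (x, y) in enumerate(zip(a, b)):
        total = f32(f32(x) + f32(y))
        bits = struct.unpack("<I", struct.pack("<f", total))[0]
        lines.append(f"lane {lane:2d}: 0x{bits:08x}")
    return lines


def matches_golden(program_output, golden_lines):
    """Compare captured stdout with the reference, ignoring trailing spaces."""
    actual = [line.rstrip() for line in program_output.strip().splitlines()]
    return actual == golden_lines
```

Printing the raw bits instead of decimal values keeps the comparison independent of how a given C library formats floats.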
Cheers,
Jason
On Sun, May 31, 2020 at 10:14 PM Gabe Black via gem5-dev <[email protected]> wrote:
https://docs.google.com/document/d/1O_u_Xq14TgreYThuZcbM3kuXFCrKvaFHA2O9poCeHSk/edit#heading=h.r067bn3rmydo
On Sun, May 31, 2020 at 9:31 PM Zhengrong Wang via gem5-dev <[email protected]> wrote:
> Hi Gabe,
>
> Thanks for your reply. For the vector register file, I agree it is probably
> a better idea to stick with the current approach; at least it does not
> require changing the SSE instructions. I could not find your plan to
> redesign the register handling mechanism. If you could provide a link, I
> would be interested in taking a look to better understand the philosophy
> behind the design.
>
> Let's hear from AMD first, as they have more insight into the microops. If
> everything turns out well, I can start to refactor the code into smaller
> commits and add tests for them.
>
> *王 钲 荣*
>
> Zhengrong Wang
> Computer Science Department
> University of California, Los Angeles
> California, USA
> 90024
>
> Work Email: [email protected]
> Mobile: +1 310-447-4568
>
> On Sun, May 31, 2020 at 7:44 PM Gabe Black via gem5-dev <[email protected]> wrote:
>
> > Hi Sean. I'm not aware of anyone working on AVX-512, but it would be nice
> > if the AMD folks could chime in and confirm that. The x86 microcode was
> > originally based off of the microcode for the K6 as described in a patent.
> > The floating point parts of that patent were very vague and hand wavy, so I
> > more or less made up the initial part. It would be nice for the AMD folks
> > to chime in here too, as far as what's realistic for the design of the
> > microops.
> >
> > As far as testing, we don't have a great scheme for testing individual
> > instructions right now, but that would be really valuable to have in the
> > long run. I've thought a bit about how that might work, but I don't have a
> > plan at the moment. The best thing to do right now is probably to have
> > small programs that execute the instructions in question and print their
> > inputs/outputs and/or check that the outputs are correct. I think our
> > testing framework has a way to check that program output matches a golden
> > reference, and that could be used to delegate correctness checking to the
> > framework. Bobby can probably give more details here.
> >
> > As far as the registers, my preference for now is to do what you did and
> > treat each 64 bit chunk as its own register. There are real drawbacks to
> > this approach, but the existing solution to them, a vector register file,
> > has other, in my opinion more serious, drawbacks. A while ago I put
> > together a manifesto about how I'd want to redo the whole register handling
> > mechanism in gem5, but unfortunately I haven't had time to actually
> > implement very much of it. By treating larger registers as groups of
> > smaller registers, you'd be consistent with the rest of the x86 code as it
> > stands right now. That, and the fact that I think that's the lesser of two
> > evils, makes that my preferred way to go.
> >
> > As far as submitting code, there are instructions on the gem5 website for
> > creating and submitting reviews. We use gerrit, and so in addition to the
> > instructions we provide, you should be able to find pretty good/complete
> > instructions out on the internet to explain the mechanism of sending out a
> > review. For this or any other change, you'd want to break up your work into
> > logical chunks where everything works before and after any given change,
> > and then send them out (perhaps all together in a series) for review.
> > Exactly how to break things up is up to you, but my opinion is that each
> > change should be logically complete but also about one thing. That makes it
> > easier for a reviewer to wrap their head around what you're doing and how
> > it works without having to untangle multiple things going on at once, or
> > having to merge multiple reviews together in their head to see the whole
> > change they're reviewing. If there are lots of related small changes (many
> > individual instructions for instance) it might make sense to do one or two
> > by themselves first, and then once the kinks are worked out to do a larger
> > change with the rest, applying the pattern from the earlier reviews.
> >
> > Gabe
> >
> > On Sun, May 31, 2020 at 4:18 PM Sean Wong via gem5-dev <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > This is my first time posting here, so apologies if I made any mistakes.
> > >
> > > The last time I checked the develop branch, gem5 did not yet support
> > > AVX-512, and searching the mailing list I do not see any plan for it. Is
> > > there any ongoing development to support it? If not, I am happy to
> > > contribute my code. During my research, I have developed partial support
> > > for AVX-512 (and AVX-256 as a by-product), which I hope will be useful to
> > > others.
> > >
> > > My implementation so far is a straightforward extension of the existing
> > > SSE instructions. To summarize it:
> > >
> > > - Like the SSE implementation, each 512-bit register is broken into eight
> > >   64-bit sub-registers. This may not be a good design. Any suggestions are
> > >   welcome.
> > > - Unlike the SSE implementation, most instructions are decoded into a
> > >   single microop. For example, a 512-bit 'vaddps' is decoded into one
> > >   'vaddf' microop instead of eight.
> > > - Currently, it supports common arithmetic instructions (add, mul, etc.)
> > >   and basic data movement (load, store, mov, extract, insert, etc.).
> > > - No support for masking.
> > >
> > > If you guys are interested, I am willing to clean up my code and submit it
> > > for review. I may need some guidance on:
> > >
> > > - Design of the vector register file. My implementation directly follows
> > >   the SSE instructions to minimize the work. Is there any better way to do
> > >   this?
> > > - If I am going to merge my code, what is a good submission plan? I am
> > >   thinking about first committing the skeleton code with a simple 'vaddps'
> > >   instruction, and then the other instructions.
> > > - Testing: This is probably the most important one. Currently, I manually
> > >   test my code by simulating small programs. What is the best way to write
> > >   tests for new instructions? Should I try unit testing or binary testing?
> > >
> > > Thank you for reading this long post. Any feedback is welcome.
> > >
> > > *王 钲 荣*
> > >
> > > Zhengrong Wang
> > > Computer Science Department
> > > University of California, Los Angeles
> > > California, USA
> > > 90024
> > >
> > > Work Email: [email protected]
> > > Mobile: +1 310-447-4568
> > > _______________________________________________
> > > gem5-dev mailing list -- [email protected]
<mailto:[email protected]>
> > > To unsubscribe send an email to [email protected]
<mailto:[email protected]>