Hi Sean.

I'm not aware of anyone working on AVX-512, but it would be nice if the AMD folks could chime in and confirm that. The x86 microcode was originally based on the microcode for the K6 as described in a patent. The floating point parts of that patent were very vague and hand-wavy, so I more or less made up the initial part. It would be nice for the AMD folks to chime in here too, as far as what's realistic for the design of the microops.

As far as testing, we don't have a great scheme for testing individual instructions right now, but that would be really valuable to have in the long run. I've thought a bit about how that might work, but I don't have a plan at the moment. The best thing to do right now is probably to have small programs that execute the instructions in question and print their inputs/outputs and/or check that the outputs are correct. I think our testing framework has a way to check that program output matches a golden reference, and that could be used to delegate correctness checking to the framework. Bobby can probably give more details here.

As far as the registers, my preference for now is to do what you did and treat each 64-bit chunk as its own register. There are real drawbacks to this approach, but the existing solution to them, a vector register file, has other, in my opinion more serious, drawbacks. A while ago I put together a manifesto about how I'd want to redo the whole register handling mechanism in gem5, but unfortunately I haven't had time to actually implement very much of it. By treating larger registers as groups of smaller registers, you'd be consistent with the rest of the x86 code as it stands right now. That, and the fact that I think that's the lesser of two evils, makes that my preferred way to go.

As far as submitting code, there are instructions on the gem5 website for creating and submitting reviews. We use Gerrit, so in addition to the instructions we provide, you should be able to find pretty good/complete instructions out on the internet to explain the mechanism of sending out a review. For this or any other change, you'd want to break up your work into logical chunks where everything works before and after any given change, and then send them out (perhaps all together in a series) for review. Exactly how to break things up is up to you, but my opinion is that each change should be logically complete but also about one thing.
That makes it easier for a reviewer to wrap their head around what you're doing and how it works without having to untangle multiple things going on at once, or having to merge multiple reviews together in their head to see the whole change they're reviewing. If there are lots of related small changes (many individual instructions, for instance), it might make sense to do one or two by themselves first, and then once the kinks are worked out to do a larger change with the rest, applying the pattern from the earlier reviews.

Gabe

On Sun, May 31, 2020 at 4:18 PM Sean Wong via gem5-dev <gem5-dev@gem5.org> wrote:

> Hello,
>
> This is my first time posting here, so apologies if I made any mistakes.
>
> The last time I checked the develop branch, gem5 has not yet supported the
> AVX512. And searching the mail list I do not see any plan for that. Is
> there any ongoing development to support that? If not, I am happy to
> contribute my code. During my research, I have developed partial support
> for AVX512 (and AVX-256 as a by-product), which I hope would be useful for
> others.
>
> My implementation so far is a straightforward extension to the existing SSE
> instructions. To summarize it:
>
> - Like SSE implementation, the 512-bit register is broken into 8 64-bit
> sub-register. This may not be a good design. Any suggestions are welcome.
> - Unlike SSE implementation, most of the instructions are broken into a
> single microop. For example, a 512-bit 'vaddps' is decoded into one 'vaddf'
> microop instead of eight.
> - Currently, it supports common arithmetic instructions (add, mul, etc.)
> and basic data movement (load, store, mov, extract, insert, etc.).
> - No support for masking.
>
> If you guys are interested, I am willing to clean my code and submit for
> review. I may need some guidance on:
>
> - Design of the vector register file. My implementation directly follows
> the SSE instructions to minimize the work. Is there any better way to do
> this?
> - If I am going to merge my code, what is a good submission plan? I am
> thinking about first committing the skeleton code with a simple 'vaddps'
> instruction, and then for other instructions.
> - Testing: This is probably the most important one. Currently, I manually
> test my code by simulating small programs. What is the best way to write
> tests for new instructions? Should I try unit testing for binary testing?
>
> Thank you for reading this long post. Any feedback is welcome.
>
> *王 钲 荣*
>
> Zhengrong Wang
> Computer Science Department
> University of California, Los Angeles
> California, USA
> 90024
>
> Work Email: seanyukig...@gmail.com
> Mobile: +1 310-447-4568
> _______________________________________________
> gem5-dev mailing list -- gem5-dev@gem5.org
> To unsubscribe send an email to gem5-dev-le...@gem5.org
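
To illustrate two suggestions from the reply above (treating a wide register as a group of 64-bit registers, and testing with a small program that prints its inputs and outputs and checks the result), here is a minimal Python sketch. It is not gem5 code: every name in it (pack_chunk, vaddps_512, run_check, and so on) is hypothetical, and it models a packed single-precision add in plain Python rather than through gem5's microops or its testing framework. The test values are chosen to be exactly representable in single precision, so an exact comparison is safe here.

```python
import struct

# Assumption for illustration: a 512-bit register is modeled as eight
# 64-bit integer chunks, each holding two single-precision floats.
CHUNKS_PER_REG = 8
FLOATS_PER_CHUNK = 2
FLOATS_PER_REG = CHUNKS_PER_REG * FLOATS_PER_CHUNK


def pack_chunk(lo, hi):
    """Pack two single-precision floats into one 64-bit integer chunk."""
    return struct.unpack("<Q", struct.pack("<2f", lo, hi))[0]


def unpack_chunk(chunk):
    """Split a 64-bit chunk back into its two single-precision floats."""
    return struct.unpack("<2f", struct.pack("<Q", chunk))


def to_chunks(values):
    """Turn 16 floats into the eight 64-bit chunks of one modeled register."""
    assert len(values) == FLOATS_PER_REG
    return [pack_chunk(values[i], values[i + 1])
            for i in range(0, FLOATS_PER_REG, FLOATS_PER_CHUNK)]


def from_chunks(chunks):
    """Flatten eight 64-bit chunks back into a list of 16 floats."""
    out = []
    for chunk in chunks:
        out.extend(unpack_chunk(chunk))
    return out


def add_chunk(a, b):
    """Packed single-precision add on one 64-bit chunk."""
    a_lo, a_hi = unpack_chunk(a)
    b_lo, b_hi = unpack_chunk(b)
    return pack_chunk(a_lo + b_lo, a_hi + b_hi)


def vaddps_512(src1, src2):
    """Model a 512-bit packed add as eight independent chunk adds."""
    assert len(src1) == len(src2) == CHUNKS_PER_REG
    return [add_chunk(a, b) for a, b in zip(src1, src2)]


def run_check():
    """Build inputs, run the modeled add, and compute a scalar reference."""
    a = [float(i) for i in range(FLOATS_PER_REG)]
    b = [0.5 * i for i in range(FLOATS_PER_REG)]
    result = from_chunks(vaddps_512(to_chunks(a), to_chunks(b)))
    expected = [x + y for x, y in zip(a, b)]
    return a, b, result, expected


if __name__ == "__main__":
    a, b, result, expected = run_check()
    # Print inputs and outputs so the text could be diffed against a
    # saved reference, and also self-check against the scalar result.
    print("src1:", a)
    print("src2:", b)
    print("dest:", result)
    print("PASS" if result == expected else "FAIL")
```

A real instruction test would instead be a small guest binary that executes the instruction under simulation. The sketch only shows the shape of the check and how sixteen lanes map onto eight 64-bit chunks.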