Hi Jason and others,

We (the PLCT Lab) are delighted to receive such rapid feedback from the community, and we are glad to see so many contributors in the open source community participating in RVV support for gem5. We want to give our thanks and respect to the other contributors. We also hope to work together with the community to advance RVV support faster and better.

I think Zoom meetings are helpful for collaborative development, but our English (especially spoken English) is not very good, so there may be some communication difficulties in a voice meeting. Maybe we can maintain long-term communication over an IM service instead? I think Slack is a good option, but if you have another preferred chat tool, we would be happy to use it too.

We are honored that Jason believes collaboration should be based on our implementation posted on Gerrit. We are currently developing on GitHub (github.com/plctlab/plct-gem5), and PRs are very welcome!

As for instruction support, it is true that the Rivos implementation is more complete than ours at present. We are very willing to cooperate with Rivos to complete the remaining instructions.

As for VLEN configuration, we would like to have some discussion. We believe it is necessary to make VLEN configurable. We found that Rivos has added support for it at compile time, but we think it might be better to support this configuration at runtime (via Python), as in Spike and QEMU. We have not yet found a way to do this. Is it possible in gem5?

Passing vtype/vl via PCState is indeed a better solution if it can support the Timing model without hacking the CPU code. We look forward to further progress on this solution!

We're honored that you praised the test repo (https://github.com/huxuan0307/riscv-vector-tests). However, we think more peer review and approval is needed before integrating it into gem5-resources. At present, the repository is experimental and unstable, and there are still bugs to fix. We're glad if it is helpful for your development.

As next steps, we intend to focus on developing and improving the unit test repository (https://github.com/huxuan0307/riscv-vector-tests) and to continue exploring the implementation of some new instruction formats using microinstructions (such as widening and narrowing instructions).

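To make the runtime VLEN question above more concrete, here is a minimal plain-Python sketch of what a runtime-selected VLEN affects. It is only an illustration: `VectorConfig`, `vlmax`, and `vsetvl` are hypothetical names, not gem5, Spike, or QEMU APIs. It assumes integer LMUL only and uses the simple rule vl = min(AVL, VLMAX), which is one of the behaviors the RVV specification permits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorConfig:
    """Hypothetical runtime vector configuration (not a gem5 class)."""
    vlen: int = 256  # bits per vector register, chosen when the config is built

    def __post_init__(self):
        # This sketch requires VLEN to be a power of two and at least 8 bits.
        if self.vlen < 8 or self.vlen & (self.vlen - 1):
            raise ValueError(f"VLEN must be a power of two >= 8, got {self.vlen}")

    def vlmax(self, sew: int, lmul: int = 1) -> int:
        """VLMAX = LMUL * VLEN / SEW (integer LMUL only in this sketch)."""
        return lmul * self.vlen // sew

    def vsetvl(self, avl: int, sew: int, lmul: int = 1) -> int:
        """Return the new vl using the simple rule vl = min(AVL, VLMAX)."""
        return min(avl, self.vlmax(sew, lmul))


# The same "program" (AVL = 100, SEW = 32) sees a different vl for each VLEN.
for vlen in (128, 256, 512):
    print(vlen, VectorConfig(vlen=vlen).vsetvl(avl=100, sew=32))
```

The point of the sketch is that VLEN only enters through VLMAX, so a value chosen at configuration time does not need to be known when the simulator is compiled.
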
Going forward, we could discuss how to divide the work, so that we avoid duplicating effort in our cooperative development.

Thanks again to all the contributors!

Regards,
Yang Liu

On Sat, May 21, 2022 at 02:32, Jason Lowe-Power <ja...@lowepower.com> wrote:

> Hi everyone,
>
> I'm super excited to see all of the activity around RISC-V vector instructions right now. However, it looks like there are a few different implementations being worked on, and it's a good idea to try to unify around a single implementation and work together to get to a point where everyone in the gem5 community can benefit from this support.
>
> Before going any further, I want to give a huge thanks to everyone that has been working on this and has made contributions to varying different implementations. I'm not going to try to name people (I'm certain I missed some in the cc line!), but I hope everyone knows that we appreciate their contributions to the project!
>
> Before diving into details of the code, if there's interest from the community I can set up a meeting time for us all to get together on zoom to chat about details and the best way to work together.
>
> Looking at the code (https://gem5-review.googlesource.com/c/public/gem5/+/59789) and the documentation (https://docs.google.com/document/d/1yUDPU9NvpKo1WM1WYfdx20_aXLnlHssUUsDYR4lu95Q/edit) recently submitted, I think there are many great things about this approach, and a couple of places that we should discuss potential ways to improve it.
>
> First, I think that using microcode is definitely the right way to enable configurable VLEN and to get timing memory accesses to work. Because of this, I believe that the code posted to gerrit is probably the best starting point for collaboration. Happy to hear other opinions, though.
>
> Note that the Rivos implementation on github (https://github.com/rivosinc/gem5/tree/rivos/dev/joy/initial_RVV_support) does not use microcoded instructions, so it only works in atomic mode. However, I believe this implementation may have more instructions implemented than the one on gerrit. Also, in this implementation the VLEN is a parameter of the ISA which allows users to configure the system dynamically (which is great!). We should try to find a way to merge these two implementations.
>
> Second, we should integrate the tests (https://github.com/huxuan0307/riscv-vector-tests) into gem5-resources ASAP. This is a fabulous contribution! Having tests for vector insts will enable much faster development.
>
> I would like to discuss one design decision in the gerrit code: Specifically how the vtype/vl is set in the decoder. Stalling the decoder to get the correct vtype/vl when vset*vl* is executed doesn't fit well with gem5's execution model, and it feels like a bit of a hack.
>
> I have an alternative proposal that I would like to hear your thoughts on. Instead of storing vtype/vl in the decoder, we could store it in the PCState. Then, the vset*vl* instruction would look a lot like a control instruction. At decode time, the next PC state could be set with some values (maybe wrong values, just like the next pc after a branch may be wrong) or if it is a vsetivli, then the next PC state would have the correct values. Then, the subsequent instructions could access the PC state to get the current vtype/vl.
>
> In the execute stage of the vset*vl*, it would set the next pc state correctly. The CPU models already check to see if the next PC is the same in execute as it was "predicted" in the decode stage (i.e., was the branch predicted correctly). We can leverage this to check to see if vtype/vl are correct. If not, the CPU models will simply squash and re-execute starting at the correct next pc (i.e., the next vector instruction will execute the correct vtype/vl after vset*vl* is executed). If we extend the branch predictor to predict the vtype/vl and use the "last" values, this should be correct a huge percentage of the time. Smarter methods could also be employed.
>
> While this may not be a particularly realistic way to implement a hardware version of RVV and vset*vl*, I think that it's probably the best way to model it in gem5 without creating a separate vector engine object which is decoupled from the CPU model.
>
> We have been working on a proof-of-concept for this here at UC Davis (see https://github.com/darchr/gem5/tree/hn/rvv-uop, though this is untested in timing mode right now). Do you all think this is a good way forward? Or, is there something that I'm missing about the decoder stalling?
>
> Cheers,
> Jason
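
The PCState proposal in the quoted message can be illustrated with a small toy model. This is a plain-Python sketch under stated assumptions, not gem5 code: `VecPCState`, `predict_next`, `execute_vset`, and `run` are hypothetical names, every instruction is treated as a 4-byte vset*vl*, and the predictor is the simple "reuse the last vl/vtype" scheme mentioned in the message. A mismatch between the decode-time guess and the execute-time result is counted as a squash.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VecPCState:
    """Toy PC state that also carries vl and vtype (not gem5's PCState)."""
    pc: int
    vl: int = 0
    vtype: int = 0


def predict_next(cur: VecPCState) -> VecPCState:
    """Decode-time guess: fall through and reuse the "last" vl/vtype."""
    return replace(cur, pc=cur.pc + 4)


def execute_vset(cur: VecPCState, new_vl: int, new_vtype: int) -> VecPCState:
    """Execute-time result of a vset*vl*: the true next PC state."""
    return VecPCState(cur.pc + 4, new_vl, new_vtype)


def run(vsets):
    """Feed a sequence of (vl, vtype) results; return (final state, squashes)."""
    state = VecPCState(pc=0)
    squashes = 0
    for new_vl, new_vtype in vsets:
        predicted = predict_next(state)
        actual = execute_vset(state, new_vl, new_vtype)
        if predicted != actual:
            # Same check as a branch misprediction: younger instructions
            # would be squashed and refetched with the correct vl/vtype.
            squashes += 1
        state = actual
    return state, squashes


# A loop that keeps vl = 8 and only shrinks it for the final tail iteration.
final_state, squashes = run([(8, 0x10), (8, 0x10), (8, 0x10), (4, 0x10)])
print(final_state, squashes)
```

In this trace only the first vset*vl* and the tail iteration mispredict, which matches the expectation in the message that last-value prediction is right most of the time for strip-mined loops.
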

_______________________________________________
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org