On Wed, Jun 10, 2026 at 05:56:25PM +0200, Nam Cao wrote:
> Charlie Jenkins via B4 Relay
> <[email protected]> writes:
> > From: Charlie Jenkins <[email protected]>
> >
> > Eliminate the need to hand-write riscv instructions by using a shell
> > script to autogenerate a header from an instruction table. This is modeled
> > after the syscall table infrastructure.
> >
> > The table is generated externally by riscv-unified-db [1], but is
> > in a simple format to make it possible to use other tools or modify
> > manually.
> >
> > [1] https://github.com/riscv-software-src/riscv-unified-db
> >
> > Signed-off-by: Charlie Jenkins <[email protected]>
>
> Thanks for the work, I really like the idea. This will make it much
> easier to maintain the instruction stuffs.
>
> > +c.ld common,32 011<13|00<0 imm<3=6-5|12-10 xd!1!3!5!7=4-2 xs1=9-7
> > +c.ld common,64 011<13|00<0 imm<3=6-5|12-10 xd=4-2 xs1=9-7
>
> Not sure if I confuse something, but the spec says "C.LD is an
> RV64C-only instruction". Why do we have 32 here?

This is a weird one. The Zclsd extension introduces it for RV32 [1]. All
of the data is generated from the riscv-unified-db, and because c.ld is
part of the Zclsd extension, the c.ld description [2] includes it for
32-bit.

[1] https://docs.riscv.org/reference/isa/extensions/zilsd/_attachments/riscv-zilsd.pdf
[2] https://github.com/riscv/riscv-unified-db/blob/main/spec/std/isa/inst/C/c.ld.yaml

>
> > +echo "#define COMMA ," >> $outfile
> > +echo "#define SEMICOLON ;" >> $outfile
> > +echo "#define SINGLE_ARG(...) __VA_ARGS__" >> $outfile
>
> Aren't these macro unused?

Yes, thanks. I had them for an earlier version and never removed them.

>
> > +echo >> $outfile
> > +
> > +grep -E "^[a-z\.0-9]+[[:space:]]+" "$infile" | {
> > + while read name base fixed variables; do
> > + echo "/* $name */"
> > +
> > + compressed_name=${name##c.*}
>     ^^^^^^^^^^^^^^^
>     this name is misleading

That's fair, I can rename it to something like "compressed_inst"?

>
> > + invalid_inst_functions=""
> > + variable_params=""
> > + constraints=""
> > + match=""
> > + mask=""
> > + make=""
> > +
> > + # All compressed instructions start with "c."
> > + size=${compressed_name:+32};
> > + size=${size:-16};
> > +
> > + # Replace all . with _
> > + formatted_inst_name=$name
> > + while [ ! ${formatted_inst_name##*.*} ]; do
> > + prefix=${formatted_inst_name%.*}
> > + suffix=${formatted_inst_name##*.}
> > + contains_dot=${formatted_inst_name##*.*}
> > + formatted_inst_name=${contains_dot:-${prefix}_${suffix}}
> > + done
>
> Does the simplier
>     formatted_inst_name=$(echo $name | tr '.' '_')
> work?

That does work, but it is dramatically slower. I was trying to avoid
external programs because this script is called on every compilation and
there are a lot of instructions to parse. On my system, echo/tr is about
10x slower: each iteration goes from about 150us to 1.5ms, and the total
time goes from around 0.8s to around 3.5s.

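For reference, here is a minimal sketch of the builtin-only approach, written
as a standalone function so it can be tried outside the script. The function
name format_inst_name and the test name "a.b.c" are made up for illustration
and are not part of the patch. It uses only POSIX parameter expansion and
`case`, so no process is forked per instruction:

```shell
#!/bin/sh
# Hypothetical helper, not from the patch: replace every '.' in $1 with '_'
# using only shell builtins. The result is left in $formatted_inst_name so
# the caller does not need a command substitution (which would fork).
format_inst_name() {
	formatted_inst_name=$1
	while :; do
		case $formatted_inst_name in
		*.*)
			# Text before the first dot, then '_', then the text after it.
			formatted_inst_name=${formatted_inst_name%%.*}_${formatted_inst_name#*.}
			;;
		*)
			# No dots left, done.
			break
			;;
		esac
	done
}

format_inst_name "c.ld"
echo "$formatted_inst_name"    # prints c_ld

format_inst_name "a.b.c"       # made-up name with more than one dot
echo "$formatted_inst_name"    # prints a_b_c
```

Each pass rewrites the first remaining dot, so the loop runs once per dot in
the name. I have not timed this against the loop in the patch.
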
>
> > + echo "static __always_inline ${type}${size}
> > riscv_insn_${formatted_inst_name}_extract_${variable_name}(u${size}
> > ${insn})"
> > + echo "{"
> > + echo "\treturn ${extract};"
> > + echo "}"
> > + echo "static __always_inline void
> > riscv_insn_${formatted_inst_name}_insert_${variable_name}(u${size}
> > *${insn}, ${type}32 ${var})"
> > + echo "{"
> > + echo "\t*_insn &= ${insert_mask# & };"
>
> Why is this required? Isn't this part always zero at this point?
>
> > + echo "\t*_insn |= ${insert# | };"
> > + echo "}"
> > +
> > + if [ "${only_base}" ]; then
> > + invalid_inst_functions="${invalid_inst_functions}static
> > __always_inline ${type}${size}
> > riscv_insn_${formatted_inst_name}_extract_${variable_name}(u${size}
> > ${insn}) {\n\tpanic(\"${name} is not supported on non ${only_base}-bit
> > systems.\");\n}\n"
>
> Instead of panic(), can we do BUILD_BUG() instead?

That's a better solution :)

- Charlie

>
> Nam

