Andrea Bolognani <abolo...@redhat.com> writes:

> On Tue, May 03, 2022 at 09:57:27AM +0200, Markus Armbruster wrote:
>> Andrea Bolognani <abolo...@redhat.com> writes:
>> > I still feel that 1) users of a language SDK will ideally not need to
>> > look at the QAPI schema or wire chatter too often
>>
>> I think the most likely point of contact is the QEMU QMP Reference
>> Manual.
>
> Note that there isn't anything preventing us from including the
> original QAPI name in the documentation for the corresponding Go
> symbol, or even a link to the reference manual.
>
> So we could make jumping from the Go API documentation, which is what
> a Go programmer will be looking at most of the time, to the QMP
> documentation pretty much effortless.
>
>> My point is: a name override feature like the one you propose needs to
>> be used with discipline and restraint.  Adds to reviewers' mental load.
>> Needs to be worth it.  I'm not saying it isn't, I'm just pointing out a
>> cost.
>
> Yeah, I get that.
>
> Note that I'm not suggesting it should be possible for a name to be
> completely overridden - I just want to make it possible for a human
> to provide the name parsing algorithm solutions to those problems it
> can't figure out on its own.
>
> We could prevent that feature from being misused by verifying that
> the symbol the annotation is attached to can be derived from the list
> of words provided. That way, something like
>
>   # SOMEName (completely-DIFFERENT-name)
>
> would be rejected and we would avoid misuse.

Possibly as simple as "down-case both names and drop the funny
characters, result must be the same".

>> Wild idea: assume all lower case, but keep a list of exceptions.
>
> That could actually work reasonably well for QEMU because we only
> need to handle correctly what's in the schema, not arbitrary input.
>
> There's always the risk of the list of exceptions getting out of sync
> with the needs of the schema, but there's similarly no guarantee that
> annotations are going to be introduced when they are necessary, so
> it's mostly a wash.
>
> The only slight advantage of the annotation approach would be that it
> might be easier to notice it being missing because it's close to the
> name it refers to, while the list of exceptions is tucked away in a
> script far away from it.

We'd put it in qapi/pragma.json, I guess.

>> The QAPI schema language uses three naming styles:
>>
>>     * lower-case-with-hyphens for command and member names
>>
>>       Many names use upper case and '_'.  See pragma command-name-exceptions
>>       and member-name-exceptions.
>
> Looking at the output generated by Victor's WIP script, it looks like
> these are already handled as nicely as those that don't fall under
> any exception.
>
>>       Some (many?) names lack separators between words (example: logappend).

How many would be good to know.  Ad hoc hackery to find names, filter
out camels (because word splitting is too hard there), split into words,
look up words in a word list:

    $ for i in `/usr/bin/python3 /work/armbru/qemu/scripts/qapi-gen.py -o qapi -b ../qapi/qapi-schema.json | sort -u | awk '/^### [a-z0-9-]+$/ { print "lc", $2; next } /^### [a-z0-9_-]+$/ { print lu; next } /^### [A-Z0-9_]+$/ { print "uc", $2; next } /^### ([A-Z][a-z]+)+/ { print "cc", $2; next } { print "mc", $2 }' | sed '/^mc\|^cc/d;s/^.. //;s/[^A-Za-z0-9]/\n/g' | tr A-Z a-z | sort -u`; do grep -q "^$i$" /usr/share/dict/words || echo $i; done

420 lines.  How many arguably lack separators between words?
Wild guess based on glancing at the output sideways: some 50.

>>     * UPPER_CASE_WITH_UNDERSCORE for event names
>>
>>     * CamelCase for type names
>>
>>       Capitalization of words is inconsistent in places (example: VncInfo
>>       vs. DisplayReloadOptionsVNC).
>>
>> What style conversions will we need for Go?  Any other conversions come
>> to mind?
>>
>> What problems do these conversions have?
>
> Go uses CamelCase for pretty much everything: types, methods,
> constants...
>
>   There's one slight wrinkle, in that the case of the first letter
>   decides whether it's going to be a PublicName or a privateName. We
>   can't do anything about that, but it shouldn't really affect us
>   that much because we'll want all QAPI names to be public.
>
> So the issues preventing us from producing a "perfect" Go API are
>
>   1. inconsistent capitalization in type names
>
>        -> could be addressed by simply changing the schema, as type
>           names do not travel on the wire

At the price of some churn in C code.

Perhaps more consistent capitalization could be regarded as a slight
improvement on its own.  We need to see (a good sample of) the changes
to judge.

>   2. missing dashes in certain command/member names
>
>        -> leads to Incorrectcamelcase.

Names with words run together are arguably no uglier in CamelCase (Go)
than in lower_case_with_underscores (C).

>                                        Kevin's work is supposed to
>           address this

Except it's stuck.  Perhaps Kevin and I can get it moving again.

Perhaps we can try to extract a local alias feature that can be grown
into the more ambitious aliases Kevin wants (if we can solve the
issues).

>   3. inability to know which parts of a lower-case-name or
>      UPPER_CASE_NAME are acronyms or are otherwise supposed to be
>      capitalized in a specific way
>
>        -> leads to WeirdVncAndDbusCapitalization. There's currently no
>           way, either implemented or planned, to avoid this

A list of words with special capitalization needs[*]?

VNC is an acronym, some languages want VNC in camels, some Vnc.
DBus is an abbreviation, some languages want DBus in camels, some Dbus.

> In addition to these I'm also thinking that QKeyCode and all the
> QCrypto stuff should probably lose their prefixes.

As Daniel pointed out, schema names sometimes have prefixes because we
need the generated C identifiers to have prefixes.

If we hate these prefixes enough, we can try to limit them to C
identifiers.

> Note that 3 shouldn't be an issue for Rust and addressing 1 would
> actually make things worse for that language, because at the moment
> at least *some* of the types follow its expected naming rules :)

Solving Go problems by creating Rust problems doesn't feel like a good
move to me.

>> > Revised proposal for the annotation:
>> >
>> >   ns:word-WORD-WoRD-123Word
>> >
>> > Words are always separated by dashes; "regular" words are entirely
>> > lowercase, while the presence of even a single uppercase letter in a
>> > word denotes the fact that its case should be preserved when the
>> > naming conventions of the target language allow that.
>>
>> Is a word always capitalized the same for a single target language?  Or
>> could capitalization depend on context?
>
> I'm not aware of any language that would adopt more than a single
> style of capitalization, outside of course the obvious
> lower_case_name or UPPER_CASE_NAME scenarios where the original
> capitalization stops being relevant.

Makes sense.


[*] Sounds like crony capitalism, doesn't it :)
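
For illustration only, here is a rough Python sketch of two ideas from
this thread: the sanity check for name annotations ("down-case both
names and drop the funny characters, result must be the same"), and
CamelCase conversion driven by a list of words with special
capitalization needs.  Nothing here exists in scripts/qapi: the names
normalize(), annotation_matches(), to_camel() and SPECIAL_WORDS are
made up, and the two entries in the exception list are just the VNC
and DBus examples from above.

```python
import re

# Hypothetical exception list: words needing special capitalization,
# keyed by their lower-case form.  A real list would have to be derived
# from the schema.
SPECIAL_WORDS = {"vnc": "VNC", "dbus": "DBus"}


def normalize(name):
    """Down-case a name and drop everything but letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def annotation_matches(symbol, annotation):
    """Accept an annotation only if it spells the same name as the symbol."""
    return normalize(symbol) == normalize(annotation)


def to_camel(name, special=SPECIAL_WORDS):
    """Convert lower-case-with-hyphens or UPPER_CASE_WITH_UNDERSCORE to
    CamelCase, consulting the exception list for each word."""
    words = [w for w in re.split(r"[-_]", name) if w]
    return "".join(special.get(w.lower(), w.lower().capitalize())
                   for w in words)
```

With this, annotation_matches("SOMEName", "completely-DIFFERENT-name")
is rejected, while "some-Name" is accepted for the same symbol.  Names
with words run together are not split: to_camel("logappend") still
yields "Logappend", so the exception list only helps with issue 3, not
issue 2.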