On Mon, Nov 12, 2018 at 12:46 PM Michael Powell <[email protected]> wrote:
>
> On Mon, Nov 12, 2018 at 10:06 AM Michael Powell <[email protected]> wrote:
> >
> > Hello,
> >
> > Another question following up, how about the sign character for hex
> > and oct integers? Is it necessary, should it be discarded?
> >
> > intLit = decimalLit | octalLit | hexLit
> > decimalLit = ( "1" … "9" ) { decimalDigit }
> > octalLit = "0" { octalDigit }
> > hexLit = "0" ( "x" | "X" ) hexDigit { hexDigit }
> >
> > constant = fullIdent | ( [ "-" | "+" ] intLit ) | ( [ "-" | "+" ]
> > floatLit ) | strLit | boolLit
> >
> > https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#integer_literals
> > https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#constant
> >
> > For instance, I am fairly certain the sign character is encoded in a
> > hex encoded integer. Not sure about octal, but I imagine that it is
> > fairly consistent.

Got it sorted out, I believe. The parser support that Spirit provides is
actually quite nice and aligns almost perfectly with the grammar
specification. There's a bit of gymnastics involved in juggling whether
the AST has a sign or not, and so forth, but other than that, it flows
well enough.

> > Case in point, the value 107026150751750362 gets encoded as
> > 0X17C3BB7913C48DA (upper-case). Whereas its negative counterpart,
> > -107026150751750362, really does get encoded as 0xFE83C4486EC3B726.
> > Signage included, if memory serves.
> >
> > In these cases, I think the sign bit falls in the "optional" category?
>
> So... As far as I can determine, there are a couple of ways to
> interpret this, semantically speaking. But this potentially informs
> whatever parsing stack you are using as well.
>
> I'm using Boost Spirit Qi, for instance, which supports radix-based
> integer parsing well enough, but has its own set of issues when
> dealing with signage. That being said...
>
> 1. Treat the value itself as positive one way or another, with an
> optional sign attribute (i.e. '+' or '-'). This would potentially
> work, especially when there is base 16 (hex) or base 8 (octal)
> involved.
>
> 2. Otherwise, I'm open to suggestions, given the Qi constraints: as
> far as I know, Qi fails to parse negative signed hexadecimal/octal
> encoded values.
>
> Again, kind of a symptom of an imprecise grammar specification. I can
> get a sense for how to handle it, but does it truly capture "intent"?
>
> Thanks in advance for any light that can be shed.
>
> > Cheers, thanks,
> >
> > Michael
> > On Sun, Nov 11, 2018 at 10:56 AM Josh Humphries <[email protected]>
> > wrote:
> > >
> > > For the case of zero by itself, per the spec, it will be parsed as an
> > > octal literal with value zero -- so functionally equivalent to a decimal
> > > literal with value zero. And for values with multiple digits, a leading
> > > zero means it is an octal literal. Decimal values will not have a leading
> > > zero.
> > >
> > > ----
> > > Josh Humphries
> > > [email protected]
> > >
> > >
> > > On Sat, Nov 10, 2018 at 10:16 PM Michael Powell <[email protected]>
> > > wrote:
> > >>
> > >> Hello,
> > >>
> > >> I think 0 can be a decimal-lit, don't you think? However, the spec
> > >> reads as follows:
> > >>
> > >> intLit = decimalLit | octalLit | hexLit
> > >> decimalLit = ( "1" … "9" ) { decimalDigit }
> > >> octalLit = "0" { octalDigit }
> > >> hexLit = "0" ( "x" | "X" ) hexDigit { hexDigit }
> > >>
> > >> Is there a reason, semantically speaking, why decimal must be greater
> > >> than 0? And that's not including a plus/minus sign when you factor in
> > >> constants.
> > >>
> > >> Of course, when parsing, order matters, much as with the escape
> > >> character phrases in the string-literal:
> > >>
> > >> hex-lit | oct-lit | dec-lit
> > >>
> > >> And so on, since you have to rule out 0x\d+ for hex, followed by 0\d* ...
> > >>
> > >> Actually, now that I look at it, "0" (really, "decimal" 0) is lurking
> > >> in the oct-lit phrase.
> > >>
> > >> Kind of a grammatical nit-pick, I know, but I just wanted to be clear
> > >> here. Seems like a possible source of confusion if you aren't paying
> > >> careful attention.
> > >>
> > >> Thoughts?
> > >>
> > >> Best regards,
> > >>
> > >> Michael Powell
> > >>
> > >> --
> > >> You received this message because you are subscribed to the Google
> > >> Groups "Protocol Buffers" group.
> > >> To unsubscribe from this group and stop receiving emails from it, send
> > >> an email to [email protected].
> > >> To post to this group, send email to [email protected].
> > >> Visit this group at https://groups.google.com/group/protobuf.
> > >> For more options, visit https://groups.google.com/d/optout.
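
For anyone following along, here is a minimal sketch of option 1 from the
thread (keep the magnitude non-negative and carry the sign as a separate,
optional attribute). It is plain C++17 with a hand-rolled scanner, not the
Spirit Qi grammar itself, and every name in it (IntConstant, digit_value,
parse_int_constant, to_bits) is hypothetical and made up for illustration.
It assumes 64-bit values and does no range check on the signed result.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical AST node: the sign is kept apart from the magnitude.
struct IntConstant {
    std::optional<char> sign;  // '+' or '-', empty when no sign was written
    int radix;                 // 8, 10, or 16
    std::uint64_t magnitude;   // value of the digits alone, never negative
};

// Value of one digit in the given radix, or -1 if c is not such a digit.
inline int digit_value(char c, int radix) {
    assert(radix == 8 || radix == 10 || radix == 16);
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return (v >= 0 && v < radix) ? v : -1;
}

// Parses [ "-" | "+" ] intLit, trying hexLit, then octalLit, then decimalLit.
// Returns std::nullopt on malformed input or if the digits overflow 64 bits.
inline std::optional<IntConstant> parse_int_constant(std::string_view s) {
    IntConstant out{std::nullopt, 10, 0};
    if (!s.empty() && (s.front() == '+' || s.front() == '-')) {
        out.sign = s.front();
        s.remove_prefix(1);
    }
    if (s.empty()) return std::nullopt;
    if (s.size() >= 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        out.radix = 16;
        s.remove_prefix(2);
        if (s.empty()) return std::nullopt;  // hexLit needs at least one digit
    } else if (s[0] == '0') {
        out.radix = 8;                       // a lone "0" is octal zero
        s.remove_prefix(1);
    }
    const std::uint64_t radix = static_cast<std::uint64_t>(out.radix);
    for (char c : s) {
        const int d = digit_value(c, out.radix);
        if (d < 0) return std::nullopt;
        const std::uint64_t digit = static_cast<std::uint64_t>(d);
        if (out.magnitude > (UINT64_MAX - digit) / radix) return std::nullopt;
        out.magnitude = out.magnitude * radix + digit;
    }
    return out;
}

// 64-bit two's-complement bit pattern of the signed value.
inline std::uint64_t to_bits(const IntConstant& c) {
    const bool negative = c.sign.has_value() && *c.sign == '-';
    return negative ? (std::uint64_t{0} - c.magnitude) : c.magnitude;
}
```

With the hex value from the thread, parse_int_constant("-0x17C3BB7913C48DA")
yields sign '-', radix 16, and the positive magnitude, and to_bits() on that
result gives 0xFE83C4486EC3B726. So the sign can stay optional in the AST
while the magnitude is always parsed as unsigned.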
