Re: [protobuf] Spec v2 int-lit snafu?

Michael Powell Mon, 12 Nov 2018 09:47:20 -0800

On Mon, Nov 12, 2018 at 10:06 AM Michael Powell <[email protected]> wrote:
>
> Hello,
>
> A follow-up question: how about the sign character for hex and
> octal integers? Is it necessary, or should it be discarded?
>
> intLit     = decimalLit | octalLit | hexLit
> decimalLit = ( "1" … "9" ) { decimalDigit }
> octalLit   = "0" { octalDigit }
> hexLit     = "0" ( "x" | "X" ) hexDigit { hexDigit }
>
> constant = fullIdent | ( [ "-" | "+" ] intLit ) | ( [ "-" | "+" ]
> floatLit ) | strLit | boolLit
>
> https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#integer_literals
> https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#constant
>
> For instance, I am fairly certain the sign is encoded in the bits of
> a hex-encoded integer. I am not sure about octal, but I imagine it is
> handled consistently.
>
> Case in point, the value 107026150751750362 gets encoded as
> 0X17C3BB7913C48DA (upper-case), whereas its negative counterpart,
> -107026150751750362, really does get encoded as 0xFE83C4486EC3B726,
> sign included, if memory serves.
>
> In these cases, I think the sign character falls in the "optional"
> category?


So... As far as I can determine, there are a couple of ways to
interpret this, semantically speaking. But this potentially informs
whatever parsing stack you are using as well.

I'm using Boost Spirit Qi, for instance, which supports radix-based
integer parsing well enough, but has its own set of issues when
dealing with signs. That being said...

1. Treat the value itself as positive one way or another, with an
optional sign attribute (i.e. '+' or '-'). This would potentially
work, especially when there is base 16 (hex) or base 8 (octal)
involved.

2. Otherwise, I am open to suggestions, subject to Qi's constraints:
as far as I know, Qi fails to parse negative-signed hexadecimal or
octal values.
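
To make option 1 concrete, here is a minimal sketch in Python (chosen
only for brevity; it is not Qi and not how protoc does it). It reads
an optional sign, then an unsigned magnitude in whichever radix the
prefix selects, trying hex, then octal, then decimal. The name
parse_int_constant and the regular expression are my own assumptions
for illustration, not anything defined by the spec.

```python
import re

# Hypothetical helper for illustration; not part of the spec or any library.
# Alternatives are ordered hex | octal | decimal so that "0x..." is not
# swallowed by the octal rule, which also covers a lone "0".
_INT_CONSTANT = re.compile(
    r"""(?P<sign>[+-])?
        (?:(?P<hex>0[xX][0-9a-fA-F]+)
          |(?P<oct>0[0-7]*)
          |(?P<dec>[1-9][0-9]*))\Z""",
    re.VERBOSE,
)


def parse_int_constant(text):
    """Parse [ "-" | "+" ] intLit as sign plus unsigned magnitude."""
    match = _INT_CONSTANT.match(text)
    if match is None:
        raise ValueError(f"not an integer constant: {text!r}")
    if match.group("hex"):
        magnitude = int(match.group("hex")[2:], 16)  # drop the 0x / 0X prefix
    elif match.group("oct"):
        magnitude = int(match.group("oct"), 8)
    else:
        magnitude = int(match.group("dec"), 10)
    # The sign is applied to the magnitude after the fact.
    return -magnitude if match.group("sign") == "-" else magnitude
```

With this reading, "-0x17C3BB7913C48DA" comes out as
-107026150751750362, and "08" is rejected because 8 is not an octal
digit.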

Again, this is kind of a symptom of an imprecise grammar
specification. I can get a sense for how to handle it, but does that
truly capture the "intent"?
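
For what it's worth, the two hex values quoted above are consistent
with a 64-bit two's complement encoding, which is where the ambiguity
comes from. The short Python check below assumes a 64-bit width (my
assumption; the spec text quoted here does not state one) and shows
the two possible readings of the same digits.

```python
value = 107026150751750362
WIDTH = 64                   # assumption: 64-bit two's complement
MASK = (1 << WIDTH) - 1

positive_hex = format(value, "X")
negative_hex = format(-value & MASK, "X")
print(positive_hex)  # 17C3BB7913C48DA
print(negative_hex)  # FE83C4486EC3B726

# Reading 1: the hex digits are an unsigned magnitude.
unsigned_reading = int(negative_hex, 16)
# Reading 2: the hex digits are a two's complement bit pattern.
signed_reading = unsigned_reading - (1 << WIDTH)
print(signed_reading == -value)  # True
```

Under the first reading a separate "-" is required to express the
negative value; under the second the sign already lives in the top
bit, and a leading "-" would be redundant or even contradictory.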

Thanks in advance for any light that can be shed.

> Cheers, thanks,
>
> Michael
> On Sun, Nov 11, 2018 at 10:56 AM Josh Humphries <[email protected]> wrote:
> >
> > For the case of zero by itself, per the spec, it will be parsed as an octal 
> > literal with value zero -- so functionally equivalent to a decimal literal 
> > with value zero. And for values with multiple digits, a leading zero means 
> > it is an octal literal. Decimal values will not have a leading zero.
> >
> > ----
> > Josh Humphries
> > [email protected]
> >
> >
> > On Sat, Nov 10, 2018 at 10:16 PM Michael Powell <[email protected]> 
> > wrote:
> >>
> >> Hello,
> >>
> >> I think 0 can be a decimal-lit, don't you think? However, the spec
> >> reads as follows:
> >>
> >> intLit     = decimalLit | octalLit | hexLit
> >> decimalLit = ( "1" … "9" ) { decimalDigit }
> >> octalLit   = "0" { octalDigit }
> >> hexLit     = "0" ( "x" | "X" ) hexDigit { hexDigit }
> >>
> >> Is there a reason, semantically speaking, why decimal must be greater
> >> than 0? And that's not including a plus/minus sign when you factor in
> >> constants.
> >>
> >> Of course, when parsing, order matters, similar to the escape
> >> character phrases in the string literal:
> >>
> >> hex-lit | oct-lit | dec-lit
> >>
> >> And so on, since you have to rule out 0x\d+ for hex, followed by 0\d* ...
> >>
> >> Actually, now that I look at it "0" (really, "decimal" 0) is lurking
> >> in the oct-lit phrase.
> >>
> >> Kind of a grammatical nit-pick, I know, but I just wanted to be clear
> >> here. Seems like a possible source of confusion if you aren't paying
> >> careful attention.
> >>
> >> Thoughts?
> >>
> >> Best regards,
> >>
> >> Michael Powell
> >>
> >> --
> >> You received this message because you are subscribed to the Google Groups 
> >> "Protocol Buffers" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an 
> >> email to [email protected].
> >> To post to this group, send email to [email protected].
> >> Visit this group at https://groups.google.com/group/protobuf.
> >> For more options, visit https://groups.google.com/d/optout.

