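What if the field number were 32? (32 << 3) | 2 = 0x102, which won't fit in a byte. The key, i.e. (field_number << 3) | wire_type, is varint-encoded as a whole, not just the field number. A varint byte carries only 7 bits of payload, and its high bit (0x80) is a continuation flag: if it is set, the remaining bits are in the next byte. With field numbers of 16 and up the key is at least 0x80, so it ends up taking 2 (or more) bytes. In your case the key is (17 << 3) | 2 = 0x8a. The first byte holds the low 7 bits (0x0a) plus the continuation flag, which happens to come out as 0x8a again, and the second byte holds the one remaining bit, 0x01. So "8a 01" is the key and "04" is the length; the "01" is not padding. Field 16 gets "82 01" for the same reason. If space is a big concern, use field numbers under 16 for the most-often used data.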
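To make the arithmetic concrete, here is a minimal sketch of the key encoding in plain Python. The helper names (encode_varint, encode_tag, decode_varint) are made up for illustration and are not part of the protobuf Python library; the sketch only assumes the varint rules described above.

```python
LENGTH_DELIMITED = 2  # wire type used for strings


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (hypothetical helper)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation flag: more bytes follow
        else:
            out.append(low7)         # last byte, high bit clear
            return bytes(out)


def encode_tag(field_number, wire_type):
    """Build the key (field_number << 3) | wire_type and varint-encode it."""
    return encode_varint((field_number << 3) | wire_type)


def decode_varint(buf, pos=0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


for number in (15, 16, 17, 32):
    print(number, encode_tag(number, LENGTH_DELIMITED).hex())
# 15 7a
# 16 8201
# 17 8a01
# 32 8202

# The last 7 bytes of the payload in the question: key, length, "User".
tail = bytes.fromhex("8a010455736572")
key, pos = decode_varint(tail)
length, pos = decode_varint(tail, pos)
print(key >> 3, key & 0x7, tail[pos:pos + length])
# 17 2 b'User'
```

Field 15 is the last one whose key fits in a single byte, which matches the "7a" in front of events_url and the "82 01" in front of received_events_url in the dump.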
Cheers,

  -ilia

On Wed, Sep 30, 2020 at 5:04 PM Juan Cruz Viotti <[email protected]> wrote:
>
> I'm trying to understand the binary encoding as explained in
> https://developers.google.com/protocol-buffers/docs/encoding.
>
> I have the following schema that models a GitHub user as per the GitHub API:
>
> syntax = "proto3";
>
> message GitHubUser {
>   string login = 1;
>   uint32 id = 2;
>   string node_id = 3;
>   string avatar_url = 4;
>   string gravatar_id = 5;
>   string url = 6;
>   string html_url = 7;
>   string followers_url = 8;
>   string following_url = 9;
>   string gists_url = 10;
>   string starred_url = 11;
>   string subscriptions_url = 12;
>   string organizations_url = 13;
>   string repos_url = 14;
>   string events_url = 15;
>   string received_events_url = 16;
>   string type = 17;
>   bool site_admin = 18;
> }
>
> And the following JSON document:
>
> {
>   "login": "octocat",
>   "id": 1,
>   "node_id": "MDQ6VXNlcjE=",
>   "avatar_url": "https://github.com/images/error/octocat_happy.gif",
>   "gravatar_id": "",
>   "url": "https://api.github.com/users/octocat",
>   "html_url": "https://github.com/octocat",
>   "followers_url": "https://api.github.com/users/octocat/followers",
>   "following_url":
>     "https://api.github.com/users/octocat/following{/other_user}",
>   "gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
>   "starred_url":
>     "https://api.github.com/users/octocat/starred{/owner}{/repo}",
>   "subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
>   "organizations_url": "https://api.github.com/users/octocat/orgs",
>   "repos_url": "https://api.github.com/users/octocat/repos",
>   "events_url": "https://api.github.com/users/octocat/events{/privacy}",
>   "received_events_url":
>     "https://api.github.com/users/octocat/received_events",
>   "type": "User",
>   "site_admin": false
> }
>
> The produced payload using the Python library is the following:
>
> 00000000: 0a07 6f63 746f 6361 7410 011a 0c4d 4451  ..octocat....MDQ
> 00000010: 3656 584e 6c63 6a45 3d22 3168 7474 7073  6VXNlcjE="1https
> 00000020: 3a2f 2f67 6974 6875 622e 636f 6d2f 696d  ://github.com/im
> 00000030: 6167 6573 2f65 7272 6f72 2f6f 6374 6f63  ages/error/octoc
> 00000040: 6174 5f68 6170 7079 2e67 6966 3224 6874  at_happy.gif2$ht
> 00000050: 7470 733a 2f2f 6170 692e 6769 7468 7562  tps://api.github
> 00000060: 2e63 6f6d 2f75 7365 7273 2f6f 6374 6f63  .com/users/octoc
> 00000070: 6174 3a1a 6874 7470 733a 2f2f 6769 7468  at:.https://gith
> 00000080: 7562 2e63 6f6d 2f6f 6374 6f63 6174 422e  ub.com/octocatB.
> 00000090: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
> 000000a0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
> 000000b0: 6f63 6174 2f66 6f6c 6c6f 7765 7273 4a3b  ocat/followersJ;
> 000000c0: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
> 000000d0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
> 000000e0: 6f63 6174 2f66 6f6c 6c6f 7769 6e67 7b2f  ocat/following{/
> 000000f0: 6f74 6865 725f 7573 6572 7d52 3468 7474  other_user}R4htt
> 00000100: 7073 3a2f 2f61 7069 2e67 6974 6875 622e  ps://api.github.
> 00000110: 636f 6d2f 7573 6572 732f 6f63 746f 6361  com/users/octoca
> 00000120: 742f 6769 7374 737b 2f67 6973 745f 6964  t/gists{/gist_id
> 00000130: 7d5a 3b68 7474 7073 3a2f 2f61 7069 2e67  }Z;https://api.g
> 00000140: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
> 00000150: 6f63 746f 6361 742f 7374 6172 7265 647b  octocat/starred{
> 00000160: 2f6f 776e 6572 7d7b 2f72 6570 6f7d 6232  /owner}{/repo}b2
> 00000170: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
> 00000180: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
> 00000190: 6f63 6174 2f73 7562 7363 7269 7074 696f  ocat/subscriptio
> 000001a0: 6e73 6a29 6874 7470 733a 2f2f 6170 692e  nsj)https://api.
> 000001b0: 6769 7468 7562 2e63 6f6d 2f75 7365 7273  github.com/users
> 000001c0: 2f6f 6374 6f63 6174 2f6f 7267 7372 2a68  /octocat/orgsr*h
> 000001d0: 7474 7073 3a2f 2f61 7069 2e67 6974 6875  ttps://api.githu
> 000001e0: 622e 636f 6d2f 7573 6572 732f 6f63 746f  b.com/users/octo
> 000001f0: 6361 742f 7265 706f 737a 3568 7474 7073  cat/reposz5https
> 00000200: 3a2f 2f61 7069 2e67 6974 6875 622e 636f  ://api.github.co
> 00000210: 6d2f 7573 6572 732f 6f63 746f 6361 742f  m/users/octocat/
> 00000220: 6576 656e 7473 7b2f 7072 6976 6163 797d  events{/privacy}
> 00000230: 8201 3468 7474 7073 3a2f 2f61 7069 2e67  ..4https://api.g
> 00000240: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
> 00000250: 6f63 746f 6361 742f 7265 6365 6976 6564  octocat/received
> 00000260: 5f65 7665 6e74 738a 0104 5573 6572       _events...User
>
> Notice that the "type" string property in the schema has field number 17. So
> the key is "(17 << 3) | 2", which equals "8a". The value length is 4
> characters.
>
> I would expect "type" to be encoded as:
>
> 8a 04 5573 6572
>
> However it is being encoded as:
>
> 8a 0104 5573 6572
>
> Where is the "01" after the key coming from? What does it mean? I can't find
> any documentation about it. Is it just padding? If so, for what reason? I see
> the same "01" in the field 16, but its not present in any other field.

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/protobuf/CAKb7Uvj6nPX4jK%3DWqy6eG9_dVXCJmZxU6dOP6X4jo_tBC687xQ%40mail.gmail.com.
