I'm trying to understand the binary encoding as explained 
in https://developers.google.com/protocol-buffers/docs/encoding.

I have the following schema that models a GitHub user as per the GitHub API:

syntax = "proto3";

message GitHubUser {
  string login = 1;
  uint32 id = 2;
  string node_id = 3;
  string avatar_url = 4;
  string gravatar_id = 5;
  string url = 6;
  string html_url = 7;
  string followers_url = 8;
  string following_url = 9;
  string gists_url = 10;
  string starred_url = 11;
  string subscriptions_url = 12;
  string organizations_url = 13;
  string repos_url = 14;
  string events_url = 15;
  string received_events_url = 16;
  string type = 17;
  bool site_admin = 18;
}

And the following JSON document:

{
  "login": "octocat",
  "id": 1,
  "node_id": "MDQ6VXNlcjE=",
  "avatar_url": "https://github.com/images/error/octocat_happy.gif",
  "gravatar_id": "",
  "url": "https://api.github.com/users/octocat",
  "html_url": "https://github.com/octocat",
  "followers_url": "https://api.github.com/users/octocat/followers",
  "following_url": "https://api.github.com/users/octocat/following{/other_user}",
  "gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
  "starred_url": "https://api.github.com/users/octocat/starred{/owner}{/repo}",
  "subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
  "organizations_url": "https://api.github.com/users/octocat/orgs",
  "repos_url": "https://api.github.com/users/octocat/repos",
  "events_url": "https://api.github.com/users/octocat/events{/privacy}",
  "received_events_url": "https://api.github.com/users/octocat/received_events",
  "type": "User",
  "site_admin": false
}

The payload produced by the Python library is the following:

00000000: 0a07 6f63 746f 6361 7410 011a 0c4d 4451  ..octocat....MDQ
00000010: 3656 584e 6c63 6a45 3d22 3168 7474 7073  6VXNlcjE="1https
00000020: 3a2f 2f67 6974 6875 622e 636f 6d2f 696d  ://github.com/im
00000030: 6167 6573 2f65 7272 6f72 2f6f 6374 6f63  ages/error/octoc
00000040: 6174 5f68 6170 7079 2e67 6966 3224 6874  at_happy.gif2$ht
00000050: 7470 733a 2f2f 6170 692e 6769 7468 7562  tps://api.github
00000060: 2e63 6f6d 2f75 7365 7273 2f6f 6374 6f63  .com/users/octoc
00000070: 6174 3a1a 6874 7470 733a 2f2f 6769 7468  at:.https://gith
00000080: 7562 2e63 6f6d 2f6f 6374 6f63 6174 422e  ub.com/octocatB.
00000090: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
000000a0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
000000b0: 6f63 6174 2f66 6f6c 6c6f 7765 7273 4a3b  ocat/followersJ;
000000c0: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
000000d0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
000000e0: 6f63 6174 2f66 6f6c 6c6f 7769 6e67 7b2f  ocat/following{/
000000f0: 6f74 6865 725f 7573 6572 7d52 3468 7474  other_user}R4htt
00000100: 7073 3a2f 2f61 7069 2e67 6974 6875 622e  ps://api.github.
00000110: 636f 6d2f 7573 6572 732f 6f63 746f 6361  com/users/octoca
00000120: 742f 6769 7374 737b 2f67 6973 745f 6964  t/gists{/gist_id
00000130: 7d5a 3b68 7474 7073 3a2f 2f61 7069 2e67  }Z;https://api.g
00000140: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
00000150: 6f63 746f 6361 742f 7374 6172 7265 647b  octocat/starred{
00000160: 2f6f 776e 6572 7d7b 2f72 6570 6f7d 6232  /owner}{/repo}b2
00000170: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
00000180: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
00000190: 6f63 6174 2f73 7562 7363 7269 7074 696f  ocat/subscriptio
000001a0: 6e73 6a29 6874 7470 733a 2f2f 6170 692e  nsj)https://api.
000001b0: 6769 7468 7562 2e63 6f6d 2f75 7365 7273  github.com/users
000001c0: 2f6f 6374 6f63 6174 2f6f 7267 7372 2a68  /octocat/orgsr*h
000001d0: 7474 7073 3a2f 2f61 7069 2e67 6974 6875  ttps://api.githu
000001e0: 622e 636f 6d2f 7573 6572 732f 6f63 746f  b.com/users/octo
000001f0: 6361 742f 7265 706f 737a 3568 7474 7073  cat/reposz5https
00000200: 3a2f 2f61 7069 2e67 6974 6875 622e 636f  ://api.github.co
00000210: 6d2f 7573 6572 732f 6f63 746f 6361 742f  m/users/octocat/
00000220: 6576 656e 7473 7b2f 7072 6976 6163 797d  events{/privacy}
00000230: 8201 3468 7474 7073 3a2f 2f61 7069 2e67  ..4https://api.g
00000240: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
00000250: 6f63 746f 6361 742f 7265 6365 6976 6564  octocat/received
00000260: 5f65 7665 6e74 738a 0104 5573 6572       _events...User

Notice that the "type" string property in the schema has field number 17 
and wire type 2 (length-delimited). So the key is "(17 << 3) | 2", which 
equals 138, or "8a" in hex. The value is 4 bytes long.
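
The key arithmetic above can be checked with a short sketch. The helper name 
field_key and the two constants are made up for illustration and are not part 
of the protobuf Python library. The sketch only computes the numeric key 
value from a field number and wire type; it says nothing about how that 
number is laid out as bytes on the wire.

```python
# Wire type numbers as listed in the protobuf encoding guide.
WIRE_VARINT = 0            # int32, uint32, bool, ...
WIRE_LENGTH_DELIMITED = 2  # string, bytes, embedded messages


def field_key(field_number: int, wire_type: int) -> int:
    """Return the numeric key (field_number << 3) | wire_type.

    Hypothetical helper for illustration only: it returns the key as a
    plain integer, not as encoded bytes.
    """
    return (field_number << 3) | wire_type


# A few fields from the GitHubUser schema: (name, field number, wire type).
fields = [
    ("login", 1, WIRE_LENGTH_DELIMITED),
    ("id", 2, WIRE_VARINT),
    ("events_url", 15, WIRE_LENGTH_DELIMITED),
    ("received_events_url", 16, WIRE_LENGTH_DELIMITED),
    ("type", 17, WIRE_LENGTH_DELIMITED),
    ("site_admin", 18, WIRE_VARINT),
]

for name, number, wire_type in fields:
    key = field_key(number, wire_type)
    print(f"{name:<20} field {number:>2}  key = {key:>3} (0x{key:02x})")
```

The values for fields 1, 2 and 15 (0x0a, 0x10, 0x7a) match the bytes that 
precede those fields in the dump above.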

I would expect "type" to be encoded as: 

8a 04 5573 6572

However, it is being encoded as:

8a 0104 5573 6572

Where is the "01" after the key coming from? What does it mean? I can't 
find any documentation about it. Is it just padding? If so, for what 
reason? I see the same "01" in field 16, but it's not present in any 
other field.
 

