I'm trying to understand the binary encoding as explained 
in https://developers.google.com/protocol-buffers/docs/encoding.

I have the following schema that models a GitHub user as per the GitHub API:

syntax = "proto3";

message GitHubUser {
  string login = 1;
  uint32 id = 2;
  string node_id = 3;
  string avatar_url = 4;
  string gravatar_id = 5;
  string url = 6;
  string html_url = 7;
  string followers_url = 8;
  string following_url = 9;
  string gists_url = 10;
  string starred_url = 11;
  string subscriptions_url = 12;
  string organizations_url = 13;
  string repos_url = 14;
  string events_url = 15;
  string received_events_url = 16;
  string type = 17;
  bool site_admin = 18;
}

And the following JSON document:

{
  "login": "octocat",
  "id": 1,
  "node_id": "MDQ6VXNlcjE=",
  "avatar_url": "https://github.com/images/error/octocat_happy.gif",
  "gravatar_id": "",
  "url": "https://api.github.com/users/octocat",
  "html_url": "https://github.com/octocat",
  "followers_url": "https://api.github.com/users/octocat/followers",
  "following_url": "https://api.github.com/users/octocat/following{/other_user}",
  "gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
  "starred_url": "https://api.github.com/users/octocat/starred{/owner}{/repo}",
  "subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
  "organizations_url": "https://api.github.com/users/octocat/orgs",
  "repos_url": "https://api.github.com/users/octocat/repos",
  "events_url": "https://api.github.com/users/octocat/events{/privacy}",
  "received_events_url": "https://api.github.com/users/octocat/received_events",
  "type": "User",
  "site_admin": false
}

The payload produced by the Python library is the following:

00000000: 0a07 6f63 746f 6361 7410 011a 0c4d 4451  ..octocat....MDQ
00000010: 3656 584e 6c63 6a45 3d22 3168 7474 7073  6VXNlcjE="1https
00000020: 3a2f 2f67 6974 6875 622e 636f 6d2f 696d  ://github.com/im
00000030: 6167 6573 2f65 7272 6f72 2f6f 6374 6f63  ages/error/octoc
00000040: 6174 5f68 6170 7079 2e67 6966 3224 6874  at_happy.gif2$ht
00000050: 7470 733a 2f2f 6170 692e 6769 7468 7562  tps://api.github
00000060: 2e63 6f6d 2f75 7365 7273 2f6f 6374 6f63  .com/users/octoc
00000070: 6174 3a1a 6874 7470 733a 2f2f 6769 7468  at:.https://gith
00000080: 7562 2e63 6f6d 2f6f 6374 6f63 6174 422e  ub.com/octocatB.
00000090: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
000000a0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
000000b0: 6f63 6174 2f66 6f6c 6c6f 7765 7273 4a3b  ocat/followersJ;
000000c0: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
000000d0: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
000000e0: 6f63 6174 2f66 6f6c 6c6f 7769 6e67 7b2f  ocat/following{/
000000f0: 6f74 6865 725f 7573 6572 7d52 3468 7474  other_user}R4htt
00000100: 7073 3a2f 2f61 7069 2e67 6974 6875 622e  ps://api.github.
00000110: 636f 6d2f 7573 6572 732f 6f63 746f 6361  com/users/octoca
00000120: 742f 6769 7374 737b 2f67 6973 745f 6964  t/gists{/gist_id
00000130: 7d5a 3b68 7474 7073 3a2f 2f61 7069 2e67  }Z;https://api.g
00000140: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
00000150: 6f63 746f 6361 742f 7374 6172 7265 647b  octocat/starred{
00000160: 2f6f 776e 6572 7d7b 2f72 6570 6f7d 6232  /owner}{/repo}b2
00000170: 6874 7470 733a 2f2f 6170 692e 6769 7468  https://api.gith
00000180: 7562 2e63 6f6d 2f75 7365 7273 2f6f 6374  ub.com/users/oct
00000190: 6f63 6174 2f73 7562 7363 7269 7074 696f  ocat/subscriptio
000001a0: 6e73 6a29 6874 7470 733a 2f2f 6170 692e  nsj)https://api.
000001b0: 6769 7468 7562 2e63 6f6d 2f75 7365 7273  github.com/users
000001c0: 2f6f 6374 6f63 6174 2f6f 7267 7372 2a68  /octocat/orgsr*h
000001d0: 7474 7073 3a2f 2f61 7069 2e67 6974 6875  ttps://api.githu
000001e0: 622e 636f 6d2f 7573 6572 732f 6f63 746f  b.com/users/octo
000001f0: 6361 742f 7265 706f 737a 3568 7474 7073  cat/reposz5https
00000200: 3a2f 2f61 7069 2e67 6974 6875 622e 636f  ://api.github.co
00000210: 6d2f 7573 6572 732f 6f63 746f 6361 742f  m/users/octocat/
00000220: 6576 656e 7473 7b2f 7072 6976 6163 797d  events{/privacy}
00000230: 8201 3468 7474 7073 3a2f 2f61 7069 2e67  ..4https://api.g
00000240: 6974 6875 622e 636f 6d2f 7573 6572 732f  ithub.com/users/
00000250: 6f63 746f 6361 742f 7265 6365 6976 6564  octocat/received
00000260: 5f65 7665 6e74 738a 0104 5573 6572       _events...User

Notice that the "type" string property in the schema has field number 17. 
So the key is "(17 << 3) | 2", which is 138 in decimal, or "8a" in hex. The 
value is 4 bytes long.
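
To make the arithmetic above concrete, here is a minimal Python sketch of how I am computing the key value. The helper name field_key and the constant LENGTH_DELIMITED are my own and purely illustrative; they are not part of the protobuf Python library.

```python
# Hypothetical helper for illustration only; not a protobuf library API.
def field_key(field_number: int, wire_type: int) -> int:
    """Return the key value (field_number << 3) | wire_type."""
    return (field_number << 3) | wire_type


LENGTH_DELIMITED = 2  # wire type used for string fields

# Field numbers taken from the GitHubUser schema above.
for name, number in [("received_events_url", 16), ("type", 17)]:
    key = field_key(number, LENGTH_DELIMITED)
    print(f"{name}: field {number} -> key {key} (hex {key:02x})")
# received_events_url: field 16 -> key 130 (hex 82)
# type: field 17 -> key 138 (hex 8a)
```

This only reproduces the key values themselves (0x82 and 0x8a), which match the first byte of each field in the dump.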

I would expect "type" to be encoded as: 

8a 04 5573 6572

However it is being encoded as:

8a 0104 5573 6572

Where is the "01" after the key coming from? What does it mean? I can't 
find any documentation about it. Is it just padding? If so, for what 
reason? I see the same "01" in field 16, but it's not present in any 
other field.
 

