Hi,
Thanks for writing this draft spec.
Please see my suggested changes below:
> On 17 Apr 2018, at 21:23, juga wrote:
>
> Hi,
>
> as commented with teor and pastly, i send in-line a draft specification
> for the document format that the bandwidth scanner implementations
> should produce.
>
> I've left my own questions/notes in square brackets.
>
> Thanks,
> juga.
>
> ===
>
> Tor Bandwidth Measurements Document Format
> [juga: which name should we give to this document?]
That's a fine name.
You can leave out the "Document" if you want.
> 1. Scope and preliminaries
>
> This document describes the format of Tor's bandwidth measurements
> document, version X.X.X [juga: which version should be this?]
It doesn't matter, so let's use semantic versioning:
* the original torflow format was 1.0.0
* the format in this spec adds the "version" feature, so it is 1.1.0
(it is compatible with 1.0.0, as long as parsers ignore unrecognised
lines)
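To illustrate the compatibility rule, here is a rough Python sketch.
The function names are hypothetical, and the "same major version means
compatible" rule is my reading of semantic versioning, not something
the draft states:

```python
def parse_version(text):
    """Split a "MAJOR.MINOR.PATCH" string into a tuple of three ints."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("not a MAJOR.MINOR.PATCH version: %r" % text)
    return tuple(int(p) for p in parts)


def can_parse(parser_version, file_version):
    """Return True if a parser for parser_version can read file_version.

    Assumption: formats that share a major version are compatible,
    as long as the parser ignores lines it does not recognise.
    """
    return parse_version(parser_version)[0] == parse_version(file_version)[0]
```

With this rule, a 1.0.0 parser accepts a 1.1.0 file, but not a 2.0.0
file.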
> and later.
>
> Since Tor version X.X.X [juga: which tor version?]
It looks like 0.2.4.12-alpha added measured bandwidths
https://gitweb.torproject.org/tor.git/tree/ChangeLog#n12710
> the directory
> authorities use the bandwidth measurements document called
> "V3BandwidthsFile" and produced by Torflow [1]
> (format described in README.spec.txt [2]).
>
>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
>NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
>"OPTIONAL" in this document are to be interpreted as described in
>RFC 2119.
>
> 1.2. Acknowledgements
>
> The original bandwidth measurement scanner (Torflow) and format was
> created by mike. Teor suggested to write this specification while
> contributing on pastly's new bandwidth scanner implementation.
>
> This specification was revised after feedback from:
>
>XXX
>
> 1.3 Outline
>
> The bandwidth measurements mentioned in sections 3.4.1 and 3.4.2
> of dir-spec.txt [3] are obtained by bandwidth authorities, which are
> either directory authorities or other servers running bandwidth
> measurement scanners and sending the results to the former.
> [juga: it seems that bandwidth authorities have not been formally
> defined before]
You could use the definition in the man page:
"the bandwidth-authority generated file storing information on
relays' measured bandwidth capacities"
> 2. Format details
>
> Bandwidth measurements MUST contain the following sections:
> - Header (exactly once)
> - Relays measurements (zero or more times)
>
> Each section (or entry) ends with a separator.
The "Each section (or entry) ends with a separator" line is a
copy-paste error; it should be deleted.
> 2.1. Nonterminals
>
> The following nonterminals are defined in the Onionoo details
> document specification [4]:
>
>fingerprint
>nickname
This file format gets the fingerprint and nickname from the
consensus, so you should reference dir-spec.txt.
(dir-list-spec.txt gets relay fingerprints and nicknames from
Onionoo. That's why it uses the Onionoo definitions.)
Here are the definitions of hexdigest (fingerprint) and nickname:
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1268
> In the bandwidth measurement documents nickname is optional.
Whether a field is optional is not relevant in a definition.
Let's delete this line: it's already documented as optional later on.
> The following nonterminals are defined in dir-spec.txt:
>
>NL (newline)
>SP (space)
>
>"bw" = INT, the aggregated measured bandwidth of this relay, in
>kilobytes per second.
bw is not defined in dir-spec.txt, and the formatting is confusing.
Double quotes are used for ASCII literal strings in dir-spec.txt.
Can you please follow the format used in dir-spec.txt?
Here is one example:
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n210
Here's how you can define bw using the Int definition from
dir-spec.txt:
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n795
bw = Int
bw is the aggregated measured bandwidth of this relay, in kilobytes
per second.
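As a sanity check on this definition, a parser could accept bw values
roughly like this. This is only a sketch: parse_bw is a hypothetical
name, and I'm assuming Int means a non-negative decimal integer, so
please check dir-spec.txt for the exact wording:

```python
def parse_bw(value):
    """Parse a bw value (kilobytes per second) from its text form.

    Assumption: Int is a non-negative decimal integer made of ASCII
    digits only, with no sign, spaces, or decimal point.
    """
    if not (value.isascii() and value.isdigit()):
        raise ValueError("bw is not a decimal integer: %r" % value)
    return int(value)
```

So parse_bw("1024") returns 1024, and values like "-5" or "1.5" are
rejected.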
> We introduce the following nonterminals:
> [juga: this should probably be defined more formally and should
> probably link to other documents, which ones?]
dir-spec.txt
>"version" = The name and the version of the bandwidth scanner
>software, such as "sbws 0.1.0".
Our newest spec uses "version" for the file format version:
https://gitweb.torproject.org/torspec.git/tree/dir-list-spec.txt#n148
So please don't create a field with a different meaning and structure
and call it "version".
I suggest:
* use "version" for the file format version (or don't use "version")
* use "source" for the implementation software name and version
Please fix the formatting of this definition to be like dir-spec.txt.
This definition has two arguments separated by spaces: the name and
the version.
>The name of the software, if absent, is assumed to be "torflow".
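To show what I mean by two arguments, here is a rough Python sketch of
parsing a "source" value. The name parse_source is hypothetical, and
returning None for the version when the value is absent is my
assumption (the draft only says the name defaults to "torflow"):

```python
DEFAULT_SOURCE_NAME = "torflow"


def parse_source(value=None):
    """Return (name, version) for a source value such as "sbws 0.1.0".

    Assumption: an absent or empty value means the file came from
    torflow, and the version is then unknown (None).
    """
    if value is None or not value.strip():
        return (DEFAULT_SOURCE_NAME, None)
    parts = value.split()
    if len(parts) != 2:
        raise ValueError("expected 'name version', got %r" % value)
    return (parts[0], parts[1])
```

For example, parse_source("sbws 0.1.0") gives ("sbws", "0.1.0"), and
parse_source() gives ("torflow", None).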