Re: [fossil-dev] [fossil-users] Proposed roadmap for Fossil 2.0

Roy Keene Tue, 28 Feb 2017 10:20:01 -0800

Proposed change:

        1. A new control artifact ("Upgrade Security Hash Identifier");
                a. U cards which identify an existing SHA1 (or other hash)
                   and the same hash in a different hashing algorithm, e.g.:
                        U SHA1 3301ac54c1e6072db792352f5a77b6defbba6d7f SHA256 4b4781b51dd011ff6202b527a97ddb3c1b7b3cc8122ef42c2b8b1fbe3dd4a17e
                b. New hashing algorithm has to be stronger than old
                   hashing algorithm (or artifact is invalid)
                c. An ordered set of hashing algorithms must be present in
                   Fossil
                d. Once the control artifact is present, it represents a
                   "flag day" for that repository and all new artifacts
                   must be in the new hashing algorithm (until a new
                   control artifact is presented to upgrade it again).
                   More "USHI" control artifacts of the same upgrade type
                   will allow existing commits from repositories to be
                   present, as long as the control artifact is submitted
                   first.
                e. If a collision is submitted (e.g., same SHA1, different
                   SHA256) the artifact (by SHA1) is considered
                   compromised and shunned from the repository (or
                   something)
                f. The control artifact is identified by the new hashing
                   algorithm in the format below (#2)


        2. A new format for hashes: rather than relying strictly on size
           to identify the algorithm, include a hashing algorithm
           identifier as 2 additional bytes at the end, ':' <hashID>,
           where the ID is the hashing mechanism identifier from the
           ordered set above (#1.c.).  The 2 bytes (16 bits) would be
           encoded in hex along with the hash itself.
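
A rough sketch of how the two points above might fit together, in
Python.  Everything named here is an assumption for illustration: the
ALGORITHMS list and its numeric IDs, the helper names tag_hash and
parse_u_card, and the reading of "2 bytes encoded in hex" as four hex
digits after a ':'.  None of this exists in Fossil.

```python
# Hypothetical ordered set of hash algorithms (#1.c), weakest first.
# The IDs (list positions) are illustrative assumptions.
ALGORITHMS = [
    ("SHA1", 40),    # name, digest length in hex digits
    ("SHA256", 64),
]
ALG_ID = {name: i for i, (name, _) in enumerate(ALGORITHMS)}
HEX_LEN = dict(ALGORITHMS)
HEX_DIGITS = set("0123456789abcdef")


def tag_hash(alg, hexdigest):
    """Append ':' and the algorithm ID as 16 bits written in hex (#2)."""
    return "%s:%04x" % (hexdigest, ALG_ID[alg])


def parse_u_card(line):
    """Parse 'U <old-alg> <old-hash> <new-alg> <new-hash>' and validate it."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "U":
        raise ValueError("not a U card")
    _, old_alg, old_hash, new_alg, new_hash = fields
    for alg, h in ((old_alg, old_hash), (new_alg, new_hash)):
        if alg not in ALG_ID:
            raise ValueError("unknown algorithm: " + alg)
        if len(h) != HEX_LEN[alg] or not set(h) <= HEX_DIGITS:
            raise ValueError("malformed %s hash" % alg)
    # Rule 1.b: the new algorithm must rank above the old one.
    if ALG_ID[new_alg] <= ALG_ID[old_alg]:
        raise ValueError("new algorithm is not stronger than old")
    return (old_alg, old_hash), (new_alg, new_hash)


card = ("U SHA1 3301ac54c1e6072db792352f5a77b6defbba6d7f "
        "SHA256 4b4781b51dd011ff6202b527a97ddb3c1b7b3cc8122ef42c2b8b1fbe3dd4a17e")
old, new = parse_u_card(card)
print(tag_hash(*new))
```

The collision handling in 1.e is left out; it would need a lookup of
previously seen U cards keyed by the old hash.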



On Sun, 26 Feb 2017, Richard Hipp wrote:

This message is cross-posted to fossil-users and fossil-dev.
Follow-ups should go to fossil-dev only, please.  Thanks.

I propose that the next release of Fossil be called "Fossil 2.0", that
it occur before Easter (2017-04-16), and that it have the following
features:

(1) Fossil 2.0 is backwards compatible with Fossil 1.x.  Fossil 2.0
can push and pull from a Fossil 1.x server.  Fossil 2.0 can read and
write Fossil 1.x repositories, though only after having run "fossil
rebuild".  The upgrade path is to first overwrite the older fossil 1.x
executable with a new fossil 2.0 executable, then run "fossil all
rebuild".

(2) Artifacts can be identified via multiple hash algorithms.  The
initial implementation will support SHA1 and SHA3-228.  (For brevity,
SHA3-228 will hereafter be referred to as K228.)

(3) The low-level file formats
(https://www.fossil-scm.org/fossil/doc/trunk/www/fileformat.wiki) are
unchanged except that the artifact hashes are allowed to be longer
than 40 hex digits for alternative hash algorithms.  For K228, the
hashes are 56 hex digits long.  Other hash algorithms may be supported
in future releases as long as each hash algorithm has a unique hash
length, thus enabling Fossil to figure out which algorithm is being
used simply by looking at the length of the hash.
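
As a minimal sketch of length-based detection, assuming Python's
hashlib: a 56-hex-digit hash is 224 bits, which hashlib provides as
"sha3_224", and a 40-digit hash is SHA1.  The table and the function
names algorithm_for and verify are illustrative, not Fossil code.

```python
import hashlib

# Hash length in hex digits -> hashlib algorithm name.
# Each algorithm must have a unique length for this lookup to work.
ALG_BY_HEX_LEN = {40: "sha1", 56: "sha3_224"}


def algorithm_for(hash_hex):
    """Pick the algorithm purely from the length of the hash."""
    try:
        return ALG_BY_HEX_LEN[len(hash_hex)]
    except KeyError:
        raise ValueError("no known algorithm with %d hex digits" % len(hash_hex))


def verify(content, hash_hex):
    """Recompute the hash of content with the detected algorithm."""
    alg = algorithm_for(hash_hex)
    return hashlib.new(alg, content).hexdigest() == hash_hex.lower()
```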

(4) All artifact hashes within a single well-formed structure artifact
must use the same algorithm.  This restriction does not apply to the
MD5 hash used by the R-card and the Z-card.
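
One way this check could look, sketched in Python.  It assumes the
hashes have already been extracted from the artifact as (card letter,
hash) pairs; that input shape and the function name are hypothetical.
Because each algorithm has a unique hash length, comparing lengths is
enough.

```python
# Cards whose argument is an MD5 checksum, exempt from the rule.
MD5_CARDS = {"R", "Z"}


def uses_single_algorithm(card_hashes):
    """True if every artifact hash (ignoring R and Z cards) has one length."""
    lengths = {len(h) for card, h in card_hashes if card not in MD5_CARDS}
    return len(lengths) <= 1
```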

(5) Every repository will have a preferred hash algorithm.  The
preferred hash algorithm can be changed by running "fossil rebuild"
with appropriate options. The artifact hashes displayed in the web
interface and on command-line output will be computed using the
preferred hash algorithm.  This means that the displayed hash names
for legacy check-ins will change when the hash algorithm is changed.
However, references to the old hash values will still be correctly
resolved.

For example, the current tip of trunk in the Fossil self-hosting
repository is named using a SHA1 hash as:
ccdafa2a93e7bcefa1b4d0ea7474f9ce84c690f2.  If the hash algorithm is
changed to K228, then this check-in will afterwards be displayed as
3c658054301feb7e1cd25b66e32c94ffbf48d0b2f67310d33fb79a50.  However,
you will still be able to access the check-in using the
"https://www.fossil-scm.org/fossil/info/ccdafa2a93e7bcef" URL and you
will still be able to update to that check-in by typing "fossil update
ccdafa2a".  In this way, a repository can transition from one hash
algorithm to another without breaking any legacy hyperlinks.

(6) Repositories can be configured to reject check-ins and other
structure artifacts that occur after a selected cut-off date and which
use the SHA1 hash algorithm.
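
A sketch of such a cut-off rule in Python, under the assumption that
the receiving side knows each artifact's hash and timestamp.  The
function name and parameters are made up for illustration.

```python
from datetime import datetime, timezone


def accept_artifact(hash_hex, artifact_time, sha1_cutoff):
    """Reject SHA1-named structure artifacts dated after the cut-off."""
    is_sha1 = len(hash_hex) == 40  # SHA1 hashes are 40 hex digits
    return not (is_sha1 and artifact_time > sha1_cutoff)


# Example cut-off; the date itself is arbitrary.
cutoff = datetime(2017, 4, 16, tzinfo=timezone.utc)
```

Older SHA1 check-ins stay acceptable, so existing history still syncs.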

(7) To implement the above, the BLOB.UUID field will be removed from
the repository database.  In its place, a new table will be added,
tentatively declared as follows:

    CREATE TABLE hname(
       hash TEXT,
       alg ANY,
       rid INTEGER REFERENCES blob(rid),
       aux ANY,
       PRIMARY KEY(hash,alg)
    ) WITHOUT ROWID;
    CREATE INDEX hname_rid ON hname(rid);

In Fossil 1.x, there was a 1-to-1 correspondence between hash values
and artifacts.  Since it supports multiple hash algorithms, Fossil 2.0
now has a many-to-one relationship between hash values and artifacts,
and so the hash values need to be stored in a separate table.  The
"alg" field will be a numeric 0 for the preferred hash, and some other
code (yet to be decided) for alternative hashes.  Note that this new
table can also store git-style artifact hashes which would facilitate
creating a Fossil-to-Git bridge that enables a Fossil server to
directly respond to push/pull requests from Git clients using the Git
wire protocol.  The "aux" field is included in anticipation of this
Fossil-to-Git bridge.  For now, the "aux" field will always be NULL.
This Fossil-to-Git bridge will not be available in the first release
but might be a feature added in subsequent releases.
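
To illustrate the many-to-one lookup, here is a small sketch using
Python's sqlite3 module and the tentative table above.  The blob table
is a simplified stand-in, the alg code 1 for SHA1 is an assumption
(the real codes are yet to be decided), and resolve and display_hash
are hypothetical helpers.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE blob(rid INTEGER PRIMARY KEY, content BLOB);  -- stand-in
    CREATE TABLE hname(
       hash TEXT,
       alg ANY,
       rid INTEGER REFERENCES blob(rid),
       aux ANY,
       PRIMARY KEY(hash,alg)
    ) WITHOUT ROWID;
    CREATE INDEX hname_rid ON hname(rid);
""")

# One artifact, two names: alg 0 is the preferred hash, 1 is assumed SHA1.
db.execute("INSERT INTO blob(rid, content) VALUES (1, x'00')")
db.executemany(
    "INSERT INTO hname(hash, alg, rid, aux) VALUES (?, ?, 1, NULL)",
    [("3c658054301feb7e1cd25b66e32c94ffbf48d0b2f67310d33fb79a50", 0),
     ("ccdafa2a93e7bcefa1b4d0ea7474f9ce84c690f2", 1)],
)


def resolve(prefix):
    """Map a hash prefix in any algorithm to a rid, or None if not unique."""
    if not prefix or any(c not in "0123456789abcdef" for c in prefix):
        raise ValueError("prefix must be lowercase hex")
    rows = db.execute(
        "SELECT DISTINCT rid FROM hname WHERE hash GLOB ?", (prefix + "*",)
    ).fetchall()
    return rows[0][0] if len(rows) == 1 else None


def display_hash(rid):
    """Return the preferred-algorithm name of an artifact."""
    row = db.execute(
        "SELECT hash FROM hname WHERE rid = ? AND alg = 0", (rid,)
    ).fetchone()
    return row[0] if row else None
```

With this data, resolve("ccdafa2a") and resolve("3c658054") both find
rid 1, while display_hash(1) returns only the preferred (alg 0) name.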

I believe that most of the work in creating Fossil 2.0 will involve
going through the source code, locating queries that use BLOB.UUID,
and revising those queries to use the HNAME table instead.

Unknowns:

(8) Is it possible for two Fossil servers to sync if they are using
different preferred hash algorithms?   This is a desired goal, but I
do not yet understand how hard that will be.

(9) Can a Fossil 1.x client push/pull/clone from a Fossil 2.0 server,
assuming the repository uses SHA1 as its preferred hash algorithm?
This is desirable, but I am willing to sacrifice this capability in
order to reduce complexity.

(10) Should Keccak hashes that are not part of the SHA3 standard
(example: Keccak[196]) be supported?  K196 is desirable in that its
hash length is 48 bytes, only 8 bytes longer than SHA1.

Feedback is welcomed and encouraged, though let's keep the discussion
on fossil-dev and off of fossil-users if possible.  Thanks.
--
D. Richard Hipp
d...@sqlite.org
_______________________________________________
fossil-users mailing list
fossil-us...@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users

_______________________________________________
fossil-dev mailing list
fossil-dev@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/fossil-dev
