[fossil-dev] Proposed roadmap for Fossil 2.0

Richard Hipp Sun, 26 Feb 2017 12:43:39 -0800

This message is cross-posted to fossil-users and fossil-dev.
Follow-ups should go to fossil-dev only, please.  Thanks.


I propose that the next release of Fossil be called "Fossil 2.0", that
it occur before Easter (2017-04-16), and that it have the following
features:

(1) Fossil 2.0 is backwards compatible with Fossil 1.x.  Fossil 2.0
can push and pull from a Fossil 1.x server.  Fossil 2.0 can read and
write Fossil 1.x repositories, though only after having run "fossil
rebuild".  The upgrade path is to first overwrite the older fossil 1.x
executable with a new fossil 2.0 executable, then run "fossil all
rebuild".

(2) Artifacts can be identified via multiple hash algorithms.  The
initial implementation will support SHA1 and SHA3-228.  (For brevity,
SHA3-228 will hereafter be referred to as K228.)

(3) The low-level file formats
(https://www.fossil-scm.org/fossil/doc/trunk/www/fileformat.wiki) are
unchanged except that the artifact hashes are allowed to be longer
than 40 hex digits for alternative hash algorithms.  For K228, the
hashes are 56 hex digits long.  Other hash algorithms may be supported
in future releases as long as each hash algorithm has a unique hash
length, thus enabling Fossil to figure out which algorithm is being
used simply by looking at the length of the hash.

(4) All artifact hashes within a single well-formed structure artifact
must use the same algorithm.  This restriction does not apply to the
MD5 hash used by the R-card and the Z-card.

(5) Every repository will have a preferred hash algorithm.  The
preferred hash algorithm can be changed by running "fossil rebuild"
with appropriate options. The artifact hashes displayed in the web
interface and on command-line output will be computed using the
preferred hash algorithm.  This means that the displayed hash names
for legacy check-ins will change when the hash algorithm is changed.
However, references to the old hash values will still be correctly
resolved.

For example, the current tip of trunk in the Fossil self-hosting
repository is named using a SHA1 hash as:
ccdafa2a93e7bcefa1b4d0ea7474f9ce84c690f2.  If the hash algorithm is
changed to K228, then this check-in will afterwards be displayed as
3c658054301feb7e1cd25b66e32c94ffbf48d0b2f67310d33fb79a50.  However,
you will still be able to access the check-in using the
"https://www.fossil-scm.org/fossil/info/ccdafa2a93e7bcef"; URL and you
will still be able to update to that check-in by typing "fossil update
ccdafa2a".  In this way, a repository can transition from one hash
algorithm to another without breaking any legacy hyperlinks.

(6) Repositories can be configured to reject check-ins and other
structure artifacts that occur after a selected cut-off date and which
use the SHA1 hash algorithm.

(7) To implement the above, the BLOB.UUID field will be removed from
the repository database.  In its place, a new table will be added,
tentatively declared as follows:

     CREATE TABLE hname(
        hash TEXT,
        alg ANY,
        rid INTEGER REFERENCES blob(rid),
        aux ANY,
        PRIMARY KEY(hash,alg)
     ) WITHOUT ROWID;
     CREATE INDEX hname_rid ON hname(rid);

In Fossil 1.x, there was a 1-to-1 correspondence between hash values
and artifacts.  Since it supports multiple hash algorithms, Fossil 2.0
now has a many-to-one relationship between hash values and artifacts,
and so the hash values need to be stored in a separate table.  The
"alg" field will be a numeric 0 for the preferred hash, and some other
code (yet to be decided) for alternative hashes.  Note that this new
table can also store git-style artifact hashes which would facilitate
creating a Fossil-to-Git bridge that enables a Fossil server to
directly respond to push/pull requests from Git clients using the Git
wire protocol.  The "aux" field is included in anticipation of this
Fossil-to-Git bridge.  For now, the "aux" field will always be NULL.
This Fossil-to-Git bridge will not be available in the first release
but might be a feature added in subsequent releases.

I believe that most of the work in creating Fossil 2.0 will involve
going through the source code, locating queries that use BLOB.UUID,
and revising those queries to use the HNAME table instead.

Unknowns:

(8) Is it possible for two Fossil servers to sync if they are using
different preferred hash algorithms?   This is a desired goal, but I
do not yet understand how hard that will be.

(9) Can a Fossil 1.x client push/pull/clone from a Fossil 2.0 server,
assuming the repository uses SHA1 has it preferred hash algorithm?
This is desirable, but I am willing to sacrifice this capability in
order to reduce complexity.

(10) Should Keccak hashes that are not part of the SHA3 standard
(example: Keccak[196]) be supported?  K196 is desirable in that its
hash length is 48 bytes, only 8 bytes longer than SHA1.

Feedback is welcomed and encouraged, though let's keep the discussion
on fossil-dev and off of fossil-users if possible.  Thanks.
-- 
D. Richard Hipp
[email protected]
_______________________________________________
fossil-dev mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/fossil-dev

[fossil-dev] Proposed roadmap for Fossil 2.0

Reply via email to