frantisek holop
Fri, 03 Feb 2012 17:26:14 -0800
hi there, reading through the documentation i thought some sentences would be easier to read with some minor changes, and i also removed end of line whitespace.
the other item from the subject concerns the future change of '_FOSSIL_' to '.fos'. i am totally new on the list, so i am not familiar with the debate (if any happened) regarding this change, but the linguistic issue is that 'fos' in hungarian means a certain type of excrement.

i don't know if this is open to discussion, but if it were, i'd say '.fossil' would not be a good choice, as it looks like a repository without a name. '.fockout' would have its own problems in english ;} so seeing that it's also part of fossil, also an sqlite db file, why not have something less mystical, a bit more verbose (being hidden anyway), like '.checkout.fossil' or some such? in the worst case please keep _FOSSIL_ around for us hungarians :]

-f

Index: www/tech_overview.wiki
==================================================================
--- www/tech_overview.wiki
+++ www/tech_overview.wiki
@@ -6,11 +6,11 @@ <h2>1.0 Introduction</h2> At its lowest level, a Fossil repository consists of an unordered set of immutable "artifacts". You might think of these artifacts as "files", since in many cases the artifacts exactly correspond to source code files -that are stored in the Fossil repository. But other "control artifacts" +that are stored in the Fossil repository. But other "control artifacts" are also included in the mix. These control artifacts define the relationships between artifacts - which files go together to form a particular version of the project, who checked in that version and when, what was the check-in comment, what wiki pages are included with the project, what are the edit histories of each wiki page, what bug reports or tickets are @@ -17,29 +17,27 @@ included, who contributed to the evolution of each ticket, and so forth, and so on. This low-level file format is called the "global state" of the repository, since this is the information that is synced to peer repositories using push and pull operations.
The low-level file format is also called "enduring" since it is intended to last for many years. -The details of the low-level, enduring, global file format +The details of the low-level, enduring, global file format are [./fileformat.wiki | described separately]. This article is about how Fossil is currently implemented. Instead of dealing with vague abstractions of "enduring file formats" as the -[./fileformat.wiki | that other document] does, this article provides -some detail on how Fossil actually stores information on disk. +[./fileformat.wiki | other document] does, this article provides +some detail on how Fossil actually stores information on disk. <h2>2.0 Three Databases</h2> -Fossil stores state information in +Fossil stores state information in [http://www.sqlite.org/ | SQLite] database files. SQLite keeps an entire relational database, including multiple tables and indices, in a single disk file. The SQLite library allows the database files to be efficiently queried and updated using the industry-standard -SQL language. And SQLite makes updates to these database files atomic, -even if a system crashes or power failure occurs in the middle of the -update, meaning that repository content is protected even during severe -malfunctions. +SQL language. SQLite updates are atomic, so even in the event of a system +crash or power failure the repository content is protected. Fossil uses three separate classes of SQLite databases: <ol> <li>The configuration database @@ -48,11 +46,11 @@ </ol> The configuration database is a one-per-user database that holds global configuration information used by Fossil. There is one repository database per project. The repository database is the -file that people are normally referring to when they say +file that people are normally referring to when they say "a Fossil repository". The checkout database is found in the working checkout for a project and contains state information that is unique to that working checkout. 
Fossil does not always use all three database files. The web interface, @@ -134,11 +132,11 @@ instead of a dot) and is located in the directory specified by the LOCALAPPDATA, APPDATA, or HOMEPATH environment variables, in that order. <h3>2.2 Repository Databases</h3> -The repository database is the file that is commonly referred to as +The repository database is the file that is commonly referred to as "the repository". This is because the repository database contains, among other things, the complete revision, ticket, and wiki history for a project. It is customary to name the repository database after then name of the project, with a ".fossil" suffix. For example, the repository database for the self-hosting Fossil repository is called "fossil.fossil" @@ -145,11 +143,11 @@ and the repository database for SQLite is called "sqlite.fossil". <h4>2.2.1 Global Project State</h4> The bulk of the repository database (typically 75 to 85%) consists -of the artifacts that comprise the +of the artifacts that comprise the [./fileformat.wiki | enduring, global, shared state] of the project. The artifacts are stored as BLOBs, compressed using [http://www.zlib.net/ | zlib compression] and, where applicable, using [./delta_encoder_algorithm.wiki | delta compression]. The combination of zlib and delta compression results in a considerable @@ -158,26 +156,26 @@ combined zlib and delta compression, that content only takes up 51.4 MB of space in the repository database, for a compression ratio of about 33:1. Note that the zlib and delta compression is not an inherent part of the -Fossil file format; it is just an optimization. +Fossil file format; it is just an optimization. The enduring file format for Fossil is the unordered set of artifacts. The compression techniques are just a detail of how the current implementation of Fossil happens to store these artifacts efficiently on disk. 
All of the original uncompressed and undeltaed artifacts can be extracted -from a Fossil repository database using +from a Fossil repository database using the [/help/deconstruct | fossil deconstruct] command. Individual artifacts can be extracted using the [/help/artifact | fossil artifact] command. When accessing the repository database using raw SQL and the [/help/sqlite3 | fossil sql] command, the extension function "<tt>content()</tt>" with a single argument which is the SHA1 hash of an artifact will return the complete undeleted and uncompressed -content of that artifact. +content of that artifact. Going the other way, the [/help/reconstruct | fossil reconstruct] command will scan a directory hierarchy and add all files found to a new repository database. The [/help/import | fossil import] command works by reading the input git-fast-export stream and using it to construct @@ -185,11 +183,11 @@ <h4>2.2.2 Project Metadata</h4> The global project state information in the repository database is supplemented by computed metadata that makes querying the project state -more efficient. Metadata includes but information such as the following: +more efficient. It includes information such as: * The names for all files found in any checkin. * All check-ins that modify a given file * Parents and children of each checkin. * Potential timeline rows. @@ -199,15 +197,15 @@ * Attachments and the wiki pages or tickets they apply to. * Current content of each ticket. * Cross-references between tickets, checkins, and wiki pages. The metadata is held in various SQL tables in the repository database. -The metadata is designed to facilitate queries for the various timelines and -reports that Fossil generates. +It is designed to facilitate queries for the various timelines and +reports that Fossil generates. As the functionality of Fossil evolves, the schema for the metadata can and does change from time to time. -But schema changes do no invalidate the repository. 
Remember that the +But schema changes do not invalidate the repository. Remember that the metadata contains no new information - only information that has been extracted from the canonical artifacts and saved in a more useful form. Hence, when the metadata schema changes, the prior metadata can be discarded and the entire metadata corpus can be recomputed from the canonical artifacts. That is what the @@ -223,11 +221,11 @@ by [/help/sync | fossil sync]. That is because it is entirely reasonable that two different websites for the same project might have completely different display preferences and user communities. One instance of the project might be a fork of the other, for example, which pulls from the other but never pushes and extends the project in ways that the keepers of -the other website disapprove of. +the other website disapprove of. Display and processing information includes the following: * The name and description of the project * The CSS file, header, and footer used by all web pages @@ -240,31 +238,31 @@ global values defined in the per-user configuration database. Though the display and processing preferences do not move between repository instances using [/help/sync | fossil sync], this information can be shared between repositories using the -[/help/config | fossil config push] and +[/help/config | fossil config push] and [/help/config | fossil config pull] commands. The display and processing information is also copied into new repositories when they are created using [/help/clone | fossil clone]. <h4>2.2.4 User Credentials And Privileges</h4> Just because two development teams are collaborating on a project and allow -push and/or pull between their repositories does not mean that they +push and/or pull between their repositories does not mean that they trust each other enough to share passwords and access privileges. 
Hence the names and emails and passwords and privileges of users are considered private information that is kept locally in each repository. Each repository database has a table holding the username, privileges, and login credentials for users authorized to interact with that particular -database. In addition, there is a table named "concealed" that maps the +database. In addition, there is a table named "concealed" that maps the SHA1 hash of each users email address back into their true email address. The concealed table allows just the SHA1 hash of email addresses to be stored in tickets, and thus prevents actual email addresses from falling -into the hands of spammers who happen to clone the repository. +into the hands of spammers who happen to clone the repository. The content of the user and concealed tables can be pushed and pulled using the [/help/config | fossil config push] and [/help/config | fossil config pull] commands with the "user" and "email" as the AREA argument, but only if you have administrative @@ -276,11 +274,11 @@ project - is intended to be an append-only database. In other words, new artifacts can be added but artifacts can never be removed. But it sometimes happens that inappropriate content can be mistakenly or maliciously added to a repository. When that happens, the only way to get rid of the content is to [./shunning.wiki | "shun"] it. -The "shun" table in the repository database records the SHA1 hash of +The "shun" table in the repository database records the SHA1 hash of all shunned artifacts. The shun table can be pushed or pulled using the [/help/config | fossil config] command with the "shun" AREA argument. The shun table is also copied during a [/help/clone | clone].

--
you will become rich and famous unless you don't.
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users