Greetings,

As part of the ongoing catalog work, it has become clear that the repository and client metadata need to be reorganised (at a relatively high level; this doesn't conflict with Brock's fanout work).

In particular, we're open to namespace collisions when storing package manifests because the publisher is not part of the scheme used to store files (e.g. the pkg/foo/1.0 manifest from publisher A and the pkg/foo/1.0 manifest from publisher B would be stored at the same location).

The current structure also makes certain repository operations or cache management policies difficult or expensive to perform.

The proposal below is an attempt to resolve this while unifying the client and server storage schemes.

=============================
Current Server Storage Scheme
=============================
The server's repository root contains these directories:

  catalog (repository catalog data)
  file (package data)
  index (search indices)
  pkg (manifests)
  trans (in-flight transactions)
  updatelog (obsolete with catalog v1 work)

The server stores manifests this way:
<REPO_ROOT>/pkg/<stem>/<manifest-named-after-uri-encoded-version>
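To make the collision concrete, here is a minimal sketch of how a manifest path is derived under the current server scheme. The function name is hypothetical, and URI-encoding the stem as well as the version (with no "safe" characters) is an assumption made for illustration; this is not the actual depot code.

```python
import posixpath
from urllib.parse import quote


def current_manifest_path(repo_root, stem, version):
    """Build <REPO_ROOT>/pkg/<stem>/<uri-encoded-version> (hypothetical helper).

    Assumption: both path components are URI-encoded with no safe characters.
    Note that the publisher does not appear anywhere in the result.
    """
    return posixpath.join(
        repo_root, "pkg", quote(stem, safe=""), quote(version, safe="")
    )


# Two different publishers shipping foo@1.0 map to the same file.
path_a = current_manifest_path("/repo", "foo", "1.0")  # from publisher A
path_b = current_manifest_path("/repo", "foo", "1.0")  # from publisher B
collides = path_a == path_b
```

Since the publisher is never an input, whichever manifest is written last would win.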

=============================
Current Client Storage Scheme
=============================
The client's image root contains these directories:

  cert (client certificates; not publisher specific? or sometimes?)
  download (client file cache)
  gui_cache (packagemanager data cache)
  history (client operation logs)
  index (search indices)
  pkg (package manifests)
  publisher (publisher catalogs and other metadata)
  state (catalogs for known and installed packages)

The client stores manifests this way:
<IMG_ROOT>/pkg/<stem>/<directory-named-after-uri-encoded-version>/

...where the version directory contains all of the manifest.* files.

==============================
Proposed Server Storage Scheme
==============================
<REPO_ROOT>/
  __catalog/
  __index/
  <publisher>/
    file/
    pkg/
      <stem>/
        <manifest-named-after-uri-encoded-version>
    trans/
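As a rough sketch of the proposed server layout, the manifest path would gain a leading publisher component. The function name and the example publisher prefixes are hypothetical, and URI-encoding the stem is an assumption; only the directory order comes from the tree above.

```python
import posixpath
from urllib.parse import quote


def server_manifest_path(repo_root, publisher, stem, version):
    """Build <REPO_ROOT>/<publisher>/pkg/<stem>/<uri-encoded-version>.

    Hypothetical helper; assumes stem and version are URI-encoded with no
    safe characters and that the publisher prefix is used verbatim.
    """
    return posixpath.join(
        repo_root, publisher, "pkg", quote(stem, safe=""), quote(version, safe="")
    )


# The same stem and version from two publishers now land in separate trees.
path_a = server_manifest_path("/repo", "a.example.org", "foo", "1.0")
path_b = server_manifest_path("/repo", "b.example.org", "foo", "1.0")
```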

==============================
Proposed Client Storage Scheme
==============================
<IMG_ROOT>/
  __cert/
  __history/
  __index/
  __pm_cache/
  __state/
  <publisher>/
    file/ (formerly download)
    pkg/
      <stem>/
        <manifest-named-after-uri-encoded-version>
    pkg_cache/
      <stem>/
        <uri-encoded-version>.<cache_name> (manifest cache file)
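On the client side, the manifest and its cache files would then live in sibling trees under the publisher. A minimal sketch follows, with hypothetical function names and a placeholder cache name ("dircache" is only an illustration, not a statement about which cache files exist); URI-encoding the stem is again an assumption.

```python
import posixpath
from urllib.parse import quote


def client_manifest_path(img_root, publisher, stem, version):
    """Build <IMG_ROOT>/<publisher>/pkg/<stem>/<uri-encoded-version>."""
    return posixpath.join(
        img_root, publisher, "pkg", quote(stem, safe=""), quote(version, safe="")
    )


def client_cache_path(img_root, publisher, stem, version, cache_name):
    """Build <IMG_ROOT>/<publisher>/pkg_cache/<stem>/<uri-encoded-version>.<cache_name>."""
    fname = "%s.%s" % (quote(version, safe=""), cache_name)
    return posixpath.join(
        img_root, publisher, "pkg_cache", quote(stem, safe=""), fname
    )


version = "1.0,5.11:20090101T000000Z"  # made-up version string
manifest = client_manifest_path("/img", "example.org", "foo", version)
cache = client_cache_path("/img", "example.org", "foo", version, "dircache")
```

Keeping the cache files out of pkg/ means the pkg/ tree has the same shape on client and server, which is what the mirror use case in the notes relies on.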

=====
Notes
=====
All directories except those for specific publishers intentionally start with '__', as publisher prefixes are not allowed to have a '_' as their first character.
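The '__' convention means publisher directories can be told apart from the fixed ones by name alone. A minimal sketch, assuming the only rule is the one stated above (no leading '_'); real publisher prefix validation likely has further rules that are not modelled here, and both function names are hypothetical.

```python
def is_valid_publisher_prefix(prefix):
    """Return True if prefix could name a publisher directory.

    Only checks the rule from the proposal: the first character must not
    be '_'. Rejecting the empty string is an extra assumption.
    """
    return bool(prefix) and not prefix.startswith("_")


def list_publishers(root_entries):
    """Filter a root directory listing down to publisher directories."""
    return sorted(e for e in root_entries if is_valid_publisher_prefix(e))


entries = ["__catalog", "__index", "opensolaris.org", "contrib"]
publishers = list_publishers(entries)
```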

Unifying the storage scheme used by the client and server has several benefits:

* it becomes relatively trivial to use the client's image root as a repository mirror

* it prevents namespace collisions for package manifests

* it becomes easier (and faster) to combine repositories (e.g. Repo1 + Repo2), especially when the repositories involved are for different publishers

* it moves us closer to an on-disk format
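To illustrate the "Repo1 + Repo2" point: with per-publisher trees, combining repositories for different publishers could be little more than copying directories. This is only a sketch under that assumption. The function name is hypothetical, and it assumes that the shared '__' directories (catalog, index) would be rebuilt afterwards rather than copied, which the proposal does not specify. It needs Python 3.8+ for dirs_exist_ok.

```python
import os
import shutil
import tempfile


def merge_publisher_trees(sources, dest):
    """Copy every <publisher>/ directory from each source root into dest.

    Entries starting with '_' (e.g. __catalog, __index) are skipped on the
    assumption that they would be regenerated for the combined repository.
    """
    os.makedirs(dest, exist_ok=True)
    for src in sources:
        for entry in sorted(os.listdir(src)):
            src_path = os.path.join(src, entry)
            if entry.startswith("_") or not os.path.isdir(src_path):
                continue
            shutil.copytree(src_path, os.path.join(dest, entry), dirs_exist_ok=True)


def _touch(path):
    """Create an empty file, making parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


# Small demonstration with two throwaway repositories.
with tempfile.TemporaryDirectory() as tmp:
    repo1 = os.path.join(tmp, "repo1")
    repo2 = os.path.join(tmp, "repo2")
    dest = os.path.join(tmp, "merged")
    _touch(os.path.join(repo1, "a.example.org", "pkg", "foo", "1.0"))
    _touch(os.path.join(repo2, "b.example.org", "pkg", "foo", "1.0"))
    os.makedirs(os.path.join(repo1, "__catalog"))
    merge_publisher_trees([repo1, repo2], dest)
    merged_entries = sorted(os.listdir(dest))
    both_manifests_present = all(
        os.path.isfile(os.path.join(dest, pub, "pkg", "foo", "1.0"))
        for pub in ("a.example.org", "b.example.org")
    )
```

When both sources carry the same publisher, the copy above would silently overlay files, so a real tool would need a policy for that case.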

=================
Unresolved issues
=================
* upgrade process specifics for pkg(1)

* upgrade process specifics for pkg.depotd

Cheers,
--
Shawn Walker
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
