On Jun 25, 2008, at 10:45 AM, Denis Washington wrote:

On Wed, 2008-06-25 at 10:12 -0400, Jeff Johnson wrote:
On Jun 24, 2008, at 1:38 PM, Denis Washington wrote:



Sound like a plan? My primary goals here are two-fold:

1) avoiding disasters if bogus headers start to be added to an rpmdb.

2) exposing rpmdbAdd() (and rpmdbRemove()) methods for use by
LSB/ISV/whatever applications that wish to register/unregister
software on RPM managed systems.

Sounds like a good plan, yeah. I'm glad to be able to work with you on
this, as you certainly have a LOT more experience with it than I do.
Thank you very much!


No problem.

Enumerating the necessary data elements that need to be present
in a RPM header, and choosing _SOME_ representational markup,
would seem to be on the critical path.

(aside) For dpkg it's really the same fundamental problem, but with a
different target metadata representation. Ditto _your_package_manager_here
for all instances of class.

There are several existing representations of "package" manifests,
both explicit and/or implicit, that can be used to enumerate the
necessary data elements to be included in the target metadata
representation (note I did not say "rpmdb").

Simplest by far is find(1) output of a tree. i.e. an explicit list
of paths to files, with stat(2) and digest (and acls/xattrs and selinux
file contexts and whatever else is needed) implicitly derived from
the tree.

With "implicitly derived", do you mean "read from the installed files
instead of being explicitly in the manifest file"?


Yes. Basically I mean populating target metadata with stat(2)
info, not with explicitly parsed values.
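
A minimal sketch of what "implicitly derived" could look like, assuming
a find(1)-style list with one path per line. The function names, the
dictionary keys and the choice of SHA-256 are illustrative assumptions,
not anything rpmlib defines; acls/xattrs and selinux contexts are left out.

```python
import hashlib
import os
import stat


def describe_path(path):
    """Derive file metadata from the installed file itself via lstat(2)."""
    st = os.lstat(path)
    entry = {
        "path": path,
        "mode": stat.filemode(st.st_mode),  # e.g. "-rwxr-xr-x"
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "mtime": int(st.st_mtime),
        "digest": None,  # only filled in for regular files
    }
    if stat.S_ISREG(st.st_mode):
        h = hashlib.sha256()  # assumption: any agreed-upon digest would do
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        entry["digest"] = h.hexdigest()
    return entry


def read_manifest(lines):
    """Turn find(1)-style output (one path per line) into metadata entries."""
    return [describe_path(line.rstrip("\n")) for line in lines if line.strip()]
```

Note that os.lstat() raises if a listed path is missing, so the files
must already be installed before they can be registered this way.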

The advantage is KISS: it don't come any simpler (for ISV's and other lusers)
than providing a file manifest.

The disadvantage of a KISS file manifest is that indeed, the files must be
    1) actually present (and available) on a file system
    2) correctly installed.
Presumably the ISV (or other installer) is functional, or the ISV (or other
installer) would not be trying to register a "package", would it?

Other soft "branding" identification information, like vendor,
packager, description, build host, etc would need to be added to the
list of paths. While all of that information may be vitally important
to ISV's and LSB and installer GUI's, all that rpmlib needs is NEVRA
(N==name, E==epoch, etc), and mostly for human identification rather
than installer functioning purposes.

Not sure if epoch versioning is important for third-party software; if I
understand correctly, it is more of a distro tool for changing package
names etc. It might be safe to always set the epoch to some default
value. But maybe it also makes sense; you may know a good use case for
it.

I ignored the revision and set it to 1, but revisions could be quite
handy for ISVs too.


I speak in RPM NEVRA jargon, apologies.

Whether LSB "version" contains an Epoch: (or not) simply does not matter.

Always having Epoch: 0 (or equivalently, never including RPMTAG_EPOCH)
is all that is needed for identification purposes of "packages" using
RPM target metadata.

In fact, "version" and even "name" can be synthesized if/when/where
necessary. Presumably human lusers need more than "" as an identification
tag. The "" string in RPMTAG_NAME and RPMTAG_VERSION etc is more than
adequate to prevent rpmdb disasters.
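
To make the Epoch point concrete, here is a small sketch, assuming an
identification tuple in NEVRA order. Nevra and make_nevra are
hypothetical names for illustration, not rpmlib API; the only rules
taken from the discussion above are that a missing epoch equals Epoch: 0
and that "" is tolerated for the other fields.

```python
from typing import NamedTuple, Optional


class Nevra(NamedTuple):
    """Identification tuple: name, epoch, version, release, arch."""
    name: str
    epoch: int
    version: str
    release: str
    arch: str


def make_nevra(name: str = "", version: str = "", release: str = "",
               arch: str = "", epoch: Optional[int] = None) -> Nevra:
    # A missing epoch is treated exactly like Epoch: 0.
    # Empty strings are accepted for every other field.
    return Nevra(name, 0 if epoch is None else int(epoch),
                 version, release, arch)
```

With this, make_nevra("acme-tool", "1.2", "1") and the same call with
epoch=0 compare equal, which is all that identification needs.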

But clearly something better needs to be specified for "version" and "upgrade" and ...

Is a find(1) path list "gud enuf" as a starting point? Or do you want
to establish other, alternative, markup for expressing the necessary
data elements?

If you mean what I thought you meant, that would be OK. And another
question: do you mean to take the _data_ that is in a find(1) path list,
or also its _format_, abandoning the XML representation? The current
format is already a path list with some metadata added.

Other obviously complete and unsurprising candidates to describe
necessary data elements to be included in target metadata are "tar tvf"
and/or "ls -al". Those formats are explicit, no data is implicitly
derived from stat(2) of a file, and the file does not have to exist in
order to construct a representation of target metadata.
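
For contrast, a sketch of the explicit route, assuming the manifest
arrives as a tar archive. It uses Python's standard tarfile module to
read roughly what "tar tvf" prints; explicit_entries and its dictionary
keys are hypothetical names chosen for illustration.

```python
import io
import tarfile


def explicit_entries(tar_bytes):
    """List the metadata recorded in a tar archive, without touching disk."""
    entries = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            if member.isdir():
                kind = "dir"
            elif member.isfile():
                kind = "file"
            else:
                kind = "other"
            entries.append({
                "path": member.name,
                "type": kind,
                "mode": member.mode & 0o7777,  # permission bits only
                "uid": member.uid,
                "gid": member.gid,
                "size": member.size,
                "mtime": member.mtime,
            })
    return entries
```

Nothing here calls stat(2): every value comes from the archive, so the
listed files need not exist on the system doing the registration.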

I would go with the simple path list. With explicit stat data etc, we
run into the problem that the data in the manifest might get out of sync
with the installed files (as they may be changed during install).
Implicit stat data also means fewer changes in existing installers,
which most likely already do chmod's etc.


I hear "simple path list".

Yes there are many issues with implicit file metadata, all well known.

No matter what, a simple file list is the bare minimum expression of
target metadata. Without file paths, one has only disk blocks for ISV's
to sell. I understand that some disk manufacturer had a monopoly selling
disk blocks 15 years ago ...

(aside) that's a very dry & obscure joke, don't worry if it makes no sense.

But there are lots and lots of other markups that could/should be used
instead.

What representation of target metadata works for you?

In terms of content, find(1) path lists would be the best IMHO. We could
also take its representation (that is, a file with newline-separated
paths with somehow marked up metadata in front), but I think XML is
pretty nice because it is well-defined and relatively easy to parse.
Note that the backends don't have to deal with the manifest file format
as they already get the parsed binary representation of the manifest.


I change my question(s) to
    What representation(s) work for you?
    Which representation first? Which representation second? etc etc

We can't have the bikeshed discussions about whose metadata is better
without choices, now can we?

Personally, it's easier for me to write a parser for Yet Another Form of
Perfect Spewage than it is to try to understand whatever reason(s) there
are for using same. YMMV.

But if you want to design data structures to separate parsing from
packing with your RPM back-end, that works too. I'm just trying to
create a header for inclusion into an rpmdb. That assumes some content.
And a well-defined explicit markup permits efficient communication of
what data is needed, and where the content will be mapped into the
target metadata store.

73 de Jeff
______________________________________________________________________
RPM Package Manager                                    http://rpm5.org
LSB Communication List                                rpm-lsb@rpm5.org
