On Wed, 2008-06-25 at 11:26 -0400, Jeff Johnson wrote:
> On Jun 25, 2008, at 10:45 AM, Denis Washington wrote:
> 
> > On Wed, 2008-06-25 at 10:12 -0400, Jeff Johnson wrote:
> >> On Jun 24, 2008, at 1:38 PM, Denis Washington wrote:
> >>
> >>>
> >>>
> >>>> Sound like a plan? My primary goals here are two-fold:
> >>>>
> >>>> 1) avoiding disasters if bogus headers start to be added to an  
> >>>> rpmdb.
> >>>>
> >>>> 2) exposing rpmdbAdd() (and rpmdbRemove()) methods for use by
> >>>> LSB/ISV/whatever applications that wish to register/unregister
> >>>> software
> >>>> on RPM managed systems.
> >>>
> >>> Sounds like a good plan, yeah. I'm glad to be able to work with you
> >>> on this, as you certainly have a LOT more experience than I do with
> >>> it. Thank you very much!
> >>>
> >>
> >> No problem.
> >>
> >> Enumerating the necessary data elements that need to be present
> >> in a RPM header, and choosing _SOME_ representational markup,
> >> would seem to be on the critical path.
> >>
> >> (aside) with dpkg it's really the same fundamental problem, but a different
> >> target metadata representation. Ditto _your_package_manager_here
> >> for all instances of class.
> >>
> >> There are several existing representations of "package" manifests,
> >> both explicit and/or implicit that can be used to enumerate the
> >> necessary data elements to be included in the target metadata
> >> representation
> >> (note I did not say "rpmdb").
> >>
> >> Simplest by far is find(1) output of a tree. i.e. an explicit list
> >> of paths to files, with stat(2) and digest (and acls/xattrs and  
> >> selinux
> >> file contexts and whatever else is needed) implicitly derived from
> >> the tree.
> >
> > With "implicitly derived", do you mean "read from the installed files
> > instead of being explicitly in the manifest file"?
> >
> 
> Yes. Basically I mean populating target metadata with stat(2)
> info, not with explicitly parsed values.
> 
> The advantage is KISS: it don't come any simpler (for ISV's and other  
> lusers)
> than providing a file manifest.
> 
> The disadvantage of a KISS file manifest is that indeed, the files must be
>      1) actually present (and available) on a file system
>      2) correctly installed. Presumably the ISV (or other installer) is
>      functional, or the ISV (or other installer) would not be trying to
>      register a "package", would it?
> 
> >> Other soft "branding" identification information, like vendor,
> >> packager, description,
> >> build host, etc would need to be added to the list of paths. While
> >> all of that
> >> information may be vitally important to ISV's and LSB and installer
> >> GUI's,
> >> all that rpmlib needs is NEVRA (N==name, E==epoch, etc), and  
> >> mostly for
> >> human identification rather than installer functioning purposes.
> >
> > Not sure if epoch versioning is important for third-party software. If I
> > understand correctly, it is more of a distro tool for changing package
> > names etc. It might be safe to always set the epoch to some default
> > value. But maybe it also makes sense; you may know a good use case for
> > it.
> >
> > I ignored the revision and set it to 1, but revisions could be quite
> > handy for ISVs too.
> >
> 
> I speak in RPM NEVRA jargon, apologies.
> 
> Whether LSB "version" contains an Epoch: (or not) simply does not  
> matter.
> 
> Always having Epoch: 0 (or equivalently, never including RPMTAG_EPOCH)
> is all that is needed for identification purposes of "packages" using
> RPM target metadata.

I think Epoch:0 for all packages would be OK. I don't see where
third-party app vendors would really need epochs.

> In fact, "version" and even "name" can be synthesized if/when/where  
> necessary. Presumably
> human lusers need more than "" as an identification tag. The ""  
> string in RPMTAG_NAME
> and RPMTAG_VERSION etc is more than adequate to prevent rpmdb disasters.

As already stated, "version" needs some specified format and a
guaranteed comparison scheme (as package managers seem to handle this
differently in some cases). IMHO the package name should be as defined
in the LSB, that is, lsb-<provider>-<name>.
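
To make these two requirements concrete, here is a minimal Python sketch
of a name check and a version comparison. Everything in it is an
assumption for illustration: the allowed characters in the name pattern
are not taken from the LSB text, the comparison rule is a simple
segment-wise scheme and not rpmlib's or dpkg's actual algorithm, and the
helper names (split_name, compare_versions) are hypothetical.

```python
import re
from itertools import zip_longest

# Hypothetical pattern for lsb-<provider>-<name>. The allowed characters
# are an assumption, not taken from the LSB specification.
NAME_RE = re.compile(
    r"^lsb-(?P<provider>[a-z0-9.]+)-(?P<name>[a-z0-9][a-z0-9.+-]*)$"
)


def split_name(pkg_name):
    """Return (provider, name), or raise ValueError if the name does not match."""
    m = NAME_RE.match(pkg_name)
    if m is None:
        raise ValueError("not of the form lsb-<provider>-<name>: %r" % pkg_name)
    return m.group("provider"), m.group("name")


def _segments(version):
    # Split into runs of digits and runs of letters; separators are dropped.
    return re.findall(r"\d+|[A-Za-z]+", version)


def compare_versions(a, b):
    """Return -1, 0 or 1 for a < b, a == b, a > b.

    Assumed rule: numeric segments compare as integers and sort after
    alphabetic ones; if all shared segments are equal, the version with
    extra segments is the newer one.
    """
    for x, y in zip_longest(_segments(a), _segments(b)):
        if x is None:
            return -1
        if y is None:
            return 1
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return -1 if xi < yi else 1
        elif x.isdigit() != y.isdigit():
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    return 0
```

Whatever rule ends up being specified, the point is that it is written
down once and every back-end applies the same one, so that "1.10" versus
"1.9" is not decided differently by different package managers.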

> But clearly better needs to be specified with "version" and "upgrade"  
> and ...
> 
> >> Is a find(1) path list "gud enuf" as a starting point? Or do you want
> >> to establish
> >> other, alternative, markup for expressing the necessary data  
> >> elements.
> >
> > If you mean what I thought you meant, that would be OK. And another
> > question: do you mean to take the _data_ that is in a find(1) path list,
> > or also its _format_, abandoning the XML representation? The current
> > format is already a path list with some metadata added.
> >
> >> Other obviously complete and unsurprising candidates to describe
> >> necessary
> >> data elements to be included in target metadata are "tar tvf" and/or
> >> "ls -al".
> >> Those formats are explicit, no data is implicitly derived from stat
> >> (2) of a file,
> >> and the file does not have to exist in order to construct a
> >> representation
> >> of target metadata.
> >
> > I would go with the simple path list. With explicit stat data etc, we
> > run into the problem that the data in the manifest might get out of sync
> > with the installed files (as the installer may change it during install).
> > Implicit stat data also means fewer changes in existing installers, which
> > most likely already do chmod's etc.
> >
> 
> I hear "simple path list".
> 
> Yes there are many issues with implicit file metadata, all well known.
> 
> No matter what, a simple file list is the bare minimum expression of
> target metadata.
> Without file paths, one has only disk blocks for ISV's to sell. I
> understand that some disk manufacturer had a monopoly selling disk blocks
> 15 years ago ...
> 
> (aside) that's a very dry & obscure joke, don't worry if it makes no  
> sense.

Good. For one split second I thought I had no sense of humor.

> >> But there's lots and lots of other markups that could/should be used
> >> instead.
> >>
> >> What representation of target metadata works for you?
> >
> > In terms of content, find(1) path lists would be the best IMHO. We could
> > also take its representation (that is, a file with newline-separated
> > paths with somehow marked up metadata in front), but I think XML is
> > pretty nice because it is well-defined and relatively easy to parse.
> > Note that the backends don't have to deal with the manifest file format
> > as they already get the parsed binary representation of the manifest.
> >
> 
> I change my question(s) to
>      What representation(s) work for you?
>      Which representation first? Which representation second? etc etc
> 
> We can't have the bikeshed discussions about whose metadata is better
> without choices, now can we?

True.

My first choice would be a path list, with stat data etc. being implicit.
That's also how the current package manifest format is designed. (By the
way, it is impossible for one of the files not to exist: empty "stub
files" are created at the specified paths by the daemon, and the parent
directories are read-only, so the installer cannot remove them.) A
representation with explicit file metadata is sub-optimal because the
installer might not do as specified in the manifest. We would end up
having to stat all files anyway to see if the provided information is
correct; thus, we might as well leave that metadata out of the manifest
file.
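
As a rough illustration of what "implicit" means here, the following
Python sketch takes a newline-separated path list and derives the file
metadata from the installed files themselves. It is only a sketch under
assumptions: the function names (describe_path, read_manifest), the
dictionary keys, and the choice of SHA-256 as the digest are all
hypothetical and are not taken from the manifest spec or from the RPM
back-end. ACLs, xattrs and SELinux contexts are left out.

```python
import hashlib
import os
import stat


def describe_path(path):
    """Collect metadata for one installed path without following symlinks."""
    st = os.lstat(path)
    entry = {
        "path": path,
        "mode": stat.S_IMODE(st.st_mode),  # permission bits only
        "type": stat.S_IFMT(st.st_mode),   # file type bits only
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "mtime": int(st.st_mtime),
        "digest": None,                    # only set for regular files
    }
    if stat.S_ISREG(st.st_mode):
        h = hashlib.sha256()  # digest algorithm is an assumption
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        entry["digest"] = h.hexdigest()
    return entry


def read_manifest(lines):
    """Turn newline-separated paths (find(1)-style) into metadata entries."""
    paths = (line.rstrip("\n") for line in lines)
    return [describe_path(p) for p in paths if p]
```

Because os.lstat() raises an error for a missing path, a manifest that
lists a file which was never installed fails loudly instead of producing
a partial record.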
  
> Personally, it's easier for me to write a parser for Yet Another Form of
> Perfect Spewage than it is to try to understand whatever reason(s) there
> are for using same. YMMV.
> 
> But if you want to design data structures to separate parsing from  
> packing
> with your RPM back-end, that works too. 

No parsing needs to be done by the backend. There are data structures
already; see the spec and the RPM back-end code. They contain the package
name, version, and arch (the format of the 'arch' value still needs to be
specified - aren't there target architecture names specified in the
LSB?), and a linked list of all files (so basically a find(1) path list).
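
For readers without the spec at hand, the shape of that data can be
sketched roughly as follows in Python. This is not the actual structure
from the spec or the back-end code: the class name, field names, the
absolute-path check and the label format are hypothetical, and a plain
list stands in for the linked list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PackageManifest:
    """Hypothetical stand-in for the parsed manifest handed to a back-end."""
    name: str      # e.g. lsb-<provider>-<name>
    version: str   # format and comparison scheme still to be specified
    arch: str      # format still to be specified
    files: List[str] = field(default_factory=list)

    def add_file(self, path):
        # Assumption: only absolute paths are meaningful in a manifest.
        if not path.startswith("/"):
            raise ValueError("path must be absolute: %r" % path)
        self.files.append(path)

    def label(self):
        # Human-readable identification only; not an RPM NEVRA string.
        return "%s-%s.%s" % (self.name, self.version, self.arch)
```

A back-end would walk the files member and stat each entry itself, which
is why no file metadata appears in the structure.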

> I'm just trying to create a  
> header for
> inclusion into a rpmdb. That assumes some content. And a well-defined
> explicit markup permits efficient communication of what data is needed,
> and where the content will be mapped into the target metadata store.

I hope what is in the data structures is sufficient and well-defined
enough. And, as I am increasingly inclined to believe, that we aren't
talking past each other. ;)

Regards,
Denis

______________________________________________________________________
RPM Package Manager                                    http://rpm5.org
LSB Communication List                                rpm-lsb@rpm5.org
