Hi Graham,

I'm also wondering about your combination of content-type and internal
packaging format. The media feature description framework [2] was
intended to capture this kind of combination of features in a more
structured fashion. Thus, I would imagine something like:

Accept-media-features:
(| (& type="application/zip" package=METSDSpaceSIP);q=1.0
(& type="application/atom+xml" atomtype=entry package=AtomSIP);q=0.8 )

This would require IANA registration of the new header field, and of new
media features called "package" and "atomtype", per [4]. Feature "type"
is already registered [3].

[2] http://tools.ietf.org/html/rfc2533

[3] http://tools.ietf.org/html/rfc2913

[4] http://tools.ietf.org/html/rfc2506

This seems like the "proper" way to do it, and just comes with the
same caveats as above. What is the status of this RFC? Are we safe to
go ahead and use it? (incidentally, we wouldn't need to register
atomtype, as the param type=entry is part of the mimetype already).

The RFCs are "proposed standard", which means they are at the first step
on the standards track. Some of them have been used in the Internet fax
work. But that said, I think that to date there is little implementation
experience. The expression format is based on LDAP search filters, and
is easy to parse.

Do you have a ref to any code that can already do this? I looked into the Python LDAP bindings but these search filters are all passed around as strings and I don't see any good ways of getting programmatic access to the media feature sets for free.

Cheers,

Richard



I remember having some discussions several years ago with someone from
the team that was standardizing SIP who were looking at using this, and
they were concerned that the full-blown media feature matching was too
complex. What we discussed at the time was to define a profile that
restricted the form of feature expression to union of conjunction (aka
disjunctive normal form) - which is also adopted in the example I gave
previously. I'm not sure what came of those discussions, but I think
this would be a reasonable path, which keeps the early implementations
simple but does leave a possibility of introducing more complex matching
patterns later if required (subject to migration constraints).
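
As a rough illustration of how small such a restricted profile could be,
here is a minimal Python sketch that parses a union of conjunctions and
picks the best-matching alternative. It follows the simplified syntax of
the Accept-media-features example earlier in this thread, not the full
RFC 2533 grammar (which, as far as I can tell, brackets each feature
comparison individually). The names parse_dnf and best_match are
hypothetical, and matching is plain string equality only.

```python
import re

# One conjunction with an optional q-value, e.g.
#   (& type="application/zip" package=METSDSpaceSIP);q=1.0
CONJ_RE = re.compile(r'\(&\s*([^()]*)\)\s*(?:;\s*q\s*=\s*([0-9.]+))?')
# One feature=value pair; the value may be quoted or bare.
PAIR_RE = re.compile(r'([A-Za-z][\w-]*)\s*=\s*(?:"([^"]*)"|([^\s"]+))')


def parse_dnf(expr):
    """Parse a union of conjunctions into a list of (features, q) tuples."""
    expr = expr.strip()
    if not (expr.startswith('(|') and expr.endswith(')')):
        raise ValueError('expected a top-level (| ... ) expression')
    alternatives = []
    for m in CONJ_RE.finditer(expr[2:-1]):
        features = {}
        for name, quoted, bare in PAIR_RE.findall(m.group(1)):
            features[name] = quoted if quoted else bare
        # A missing q-value is treated as 1.0 (an assumption of this sketch).
        q = float(m.group(2)) if m.group(2) else 1.0
        alternatives.append((features, q))
    return alternatives


def best_match(alternatives, offered):
    """Return (features, q) for the highest-q alternative that the offered
    feature set satisfies by simple equality, or None if nothing matches."""
    candidates = [
        (q, feats) for feats, q in alternatives
        if all(offered.get(k) == v for k, v in feats.items())
    ]
    if not candidates:
        return None
    q, feats = max(candidates, key=lambda c: c[0])
    return feats, q


expr = ('(| (& type="application/zip" package=METSDSpaceSIP);q=1.0 '
        '(& type="application/atom+xml" atomtype=entry package=AtomSIP);q=0.8 )')
alternatives = parse_dnf(expr)
atom_offer = {'type': 'application/atom+xml', 'atomtype': 'entry',
              'package': 'AtomSIP'}
chosen = best_match(alternatives, atom_offer)
```

Keeping each alternative as a flat dictionary means negation, ranges and
nested expressions are simply not representable, which is the point of
the restricted profile.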

(It's roughly equivalent to an ALC description logic, but I didn't know
about that at the time; the full feature expression matching is
basically a structural subsumption reasoner.)

I'd be interested in the rest of the advisory panel's and project
team's opinions on this, as this may well define the way that I work
for the duration of the project.

Initially, I think we need to be clear about the requirements.

(BTW, if we get some implementation experience, I may attempt to
progress the specs to Draft Standard status.)

#g
--


Richard Jones wrote:
Hi Folks,

Thanks, this is really great stuff.

On 10/01/11 16:05, Robert D. Sanderson wrote:
On 7 January 2011 17:36, Richard Jones<rich...@oneoverzero.com> wrote:
2/ Define an extension to the application/zip mimetype which allows us to
specify the package format as a parameter. So we could, for example, specify
a parameter "swordpackage" which can take the URI of a package format, and
construct mimetypes like

application/zip;swordpackage=uri:METSDSpaceSIP
(see http://tools.ietf.org/html/draft-ietf-atompub-typeparam-00)

The questions here are: is this a legitimate extension/approach, would this
break anything else on the web in general, and is it naive to assume that
all packages have the top level mimetype of application/zip?


First of all, no, it's not legitimate to invent new parameters for
existing mime types.

RFC 2048, Section 2.2.3
... the names, values, and meanings of any parameters must
be fully specified when a media type is registered in the IETF tree
...

http://www.faqs.org/rfcs/rfc2048.html

So it's not legal to create a parameter swordpackage and attach it to the
existing application/zip.

Ok, it sounds like this option then is simply out altogether.

More generally, the HTTP specification defines the accept headers as:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

Note that the extension parameter here is for the header, not the mimetype.
The BNF allows the accept-extension rule ONLY after the mandatory q value
in accept-params.

I have to confess to having overlooked the accept-extension rule, as
there wasn't an example of usage in that document. Can you give us an
example as to how that is used?
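
To make the grammar concrete, here is a minimal Python sketch of how one
element of an Accept header splits under the RFC 2616 rules: parameters
before q belong to the media range, and anything after q is an
accept-extension. The function name split_accept_element and the
"package" extension in the sample value are hypothetical, used only for
illustration; no such extension is registered.

```python
def split_accept_element(element):
    """Split one Accept element into
    (media_type, media_params, q, accept_extensions)."""
    parts = [p.strip() for p in element.split(';')]
    media_type = parts[0]
    media_params, extensions = {}, {}
    q = None
    for part in parts[1:]:
        name, _, val = part.partition('=')
        name, val = name.strip(), val.strip().strip('"')
        if q is None and name == 'q':
            # The q parameter marks the boundary between the two groups.
            q = float(val)
        elif q is None:
            media_params[name] = val      # belongs to the media type
        else:
            extensions[name] = val        # belongs to the Accept header
    # A missing q is treated as the default of 1.
    return media_type, media_params, (q if q is not None else 1.0), extensions


result = split_accept_element(
    'application/atom+xml;type=entry;q=0.8;package=AtomSIP')
```

Here type=entry stays with the mimetype, while package=AtomSIP is seen
only as an extension to the header.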

Which means using Accept-Encoding in this way is potentially problematic,
but Accept does have provision that would make this use legitimate.

Like mime types, content encodings also have a registry.
See: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html section 3.5

Basically, there are two routes to avoid breaking the rules, neither easy:

1. Register new Mime Types for every packaging format.

2. Use an x- header and eventually write an RFC to standardize it.

We looked at this in both SRU (eg what it would take to have a wrapper
format and an internal format: SRU vs Atom, wrapping Simple DC vs METS)
and in conjunction with the digital format registry for preservation
purposes.

So my recommendation would be to go with a new x- header, and if/when the
community has implemented it, take it to an RFC.

It's looking like a separate header is the way to do this, with the
following couple of options immediately standing out:

Accept-Features (or X-Accept-Features if it isn't sufficiently official)
X-Packaging
X-Accept-Packaging (which I just made up for the purposes of this discussion)

Some comments on these:

Accept-Features
Having looked at the document [1] (thanks Graham (K)) it looks like it
would give us the leeway that we need to describe requirements while
ensuring that Graham (T)'s concerns (which I share) about matching up
package format requirements with mimetypes would be dealt with. On the
other hand, this document is 12/13 years old and the header has not
made it into the HTTP content negotiation documentation and is
significantly different in format to all the other Accept- headers. It
could also be a substantial effort for servers to implement the full
requirements of this header.

X-Packaging
I'm against using this in this way as it is already used to alert the
server during POST as to the package format that is being supplied.
The format of the header for content negotiation would have to be
totally different to this usage: a list of package formats and q
values for example, rather than a single definitive URI. I see scope
for confusion.

X-Accept-Packaging
Given my concerns about X-Packaging and the comments above about
Accept-Features, perhaps there is a middle ground that we can define
which does something more minimal with just mimetypes, package formats
and q values in a way similar to having a mimetype that has added
parameters.

For example:
Accept: application/zip; q=1.0, application/atom+xml;type=entry;q=0.8

X-Accept-Packaging: application/zip;{package=METSDSpaceSIP};q=1.0,
application/atom+xml;type=entry;{package=AtomSIP};q=0.8

Or some other suitably neat and unambiguous serialisation which is in
line with how the other Accept- headers work and also gives us the
information we want in a totally definitive mimetype<->package format
way. This could be supplied alongside the usual Accept header so that
clients which can't generate the X-Accept-Packaging header can fall
back easily to the usual content negotiation route.
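
To show how little a server would need in order to read this, here is a
minimal Python sketch that parses the serialisation used in the example
above. It is only an illustration of that one hypothetical format: the
function name parse_accept_packaging is made up, and the sketch assumes
that values never contain commas or semicolons.

```python
def parse_accept_packaging(value):
    """Parse the proposed X-Accept-Packaging value into a list of dicts,
    sorted by descending q."""
    entries = []
    for item in value.split(','):
        parts = [p.strip() for p in item.split(';') if p.strip()]
        if not parts:
            continue
        entry = {'mimetype': parts[0], 'params': {}, 'package': None, 'q': 1.0}
        for part in parts[1:]:
            if part.startswith('{') and part.endswith('}'):
                # Braced section carries the package format.
                name, _, val = part[1:-1].partition('=')
                if name.strip() == 'package':
                    entry['package'] = val.strip()
            else:
                name, _, val = part.partition('=')
                if name.strip() == 'q':
                    entry['q'] = float(val)
                else:
                    # Ordinary mimetype parameter such as type=entry.
                    entry['params'][name.strip()] = val.strip()
        entries.append(entry)
    return sorted(entries, key=lambda e: e['q'], reverse=True)


header = ('application/zip;{package=METSDSpaceSIP};q=1.0, '
          'application/atom+xml;type=entry;{package=AtomSIP};q=0.8')
preferences = parse_accept_packaging(header)
```

Each entry pairs exactly one mimetype with one package format, so the
server never has to guess which package goes with which mimetype.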

Thoughts?

Cheers,

Richard

[1] http://www.ietf.org/rfc/rfc2295.txt









