Okay, here are the distilled rules, which we must at least stay within.
Format of a mimetype
type/subtype; name1=value1; name2=value2
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
parameter = attribute "=" value
attribute = token
value = token | quoted-string
Axioms:
The order of the name/value pairs is not important.
Name/value pairs are separated by a ; which may be followed by whitespace.
Tokens cannot contain whitespace.
Values can be case-sensitive (type, subtype and attribute names are not).
Every value except the last is followed by a ;
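As a rough illustration of the grammar and axioms above, here is a minimal Python sketch of a parser. The name parse_mimetype is made up for this mail and is not part of ECM or Fedora. The token character set is my reading of RFC 2616, so treat it as an assumption.

```python
import re

# RFC 2616 "token": one or more characters that are not CTLs or separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# quoted-string: double quotes around text, with backslash escapes.
QUOTED = r'"(?:[^"\\]|\\.)*"'
MEDIA = re.compile(rf"({TOKEN})/({TOKEN})")
PARAM = re.compile(rf"\s*;\s*({TOKEN})=({TOKEN}|{QUOTED})")


def parse_mimetype(text):
    """Return (type, subtype, params); raise ValueError if malformed.

    Hypothetical helper: type, subtype and attribute names are lowercased,
    parameter values are kept as written.
    """
    text = text.strip()
    head = MEDIA.match(text)
    if head is None:
        raise ValueError(f"not a media type: {text!r}")
    params = {}
    pos = head.end()
    while pos < len(text):
        m = PARAM.match(text, pos)
        if m is None:
            raise ValueError(f"bad parameter at offset {pos}: {text!r}")
        name, value = m.group(1).lower(), m.group(2)
        if value.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        params[name] = value
        pos = m.end()
    return head.group(1).lower(), head.group(2).lower(), params
```

A validator for mimetype specifications could simply call this and report the ValueError message.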
Okay, the rules for mime-types are a bit more complex than I originally
thought.
1. Should we implement these rules in the rules for the content model,
in order to allow people to validate their mimetype specifications? I.e.
to avoid having a content model that requires objects to use a wrongly
formatted mimetype.
Next, how to compare two mime-types for equality.
Basically, compare textually up to the first ;
then split the remainder on ;
split each piece on =
sort on the names.
Compare the two sorted lists
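The steps above can be sketched in Python roughly like this. The function names are invented for illustration. I am assuming that type, subtype and parameter names compare case-insensitively while values compare exactly, and the naive split on ; does not cope with a quoted value that itself contains a ;

```python
def normalise(mimetype):
    """Split a mimetype into (type/subtype, sorted list of (name, value))."""
    # Everything before the first ";" is the type/subtype part.
    essence, _, rest = mimetype.partition(";")
    params = []
    for chunk in rest.split(";"):
        if not chunk.strip():
            continue  # empty piece, e.g. when there are no parameters
        name, _, value = chunk.partition("=")
        params.append((name.strip().lower(), value.strip()))
    # Sorting makes the comparison independent of parameter order.
    return essence.strip().lower(), sorted(params)


def mimetypes_equal(a, b):
    """Hypothetical equality test following the steps listed above."""
    return normalise(a) == normalise(b)
```

With this, "text/xml; charset=UTF-8; version=1.0" and "text/xml; version=1.0; charset=UTF-8" compare equal, while charset=UTF-8 and charset=utf-8 do not, which is exactly the open question Steve raised.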
Now, how would this help the original problem? Steve wanted to have a
more specific specification in the data object than in the content
model. Generally, we would need to create an inheritance tree for the
mime-types.
If the content model requires text/plain, then text/xml should be ok. We
will also need to define equivalent mime types, such as text/xml and
application/xml. There is a sizable document about this at
http://tools.ietf.org/html/rfc3023
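To make the inheritance idea concrete, here is a small Python sketch. The two tables are hypothetical and only hold the examples from this mail; a real validator would need a maintained registry, and whether application/xml should really count as a kind of text/plain is an assumption here, not a settled rule.

```python
# Hypothetical tables; the entries only cover the examples in this mail.
PARENT = {"text/xml": "text/plain"}        # child -> more general type
ALIASES = {"application/xml": "text/xml"}  # equivalent name -> canonical name


def canonical(mimetype):
    """Map an equivalent mime type onto its canonical name."""
    return ALIASES.get(mimetype, mimetype)


def is_a(actual, required):
    """True if `actual` equals `required` or descends from it in PARENT."""
    current = canonical(actual)
    target = canonical(required)
    while current is not None:
        if current == target:
            return True
        current = PARENT.get(current)  # walk one step up the tree
    return False
```

So a data object with text/xml would satisfy a content model asking for text/plain, but not the other way round.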
If the content model declares a mimetype parameter, it should be
required in the data objects, but should the data objects be allowed to
have additional parameters?
Should the content model have a way to specify that the data object
must have the exact mimetype, with no additional parameters?
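One possible answer to both questions, sketched in Python. The names satisfies and exact are invented for this mail, and the splitting is deliberately naive (no quoted-string handling). By default every parameter declared in the content model must be present with the same value in the data object, and extra parameters are tolerated; exact=True switches the tolerance off.

```python
def split_mimetype(mimetype):
    """Return (type/subtype, dict of parameters), names lowercased."""
    essence, _, rest = mimetype.partition(";")
    params = {}
    for chunk in rest.split(";"):
        name, sep, value = chunk.partition("=")
        if sep:  # ignore pieces without "=", such as the empty remainder
            params[name.strip().lower()] = value.strip()
    return essence.strip().lower(), params


def satisfies(declared, actual, exact=False):
    """Hypothetical check of a data object mimetype against a content model."""
    d_type, d_params = split_mimetype(declared)
    a_type, a_params = split_mimetype(actual)
    if d_type != a_type:
        return False
    if exact:
        return d_params == a_params
    # Lenient mode: declared parameters are required, extras are allowed.
    return all(a_params.get(k) == v for k, v in d_params.items())
```

With this, a content model declaring text/xml accepts "text/xml; charset=UTF-8" from the original report, unless the exact flag is set.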
Should we implement the same lenient rules for format_uri? I.e. should
the validator understand how some format URIs can be descendants
of each other?
These are just some thoughts on the issue. I do feel that the current
design, where you can specify a number of mimetypes in the content
model, and the object is valid if at least one of them matches, is fine
for all use cases I can think of.
Regards
On 12/19/2011 04:45 PM, Stephen Bayliss wrote:
Hi Asger
That's a good point. Presumably a definition as per
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.17 with
the media types as per
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7
So from that the main type has to go first and my guess is that the
order of the parameters is not important. Though I wonder if there
are further levels of complexity; eg could charset be utf-8 and UTF-8
and both be equivalent? (It looks like utf-8 is the canonical form
though for text/xml)
Steve
-----Original Message-----
*From:* Asger Askov Blekinge [mailto:a...@statsbiblioteket.dk]
*Sent:* 19 December 2011 14:29
*To:* fedora-commons-users@lists.sourceforge.net
*Subject:* Re: [fcrepo-user] ECM validation of MIMETYPE
Yes, but we would need to specify some rules then.
charset is not the only parameter allowed; I do believe this is an
open-ended set. I do know people have been using "version" as well.
So, I would need to know how to split a mime-type, and whether the order
of the parameters is important.
Secondly, you can of course specify multiple form statements in
the content model; the requirement is just that ONE of them matches.
So, specify the various allowed charsets, and one without charset,
and you should be safe.
Regards
On 12/15/2011 01:03 PM, Stephen Bayliss wrote:
As far as I can tell, ECM validation of a datastream’s MIMETYPE
is strict – the entire MIMETYPE property contents have to match
that declared in the content model.
What about the case where one might want to specify the MIMETYPE
of a datastream in the CModel, but not the character set? If I
specify MIMETYPE as “text/xml” in the CModel and as “text/xml;
charset=UTF-8” in the object, it fails validation.
Would it make sense to only validate charset if it is defined in
the content model?
Regards
Steve