On Fri, Jan 08, 2021 at 09:38:50AM +0100, Alexandre Duret-Lutz wrote:
> 
> Recent versions of media-types (1.1.0 and 2.0.0) have introduced some
> lines with extension appearing in both lowercase and uppercase form:

Hi Alexandre, thank you for your report,

I was also wondering about case sensitivity when I worked on the 2.0.0
update.  One of my current problems is that there does not seem to be
a written specification for such details in /etc/mime.types.

> audio/AMR                                       amr AMR
> audio/AMR-WB                                    awb AWB
> audio/EVRC-QCP                                  qcp QCP

The IANA assignment pages for these three types list both the lower- and
the uppercase suffixes, so I decided to stick to that.

> image/t38                                       t38 T38

When I merged the mime.types from Fedora into Debian's, Fedora had the
lowercase one and Debian the uppercase one.  The IANA assignment
declares no extension.  I decided to keep both.  Perhaps it is not
the best decision.

> image/vnd.globalgraphics.pgb                    PGB pgb

This type also declares both cases in IANA's assignment.

> Some tools will complain if some entries in mime.types have duplicate
> extensions (in some case-insensitive sense).  For instance the above
> lines are causing Bug#979232 for lighttpd.

I am glad that you could solve the bug easily on lighttpd's side.  By
the way I am preparing an update that reverts all case-sensitive
duplicates for this release cycle, and will surely do the same for
case-insensitive ones if they cause serious bugs elsewhere.
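
To make the kind of duplicate discussed here concrete, below is a minimal
Python sketch that lists extensions appearing in more than one case form.
It assumes the usual layout of one type per line followed by
whitespace-separated extensions, with `#` starting a comment; the function
names and the sample text are only illustrative and not part of any
packaged tool.

```python
from collections import defaultdict


def parse_mime_types(text):
    """Yield (mime_type, [extensions]) for each non-comment entry."""
    for line in text.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        mime_type, *extensions = line.split()
        yield mime_type, extensions


def case_insensitive_duplicates(text):
    """Map each lowercased extension to its spellings, if it has several."""
    spellings = defaultdict(set)
    for _mime_type, extensions in parse_mime_types(text):
        for ext in extensions:
            spellings[ext.lower()].add(ext)
    return {key: sorted(forms) for key, forms in spellings.items()
            if len(forms) > 1}


# Sample entries taken from the lines quoted in this thread.
SAMPLE = """\
audio/AMR                                       amr AMR
image/t38                                       t38 T38
application/vnd.sar                             SAR
"""

if __name__ == "__main__":
    for key, forms in case_insensitive_duplicates(SAMPLE).items():
        print(key, "->", " ".join(forms))
```

An extension listed only in uppercase, such as SAR, is not reported, since
it has a single spelling.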

> So the question is what is the intended semantics for the above lines?
> Is "audio/AMR amr AMR" really meant to achieve more than "audio/AMR amr"?

The problem here is that I have no comprehensive information on how
software uses the mime.types files.  I cannot rule out that some programs
rely on case sensitivity for their own good reasons, so if no other bugs
arise, I would like to continue to stick to the information provided by
the IANA.

> Also note that mime.types lists some extensions with only uppercase
> versions, or a mix of lower and upper case letters:
> 
> application/vnd.sar                             SAR
> application/vnd.ves.encrypted                   VES

These two I imported from Fedora and I checked that they are consistent
with IANA.

> application/x-font-pcf                          pcf pcf.Z

This one has been in Debian for a long time.

> Those entries will behave differently for "see" and Python's
> "mimetypes.guess_type()".   For instance "see" will consider "foo.sar"
> as application/vnd.sar, but "mimetypes.guess_type()" will not.

I cannot tell which approach is wiser...  The mime.types file is not
comprehensive and is usually lagging.  What if there is another file
format around that uses the lowercase `sar` extension?
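
As a rough illustration of the two behaviours, here is a minimal Python
sketch contrasting an exact-case lookup with a case-folded one.  It is not
the actual code of "see" or of Python's mimetypes module, and the type
`application/x-example-sar` is a made-up entry standing in for "another
file format" with a lowercase extension.

```python
import os.path


def build_table(text):
    """Map each extension, spelled as in the file, to its type (first wins)."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        for ext in fields[1:]:
            table.setdefault(ext, fields[0])
    return table


def lookup_exact(table, filename):
    """Case-sensitive lookup: 'sar' and 'SAR' are different keys."""
    ext = os.path.splitext(filename)[1][1:]
    return table.get(ext)


def lookup_folded(table, filename):
    """Case-insensitive lookup: keys and extension are lowercased first."""
    folded = {}
    for key, mime_type in table.items():
        folded.setdefault(key.lower(), mime_type)
    ext = os.path.splitext(filename)[1][1:].lower()
    return folded.get(ext)


ONLY_UPPER = "application/vnd.sar SAR\n"
# Hypothetical second entry, invented for this example only.
WITH_CLASH = ONLY_UPPER + "application/x-example-sar sar\n"

if __name__ == "__main__":
    for name, text in (("only SAR", ONLY_UPPER), ("SAR and sar", WITH_CLASH)):
        table = build_table(text)
        print(name, lookup_exact(table, "foo.sar"),
              lookup_folded(table, "foo.sar"))
```

With only the uppercase entry, the exact lookup finds nothing for
"foo.sar" while the folded one returns application/vnd.sar.  Once a
lowercase entry exists, the exact lookup can tell the two apart, whereas
the folded lookup has to pick one of them (here, whichever comes first).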

> It would be nice to clarify the semantics in the comments at the top of
> mime.types.

Definitely! I hope to do so or write a proper man page after I dig into
the history of that file.

Have a nice day,

Charles

-- 
Charles Plessy                         Nagahama, Yomitan, Okinawa, Japan
Tooting from work,           https://mastodon.technology/@charles_plessy
Tooting from home,                 https://framapiaf.org/@charles_plessy
