URL work in HTML 5

2012-10-13 Thread Larry Masinter
I know there are a lot of private conversations about this, but I'd like to 
try, in the time frame of the next W3C TPAC and IETF meetings, to work out a 
solution to the issue of forking the URL specifications. Does everyone know 
what the issues are?
Is everyone willing to talk about solutions?

I think forking is harmful and unnecessary. 

Bcc:  public-ietf-...@w3.org  IETF W3C Liaison
   public-webapps@w3.org W3C Web Applications group chartered to work 
on something in W3C URL related
 www-...@w3.org W3C Technical Architecture Group, since we discussed 
it
 public-...@w3.org mailing list of IETF IRI working group, 
responsible for IRI spec

Did I leave anyone out?

Larry
--
http://larry.masinter.net




Re: Test suites and RFC2119

2011-07-18 Thread Larry Masinter
It's best to view RFC 2119 in the context of IETF rules for interoperability:

Progression along standards track depends on there being multiple independent 
interoperable implementations of every feature.

While feature is not clearly defined, I believe that in a well-written 
specification, any normative language (MUST, MAY, SHOULD) is associated with a 
feature.

Two implementations interoperate and are interoperable implementations of a 
feature if they both implement the feature and interoperate.
"Interoperate" is interpreted slightly differently depending on context, but if 
you think a web page is an implementation and a browser is an implementation, 
then they interoperate if the browser does what the author of the web page 
expected.

MAY features are not required to be implemented, but if they are implemented, 
they should operate as described in the spec, and interoperable 
implementations should treat the feature as expected.

MUST features are required to be implemented.  A well-written spec only 
mandates MUST in cases where it is required for interoperability.

For test suites, it seems best to treat 'SHOULD' as if it were 'MUST but you 
can apply for a waiver'.

That is, test cases should test for SHOULD, while recognizing that there are 
situations or contexts where interoperability is accomplished some other way. 
The way to test for SHOULD is to test as if it were MUST, but to let the 
implementor provide a convincing proof of why not following the normative 
language does not interfere with interoperability.
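
As a rough illustration of the "MUST but you can apply for a waiver" idea, 
here is a minimal Python sketch of how a test harness might classify results. 
The names (Requirement, Result, verdict) and the verdict strings are 
hypothetical and do not come from any W3C or IETF tool; a "waived" result 
would still need a human to judge whether the justification is convincing.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    ident: str
    level: str  # "MUST", "SHOULD", or "MAY" (RFC 2119 keywords)


@dataclass
class Result:
    requirement: Requirement
    implemented: bool   # did the implementation attempt the feature at all?
    passed: bool        # did the test case for it pass?
    waiver: str = ""    # implementor's justification for not following a SHOULD


def verdict(result):
    """Classify one test result, testing SHOULD as if it were MUST."""
    level = result.requirement.level
    # MAY features are optional: not implementing one is not a failure.
    if level == "MAY" and not result.implemented:
        return "not-applicable"
    if result.passed:
        return "pass"
    # A failed SHOULD may be excused, but only with a stated justification.
    if level == "SHOULD" and result.waiver.strip():
        return "waived"
    return "fail"
```

With this sketch, a failed SHOULD test with no justification is reported 
exactly like a failed MUST test.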


Larry
--
http://larry.masinter.net



FYI: review of draft-abarth-mime-sniff-03

2010-01-20 Thread Larry Masinter
Since raised on W3C TAG 
http://lists.w3.org/Archives/Public/www-tag/2010Jan/0076.html:

I reviewed draft-abarth-mime-sniff. I'm not sure I found all of the past 
discussion on the document, and I probably got some wrong, but it hasn't been 
updated in quite a while.

I sent the review to apps-discuss (since it deals with non-HTTP sniffing as 
well):

http://www.ietf.org/mail-archive/web/apps-discuss/current/msg01250.html

(discussion on apps-disc...@ietf.org)

Since there are several W3C documents advancing that make normative reference 
to this, getting timely review should be a priority.

Larry
--
http://larry.masinter.net




RE: [public-webapps] Comment on Widget URI (7)

2009-12-16 Thread Larry Masinter
(bcc public-webapps since not as relevant)

I actually think the TAG discussions about versioning and the use of version
indicators have been helpful, but it's been hard to drive this to a publication,
because there's still some work to be done. However, I think the main insight
I've had is that version indicators have limited (but non-zero) utility in 
situations where the popular language implementations evolve independently
of published language specifications. Normally, if language implementations
follow language specifications closely, you can use the version number of
the specification as a good indicator of the version number of the language.

However, in situations like HTML, where the implementations have evolved
-- and are likely to continue to evolve -- independently of the versions
of the specifications (and each other), the utility of a version indicator
is more confusing. Users would *like* a version indicator to correspond
to a category of implementation, but the only thing we can give version 
numbers to realistically are versions of specifications instead. So the
utility is limited to controlled situations where the producer of the document
with a version indicator really carefully intends to note a specification 
version, or to cause validation against a particular specification.

I'm not quite sure what PC is, to know how this analysis applies to it.

Larry
--
http://larry.masinter.net

-Original Message-
From: Marcin Hanclik [mailto:marcin.hanc...@access-company.com] 
Sent: Wednesday, December 09, 2009 1:37 PM
To: Larry Masinter; Robin Berjon
Cc: public-webapps@w3.org
Subject: RE: [public-webapps] Comment on Widget URI (7)

Hi Larry,

WOW:
It's a pity you were not involved in the discussions around PC's version 
attribute.

Thanks,
Marcin

From: public-webapps-requ...@w3.org [public-webapps-requ...@w3.org] On Behalf 
Of Larry Masinter [masin...@adobe.com]
Sent: Wednesday, December 09, 2009 7:20 PM
To: Robin Berjon
Cc: public-webapps@w3.org
Subject: RE: [public-webapps] Comment on Widget URI (7)

FWIW, just to be clear:

My comments about versioning and version numbers only apply
to the URI scheme, and not to language specifications in
general.

I haven't reviewed any of the other WebApps documents,
except in the context of reviewing the URI scheme.

In general, I support appropriate use of version numbers in
languages and language specifications, especially since
documents and file formats have ample opportunities for
in-band version indicators. It's unfortunate that URIs,
being compact strings, have no place for version indicators.

Larry
--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com]
Sent: Thursday, November 19, 2009 4:08 AM
To: Larry Masinter
Cc: public-webapps@w3.org
Subject: Re: [public-webapps] Comment on Widget URI (7)

Dear Larry,

thank you for your comments.

On Oct 10, 2009, at 19:44 , Larry Masinter wrote:
 7) ** EDITORIAL TITLE **
 "Widgets 1.0: Widget URIs": the "1.0" might imply some kind of versioning, but 
 there is no versioning of URI schemes.

 Suggestion: retitle "Widget URIs"

I have provisionally made this change. I agree with Marcos that it would be 
good to do so throughout the widget family of specifications, especially since 
there is no reason why versions of its various components need to evolve in 
synchronised fashion - one could use P+C 4.2 with WARP 2.7.

Recommendation to the WG: apply the same change throughout.

--
Robin Berjon - http://berjon.com/







Access Systems Germany GmbH
Essener Strasse 5  |  D-46047 Oberhausen
HRB 13548 Amtsgericht Duisburg
Geschaeftsfuehrer: Michel Piquemal, Tomonori Watanabe, Yusuke Kanda

www.access-company.com

CONFIDENTIALITY NOTICE
This e-mail and any attachments hereto may contain information that is 
privileged or confidential, and is intended for use only by the
individual or entity to which it is addressed. Any disclosure, copying or 
distribution of the information by anyone else is strictly prohibited.
If you have received this document in error, please notify us promptly by 
responding to this e-mail. Thank you.



RE: [public-webapps] Comment on Widget URI (2)

2009-12-09 Thread Larry Masinter
http://tools.ietf.org/html/draft-duerst-iri-bis-07#section-5

gives several different examples of normalization and
comparison of strings for the purpose of identification.

There are significant differences in alternatives for
how to do comparison of Unicode file names.

I can't figure out from the document of the
Widget: URI scheme which, if any, of the comparison
algorithms are recommended. In fact, the assertion
that using UTF-8 is recommended seems like it would
result in ambiguous interpretation of URIs if some
implementations use UTF-8 and others don't.

So, if I have a file named Voß.html and a relative
IRI that points to voss.html, do they match or not?
You say case sensitive, do you mean byte for byte?
Do half-width romaji characters match the full-width
romaji characters?

Note that different operating systems normalize
Unicode file names differently.
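
To make the ambiguity concrete, here is a small Python sketch comparing the 
same pairs of names under four different notions of "match". The function 
names are illustrative only; nothing here is taken from the widget 
specification, which is exactly the point: the answer depends on which 
comparison is chosen.

```python
import unicodedata


def byte_equal(a, b):
    """Byte-for-byte comparison of the UTF-8 encodings."""
    return a.encode("utf-8") == b.encode("utf-8")


def nfc_equal(a, b):
    """Comparison after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def caseless_equal(a, b):
    """Comparison after Unicode case folding."""
    return a.casefold() == b.casefold()


def compat_equal(a, b):
    """Comparison after compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


# "Voß.html" and "voss.html" differ byte for byte and under NFC,
# but match under case folding, because U+00DF folds to "ss".
sharp_s = ("Vo\u00df.html", "voss.html")

# Full-width Latin letters differ from ASCII under NFC,
# but match under NFKC.
full_width = ("\uff21\uff22\uff23.html", "ABC.html")

# Precomposed e-acute versus "e" plus combining acute accent:
# different bytes, equal under NFC.
accents = ("\u00e9.html", "e\u0301.html")
```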

Perhaps it's necessary to dig further into the
widget spec to ensure this is not an ambiguity, but
the question was whether the widget specification
was well-defined, and my comment was that it
didn't seem to be.

Larry
--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com] 
Sent: Thursday, November 19, 2009 6:00 AM
To: Larry Masinter
Cc: public-webapps@w3.org
Subject: Re: [public-webapps] Comment on Widget URI (2)

Dear Larry,

thank you for your comments.

On Oct 10, 2009, at 19:44 , Larry Masinter wrote:
 2) ** WELL-DEFINED MAPPING TO FILES **
 
 Section 4.4 Step 2 makes normative reference:
 
 http://www.w3.org/TR/widgets/#rule-for-finding-a-file-within-a-widget- 
 
 The algorithm there seems to be lacking a clear definition of matches
 which deals reasonably with the issues surrounding matching and equivalence
 for Unicode strings, or the handling of character sets in IRIs which are
 not represented in UTF8.
 
 Suggestion (Editorial): Move the definition of the mapping algorithm
 into the URI scheme registration document so that its definition can 
 be reviewed for completeness.
 Suggestion (Technical): Define exactly and precisely what match means
 and make it clear what the appropriate response or error conditions are
 if there is more than one file that matches.

This comment concerns P+C, and I'm unsure about what change you are requesting 
where. Could you please provide an example of an issue in the current setup and 
explain how you would like to see it addressed?

-- 
Robin Berjon - http://berjon.com/






RE: [public-webapps] Comment on Widget URI (3)

2009-12-09 Thread Larry Masinter
Your reference to 'every drive-by you should use this! argument'
is mainly irrelevant to my comment and I assume your goal was
to be insulting, alluding to 
http://en.wikipedia.org/wiki/Drive-by_shooting -- unless you have
some other explanation for your intent?

The fact that you got similar requests (that there were
multiple "drive-by" arguments which were "just a rehash
of something we've seen before") would seem as likely
to indicate that there is significant support for reuse
of other schemes as to validate your position
that you need a new one.

I reviewed the document without having read every other
review, and I think that was appropriate.

You claim "Having done due diligence", but that should 
make it trivial to supply what I asked for 
and which I cannot infer or guess: a single use case 
where the offered alternative (thismessage in my case)
is inadequate for providing the desired properties of
identifiers and their relation to resources.  
Could you please supply one use case; surely anyone
familiar with the lengthy due diligence you allude
to would have a simple example?

Your previous reply:

  http://lists.w3.org/Archives/Public/public-webapps/2009AprJun/0972.html 

contains interestingly the statement that:

# I think that this demonstrates that, technically speaking,
# reusing thismessage: *can* be done.

It does go on to discuss the costs of doing so, but the
costs are all a matter of writing technical specifications
and updating the thismessage definition to clarify the
ambiguities which you alluded to, and not technical
impediments. I had frankly taken your previous note as
indicating that you would consider thismessage:/
more carefully.

 Alternate Suggestion: Withdraw registration of widget:
 and reference existing scheme.

 That would leave us with no way of addressing widget resources.
 Having just now implemented a widget runtime, I don't see how we could have 
 interoperability without them.

If you replace the string widget with the string thismessage and remove the
possibility of an (opaque, undefined, and unneeded by any documented use cases)
authority field, the widget runtimes can proceed, and would have a way of
addressing widget resources. There are no apparent use cases where
the string widget: ever appears in any content. If this isn't the case,
it isn't clear from the definition of the URI scheme. Rather, it claims

# In general, authors of widget content use relative URI references.
and
# widget URIs identify them only on the inside of a package, irrespective
# of that package's own location.
and that
# Must not require widget developers to be aware of it for basic tasks

Since the references are relative, the scheme name shouldn't matter.
If it does matter, where?
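
A small Python sketch of that point: resolving a relative reference against 
a base yields the same path inside the package whichever scheme the base 
uses. The base URIs, the authority value, and the helper name are made up 
for illustration, and the resolution is deliberately simplified (paths only, 
not the full RFC 3986 algorithm). A hand-written helper is used because 
urllib.parse.urljoin only resolves references for schemes it knows about.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_path(base_uri, relative_ref):
    """Resolve a relative path reference against the path of base_uri.

    Simplified: ignores query and fragment, and only handles
    path-style references such as "img/logo.png" or "../a.css".
    """
    base_path = urlsplit(base_uri).path
    joined = posixpath.join(posixpath.dirname(base_path), relative_ref)
    return posixpath.normpath(joined)


# Hypothetical bases for the same document inside a package.
widget_base = "widget://c13c6f30/pages/index.html"
thismessage_base = "thismessage:/pages/index.html"

widget_result = resolve_path(widget_base, "../img/logo.png")
thismessage_result = resolve_path(thismessage_base, "../img/logo.png")
```

Both calls produce "/img/logo.png"; the scheme name plays no part in the 
result.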

I'll just take your elephant manicures comment as an attempt at
humor.

Larry
--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com] 
Sent: Thursday, November 19, 2009 5:13 AM
To: Larry Masinter
Cc: public-webapps@w3.org
Subject: Re: [public-webapps] Comment on Widget URI (3)

Dear Larry,

thank you for your comments.

On Oct 10, 2009, at 19:44 , Larry Masinter wrote:
 3) ** Reuse URI schemes **
 
 http://www.w3.org/TR/webarch/#URI-scheme includes   Good practice: Reuse URI 
 schemes
 
 A specification SHOULD reuse an existing URI scheme (rather than create a 
 new one) when it provides the desired properties of identifiers and their 
 relation to resources.

The WebApps WG is well familiar with webarch. In this instance, I would like to 
emphasise when it provides the desired properties of identifiers and their 
relation to resources. The WebApps WG has discussed this topic with luminaries 
and experts in both the TAG and the community at large, and to this date, while 
we have learnt about many obscure and sometimes poorly defined URI schemes, 
none has provided us with a solution.

We've long reached the point where every drive-by you should use this! 
argument is just a rehash of something we've seen before. Having done due 
diligence, I feel confident that we haven't found an existing URI scheme that, 
as per AWWW, provides the desired properties of identifiers and their relation 
to resources.

 The draft suggests there are many other schemes (with merit) already 
 proposed, but that these existing efforts, rather than identify packaged 
 resources from the outside, widget URIs identify them only on the inside of a 
 package, irrespective of that package's own location., but this seems to 
 indicate that the requirements for widget URIs are weaker, not stronger.

Actually that wasn't the intended meaning, but since it can be construed thusly 
(and since you made another comment indicating that it was hard to understand) 
I have removed this section (it was just meant to be informative anyway).

 Suggestion: Supply use cases where reuse of existing schemes (including 
 thismessage:/) do not provide the desired properties

RE: [public-webapps] Comment on Widget URI (7)

2009-12-09 Thread Larry Masinter
FWIW, just to be clear:

My comments about versioning and version numbers only apply
to the URI scheme, and not to language specifications in 
general.

I haven't reviewed any of the other WebApps documents,
except in the context of reviewing the URI scheme.

In general, I support appropriate use of version numbers in
languages and language specifications, especially since
documents and file formats have ample opportunities for
in-band version indicators. It's unfortunate that URIs,
being compact strings, have no place for version indicators.

Larry
--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com] 
Sent: Thursday, November 19, 2009 4:08 AM
To: Larry Masinter
Cc: public-webapps@w3.org
Subject: Re: [public-webapps] Comment on Widget URI (7)

Dear Larry,

thank you for your comments.

On Oct 10, 2009, at 19:44 , Larry Masinter wrote:
 7) ** EDITORIAL TITLE **
 "Widgets 1.0: Widget URIs": the "1.0" might imply some kind of versioning, but 
 there is no versioning of URI schemes.
 
 Suggestion: retitle "Widget URIs"

I have provisionally made this change. I agree with Marcos that it would be 
good to do so throughout the widget family of specifications, especially since 
there is no reason why versions of its various components need to evolve in 
synchronised fashion - one could use P+C 4.2 with WARP 2.7.

Recommendation to the WG: apply the same change throughout.

-- 
Robin Berjon - http://berjon.com/






RE: [public-webapps] Comment on Widget URI (1)

2009-12-07 Thread Larry Masinter
Sorry I missed the messages earlier...

If the purpose of the authority and query components is that they are
supposed to be processed by scripts in pages that use widget URIs,
then the specification should say so. Opaque fields with no semantics
and no identified purpose are not well-defined, in my opinion.

There is some reasonable risk that implementors will take what
is currently defined as opaque in the authority field and use
it for cross-widget references. Without clear definition of these
semantics, to merely leave it as out of scope introduces a
security risk.

If implementations MUST completely ignore the authority field
and MUST treat any reference as if it ONLY applied to the local
widget, then that would address the security concern.
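
For illustration, a minimal Python sketch of what "completely ignore the 
authority field" could look like in a runtime. The function name and the 
example URIs are hypothetical; this is not an algorithm from the widget URI 
draft, only one way to guarantee that every reference maps into the local 
package.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def local_package_path(uri):
    """Map a widget: URI to a path inside the current package.

    The authority component is parsed but never used, so two URIs
    that differ only in authority always name the same local file.
    """
    parts = urlsplit(uri)
    if parts.scheme != "widget":
        raise ValueError("not a widget URI: " + uri)
    if not parts.path.startswith("/"):
        raise ValueError("widget URI has no absolute path: " + uri)
    # Decode percent-escapes first, then collapse "." and ".." segments.
    # Normalizing an absolute path cannot climb above the package root.
    normalized = posixpath.normpath(unquote(parts.path))
    return normalized.lstrip("/")
```

Because parts.netloc is never consulted, an implementation built this way 
has no path by which the authority could be used for cross-widget references.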

Larry
--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com] 
Sent: Thursday, November 19, 2009 6:13 AM
To: Larry Masinter
Cc: public-webapps@w3.org
Subject: Re: [public-webapps] Comment on Widget URI (1)

Dear Larry,

thank you for your comments.

On Oct 10, 2009, at 19:44 , Larry Masinter wrote:
 1) ** WELL DEFINED QUERY AND AUTHORITY **
 http://www.w3.org/TR/webarch/#URI-scheme points to RFC 2717, which has been
 replaced by RFC 4395. I think WebArch should be updated to recommend that
 W3C recommendations must use permanent schemes and not provisional ones.

Does this apply in any way to us?

 RFC 4395 requires that permanent scheme definitions be Well-defined. 
 Leaving in syntactic components and declaring them out of scope  is leaving 
 them undefined.

The only parts the semantics of which were flagged as outside the scope were 
fragment and query - this section has been removed.

 Suggestion: Remove 'authority' from the syntax, and any sections that
  refer to them; disallow query components
 Alternate Suggestion: define the meaning of authority and query components.

Neither the authority nor the query components are undefined or out of scope. 
Authority is syntactically defined, and is clearly specified as being devoid of 
semantics (opaque). Stating that this makes the scheme not well-defined is 
untrue - it is like saying that XML Namespaces aren't well-defined because they 
are equally opaque.

The query component is equally defined as to its syntax, and its meaning is 
left to the processor (typically, a script inside an HTML page, but for other 
resources it could be different). I can't see how this differs from the http 
scheme.

-- 
Robin Berjon - http://berjon.com/






RE: [public-webapps] Comments on Widget URI (General)

2009-12-07 Thread Larry Masinter
I'll ask the TAG to review your responses at our F2F this week.
Sorry for the delay.

--
http://larry.masinter.net


-Original Message-
From: Robin Berjon [mailto:ro...@berjon.com] 
Sent: Tuesday, December 01, 2009 1:54 AM
To: Larry Masinter
Cc: public-webapps WG
Subject: Re: [public-webapps] Comments on Widget URI (General)

Hi Larry,

On Nov 19, 2009, at 15:18 , Robin Berjon wrote:
 the WebApps WG deeply thank you for your comments on the widgets URI last 
 call. We decided to split them over several emails that have been posted to 
 the list with proposed responses to them. We would be grateful if you could 
 indicate whether you are satisfied with each resolution within two weeks.

On Thursday it will be two weeks since the WG has sent out its response to your 
comments concerning widget URIs. In the spirit of not having to fall back to 
the rule that silence is assent it would be great if you could indicate whether 
you are satisfied with each proposed resolution. We naturally understand that 
you may be busy, so if time is short we can also discuss pushing the date 
somewhat.

Regards,

-- 
Robin Berjon - http://berjon.com/






RE: [Widget URI] Internationalization, widget IRI?

2009-07-26 Thread Larry Masinter
I'm sorry for the confusion, my email was sent by mistake. I
have not re-reviewed the widget URI scheme since a previous
review several months ago. I was only reacting to something
in your email. I suppose I should re-review the widget URI
scheme document itself, but I haven't. My goal at the moment
is to update the IRI document.

 Why is the Widgets 1.0: URI Scheme about URI and not IRI?

The short answer is that, in general,  one defines URI schemes
and automatically gets something that describes IRIs as well.


 widget-URI  = "widget:" "//" [ authority ] "/" zip-rel-path [ "?" query ]
               [ "#" fragment ]

 is incorrect (depending on whether you are on byte or character level),
  because zip-rel-path includes non-percent-encoded characters, thus 
 widget-URI is actually an IRI.

I wonder if the URI registration process document should specifically
allow registration forms to describe the URI scheme syntax in terms of
IRI characters.

 What then about naming the specification as Widgets 1.0: IRI Scheme
 and referring to IRIs?

Again, because formally there are no IRI schemes, there are only
URI schemes, even though there are IRIs which can be mapped into
URIs of that scheme.
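
As a sketch of that mapping, the following Python function percent-encodes 
the UTF-8 bytes of every non-ASCII character and leaves everything else 
alone. It is a simplification of the conversion described in RFC 3987 
(it does no special handling of internationalized host names), and the 
example IRI is made up.

```python
def iri_to_uri(iri):
    """Percent-encode the UTF-8 bytes of each non-ASCII character.

    ASCII characters, including existing %XX escapes and the scheme,
    pass through unchanged.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# Hypothetical widget IRI with a non-ASCII file name.
example_uri = iri_to_uri("widget://abc/Vo\u00df.html")
```

The scheme is untouched by the mapping, which is why one URI scheme 
registration covers the corresponding IRIs as well.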

Larry
--
http://larry.masinter.net




RE: [Widget URI] Internationalization, widget IRI?

2009-07-25 Thread Larry Masinter
(BCC original mailing lists, directing traffic to public-...@w3.org
for IRI issues.)
 
There are no IRI schemes. I'm not sure that the draft makes
this clear, or makes clear that although most other parts of
the IRI syntax extend URI syntax, the scheme is the same, cannot
contain any %xx encoded characters because it cannot contain %,
etc.
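
For reference, the scheme production in RFC 3986 is a letter followed by 
letters, digits, "+", "-" or ".", which a short Python check can express. 
The helper name is mine; the character set is the one RFC 3986 gives.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986, section 3.1)
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(name):
    """Return True if name matches the RFC 3986 scheme syntax."""
    return SCHEME_RE.fullmatch(name) is not None
```

Neither a percent sign nor a non-ASCII letter can appear, so there is 
nothing for an IRI to extend in this part of the syntax.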

-Original Message-
From: public-pkg-uri-scheme-requ...@w3.org On Behalf Of Marcin Hanclik
Sent: Friday, July 24, 2009 9:37 AM
To: public-webapps@w3.org
Cc: public-pkg-uri-sch...@w3.org
Subject: [Widget URI] Internationalization, widget IRI?

Hi Robin, All,

Why is the Widgets 1.0: URI Scheme about URI and not IRI?

Widgets 1.0 PC is using only the term/type IRI (URI cannot be found there), 
e.g. for id, href and name attributes.
In Widgets 1.0: URI Scheme (WUS?) document you refer in [1] to zip-rel-path.
It resembles IRI per design, since conversion of the file name field [2], that 
may be specified in UTF-8, to URI would entail percent-encoding [3].
Thus having IRI instead of URI could save processing time/power.
It seems [4] already touches upon the internationalization.

Specifically the ABNF:

widget-URI  = "widget:" "//" [ authority ] "/" zip-rel-path [ "?" query ]
              [ "#" fragment ]

is incorrect (depending on whether you are on byte or character level), because 
zip-rel-path includes non-percent-encoded characters, thus widget-URI is 
actually an IRI.

What then about naming the specification as Widgets 1.0: IRI Scheme and 
referring to IRIs?

Thanks.

Kind regards,
Marcin

[1] http://dev.w3.org/2006/waf/widgets-uri/#syntax
[2] http://dev.w3.org/2006/waf/widgets/#file-name-field0
[3] http://tools.ietf.org/html/rfc3986#section-2.1
[4] http://lists.w3.org/Archives/Public/public-pkg-uri-scheme/2009Jan/.html 
Marcin Hanclik
ACCESS Systems Germany GmbH
Tel: +49-208-8290-6452  |  Fax: +49-208-8290-6465
Mobile: +49-163-8290-646
E-Mail: marcin.hanc...@access-company.com








RE: [widgets] Widgets URI scheme... it's baaaack!

2009-05-22 Thread Larry Masinter
I didn't think widget had ever gone away.

The document you pointed at says:

 This document is not a specification as of this time, though it is likely to 
become one once consensus has been reached on its fundamental direction. In the 
meantime, this document must be considered to sit outside the Widgets 1.0 
family of specifications and to contain no normative content. 

I'm not sure what the vengeance is. It seems pretty clear that
there must be some requirement that keeps the group from being
comfortable about reusing or extending some existing URI scheme, 
such as using thismessage (RFC 2557 defines it for multipart/related,
but it could be reused), file (consider the widget as a little
mounted file system), or cid (to pick up the UUID but add a 
path component).

The requirements for new URI schemes for permanent registration 
are given in RFC 4395, and the ones I'd be concerned about here 
are:

* the scheme is well-defined. In particular, terms in the
definition are either part of common usage or are defined in the
document or in referenced document.

* there are requirements that require a new scheme rather
 than reusing an existing one.

* Security considerations are explicit.

If the widget: scheme is intended for inter-package references
then there are security issues with that. If not, then why the UUID?

Larry
--
http://larry.masinter.net



RE: [widgets] Widgets URI scheme... it's baaaack!

2009-05-22 Thread Larry Masinter
What makes a set of widgets related? Is there an attack,
based on UUID knowledge, where two unrelated widgets could somehow
appear related?

What existing infrastructure for security are you planning
to reuse? 

Often, security loopholes are introduced when reusing
security infrastructure designed for one context in 
a way that it wasn't designed for.

thismessage:/ basically didn't allow references outside
the package at all. By adding a UUID and alluding to
related packages as possibly being available, widget
might become a vector.

I'm not saying it is, I'm just saying that getting external
review for security mechanisms and assumptions is critical.

Larry
--
http://larry.masinter.net


-Original Message-
From: Arve Bersvendsen [mailto:ar...@opera.com] 
Sent: Friday, May 22, 2009 9:55 AM
To: Larry Masinter; marc...@opera.com; public-pkg-uri-scheme; public-webapps
Subject: Re: [widgets] Widgets URI scheme... it's baaaack!

On Fri, 22 May 2009 17:29:57 +0200, Larry Masinter masin...@adobe.com  
wrote:

 If the widget: scheme is intended for inter-package references
 then there are security issues with that. If not, then why the UUID?

At the time of writing, I do not see them being used for inter-package  
references (if my understanding equals yours here, as in references  
between otherwise unrelated widgets).

The UUID? Well, it actually eases implementations a bit, since an  
implementation can use the UUID as domain when requests are made, which  
actually allows vendors to reuse existing infrastructure for security  
checks and so on.
-- 
Arve Bersvendsen

Opera Software ASA, http://www.opera.com/


MIME types for packaged content (was re: tag: uri scheme)

2009-02-13 Thread Larry Masinter
I think it would be much better to allow content types to be
derived by the packager and included in the package on
a file-by-file basis. This was the finding during the
development of MHTML many years ago, and the situation
isn't different here.
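
A minimal Python sketch of what "derived by the packager and included in the 
package" could mean in practice. The manifest shape, the function name, and 
the fallback type are assumptions for illustration only; no widget or MHTML 
specification defines this format. The packager's explicit knowledge wins, 
and the extension table in Python's mimetypes module is only a fallback.

```python
import mimetypes


def build_manifest(file_names, declared=None, charsets=None):
    """Return {file name: content type} for every file in the package.

    declared: content types the packager already knows, by file name.
    charsets: charset the packager knows for text files, by file name.
    """
    declared = declared or {}
    charsets = charsets or {}
    manifest = {}
    for name in file_names:
        content_type = declared.get(name)
        if content_type is None:
            # Fall back to the extension table at packaging time,
            # so the recipient never has to guess.
            content_type, _encoding = mimetypes.guess_type(name)
        if content_type is None:
            content_type = "application/octet-stream"
        charset = charsets.get(name)
        if charset and content_type.startswith("text/"):
            content_type = "{}; charset={}".format(content_type, charset)
        manifest[name] = content_type
    return manifest


example_manifest = build_manifest(
    ["index.html", "README", "handler"],
    declared={"handler": "text/html"},
    charsets={"index.html": "utf-8", "handler": "iso-8859-1"},
)
```

Note that the file without an extension still gets a definite type, and the 
charset travels with the text types.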

There are several operating systems in wide use today
which allow files without extensions to be sniffed
on the client. This would also allow the inclusion
of charset parameters in text types, which, in general,
cannot always be easily sniffed, even if they are
well known in the context of the packager.

While I'm dubious about the arguments on MIME type
sniffing for browsers, I think it's completely
unnecessary for packaged content, because of the
explicit package step necessary to provide 
conformant content.

(I changed the subject line because the topic isn't about
the 'tag:' URI scheme.)

Larry


-Original Message-
From: Marcos Caceres [mailto:marcosscace...@gmail.com] 
Sent: Friday, February 13, 2009 5:27 AM
To: Larry Masinter
Cc: Bjoern Hoehrmann; public-pkg-uri-scheme; WebApps WG
Subject: Re: tag: uri scheme

Hi Larry,

2009/1/22 Larry Masinter l...@acm.org:

  https://issues.apache.org/bugzilla/show_bug.cgi?id=13986

 Astounding. Thanks for that pointer, hadn't seen that history.

 Still, communication of a package is different than communication
 of individual components, because there's an explicit processing
 step which is create the package. Even if there might be
 some reasons why Apache hasn't fixed their configuration files,
 is there any reason to believe that create a package software
 couldn't be configured to always use well-known file extensions
 or (if allowed) well-known content-types?

The current model that we are thinking of putting into the widget
packaging spec is:

1. match the file extension to a mime type using the extension to MIME
type tables in the packaging spec.
2. if no match is made, then attempt to sniff the mime type.
3. if no match is made, label the file 'unknown/unknown'.

What we could do is add a step 0, where authors could use an
XML-based (or text-based) format for declaring extension-to-MIME
mappings (e.g., php -> text/html) or for overriding the default
extension-to-MIME mappings.

For example,
<types xmlns="http://www.w3.org/ns/widgets">
  <type ext="php" mime="text/html"/>
  <type ext="jsp" mime="text/html"/>
  <type ext="htm" mime="application/xhtml+xml"/>
</types>

The above elements could either just be part of the configuration
document, or could be in a separate file.
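
The lookup order described above (step 0 overrides, then the extension 
table, then sniffing, then 'unknown/unknown') can be sketched in Python as 
follows. The table contents and the two sniffing rules are placeholders I 
made up; the real tables would come from the packaging spec.

```python
# Hypothetical subset of the spec's extension-to-MIME table.
DEFAULT_TABLE = {"html": "text/html", "css": "text/css", "png": "image/png"}


def sniff(data):
    """Very small stand-in for content sniffing; returns None if unsure."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "text/html"
    return None


def content_type(name, data, overrides=None):
    """Resolve a file's type: overrides, table, sniffing, then unknown."""
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    # Step 0: author-declared mappings take precedence.
    if overrides and ext in overrides:
        return overrides[ext]
    # Step 1: the extension table from the packaging spec.
    if ext in DEFAULT_TABLE:
        return DEFAULT_TABLE[ext]
    # Steps 2 and 3: sniff, then give up.
    return sniff(data) or "unknown/unknown"
```

For example, with overrides={"php": "text/html"} a file named page.php is 
labelled text/html without its content ever being inspected.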

 I'll still claim that the closer you are to the origin of the
 data, the more likely you are going to be able to guess the
 context of the data.

probably true.

-- 
Marcos Caceres
http://datadriven.com.au


RE: tag: uri scheme

2009-01-21 Thread Larry Masinter

MIME multipart made an explicit decision to require explicit
content-type rather than rely on file extensions.   Other
serializations might have some default inference mechanism,
some way to extend the inference mechanism (e.g., by file
extension, as is necessary with ftp:), explicitly define a 
content-type (e.g., in some package metadata or per-file
metadata), or limit package content to a set of well known
content types.

I think these are all elements of the serialization choice;
the main idea is to look at the data model and the requirements
and make sure there's a consistent mapping.
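
To show what the explicit choice looks like in MIME multipart, here is a 
short Python sketch using the standard email package. The part contents and 
Content-Location values are invented for the example; the point is only that 
each part carries its own Content-Type, so no file extension is consulted.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_package():
    """Build a multipart/related package with explicitly typed parts."""
    package = MIMEMultipart("related")

    # The sender states both the media type and the charset.
    page = MIMEText("<p>Hello</p>", "html", "utf-8")
    page["Content-Location"] = "index.html"
    package.attach(page)

    # A part whose name has no extension still has a definite type.
    data = MIMEApplication(b"\x00\x01", "octet-stream")
    data["Content-Location"] = "data"
    package.attach(data)
    return package


package = build_package()
part_types = [part.get_content_type() for part in package.get_payload()]
```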

I'm not sure about 'authoring might be more complicated',
though. The author/sender/creator of a package has a lot more
insight about the types of the components of the package
than the recipient, and if there's any guesswork to be
done, putting the burden on the author would seem to be more
stable and effective for the overall communication system.

Larry




-Original Message-
From: Boris Zbarsky [mailto:bzbar...@mit.edu] 
Sent: Wednesday, January 21, 2009 7:04 PM
To: Larry Masinter
Cc: Marcos Caceres; public-pkg-uri-sch...@w3.org; public-webapps@w3.org; Tim 
Kindberg
Subject: Re: tag: uri scheme

Larry Masinter wrote:
 Yes, using Zip is a different overall serialization than 
 MIME multipart, but aren't the problem spaces similar enough
 that differences from what is already widespread practice?

MIME multipart would have the side benefit of specifying MIME types.  At 
the same time, authoring might be a little more complicated...

-Boris



RE: tag: uri scheme

2009-01-21 Thread Larry Masinter

I disagree with the assertion that, for HTTP, "by and large 
the whole thing doesn't work very well".

First, it usually isn't authors who personally assign MIME types
to anything. Content is written by software applications, usually,
and software applications generally are set to at least generate
file types or file extensions where the file extension is for the
locally appropriate file type -- otherwise the software wouldn't
function for the authors when they went to reopen the content.

MIME types are generally assigned by the HTTP servers, of which
Apache and IIS are the most popular. Perhaps you might want to
argue about the number of major web servers vs. the number of
major browsers, but I think there are more browser instances
than there are server instances, and the statistics are actually
much more skewed as to sites and pages served (a small number
of sites are responsible for a large proportion of pages retrieved,
while it isn't true that a small number of browser users are
responsible for a large proportion of pages retrieved.)

This is important, because the difficulties experienced with
MIME type assignment are mainly ones of configuration, not
software capability. There were some earlier versions of Apache
that would serve unknown file extensions as text/plain instead
of application/octet-stream, but that was a configuration error.
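
As a rough sketch of why this is a configuration matter rather than a
software capability, consider a server that maps file extensions to
MIME types through a table and falls back to a configurable default.
The table, function name, and parameter below are hypothetical and do
not reproduce any particular server's configuration syntax.

```python
import posixpath

# Hypothetical extension table; real servers ship much larger ones.
KNOWN_TYPES = {
    ".html": "text/html",
    ".png": "image/png",
    ".zip": "application/zip",
}


def content_type_for(path, default_type="application/octet-stream"):
    """Return the MIME type a server would send for the given path.

    Known extensions come from the table; everything else gets
    default_type, which is purely a configuration choice.
    """
    extension = posixpath.splitext(path)[1].lower()
    return KNOWN_TYPES.get(extension, default_type)


# Same software, two configurations, different answers for an unknown extension.
print(content_type_for("/widgets/clock.wgt"))
print(content_type_for("/widgets/clock.wgt", default_type="text/plain"))
```

The lookup code is identical in both cases; only the configured
default differs, and that default decides whether a recipient is told
to treat unknown binary content as text.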

In general, it is fruitless to write standards that try to mandate
behavior for software, organizations, or configurations which have
not previously followed standards, because there is no indication
whatsoever that they would follow the new standards any more than
the old ones. And if you merely write standards to describe current
behavior, there's no guarantee that the current behavior won't
continue to drift, since the organizations involved have no more
incentive to keep to the new standards than they had to keep to the
old ones.

So the issue isn't authors, it is the software that authors use,
and there's no reason to believe that package-generating
software would do any worse generating correct MIME types than
it would generating correct ZIP files.

Larry
-- 
http://larry.masinter.net


=
Larry Masinter wrote:
 I'm not sure about 'authoring might be more complicated',
 though. The author/sender/creator of a package has a lot more
 insight about the types of the components of the package
 than the recipient, and if there's any guesswork to be
 done, putting the burden on the author would seem to be more
 stable and effective for the overall communication system.

That strongly depends on the relative numbers of authors and recipients, 
their relative cluefulness, and their relative resource availability... 
  For example, if your authors will largely tend to get their MIME types 
wrong and there are lots of them and there are only three possible 
recipients, all of whom are willing to put in the sort of work the 
authors aren't, the tradeoff might lie on just having the recipients deal.

This is not to say that the tradeoff might not fall the other way too, 
but authoring certainly _is_ more complicated if authors have to choose 
their types themselves (witness HTTP, where by and large the whole thing 
doesn't work very well).  That might be a sacrifice that's worth it in 
the interests of other things, naturally.

-Boris




RE: [widgets] Minutes from 30 October 2008 Voice Conference

2008-11-22 Thread Larry Masinter

Resolving the general topic of ZIP-based packages and URI references within 
them on the WebApps mailing list doesn't seem practical, because those who need 
to review the package/URI issue are likely not interested in wading through the 
mass of email on unrelated topics within the WebApps WG.

I don't understand why setting up a separate mailing list/archive/issue list on 
the specific topic is a lengthy process; it mainly requires the will to take 
the need for coherence seriously.

If resolving this in a timely fashion is important to you (as you seem to 
indicate by invoking 'time scope') then perhaps you might want to respond more 
quickly.

Larry


-----Original Message-----
From: Marcos Caceres [mailto:[EMAIL PROTECTED]
Sent: Friday, November 21, 2008 3:29 AM
To: Larry Masinter
Cc: Arthur Barstow; Jon Ferraiolo; Richard Cohn; Bill McCoy; [EMAIL PROTECTED]; 
Michael Stahl; [EMAIL PROTECTED]; Svante Schubert; [EMAIL PROTECTED]; Philippe 
Le Hegaret; Carl Cargill; Stephen Zilles; [EMAIL PROTECTED]; public-webapps
Subject: Re: [widgets] Minutes from 30 October 2008 Voice Conference

Hi Larry,
On Fri, Oct 31, 2008 at 3:01 PM, Larry Masinter [EMAIL PROTECTED] wrote:
 Hi all,
 I think there is considerable interest in a broad community in the topic of
 ZIP based packages, specifically MIME types for them and intra-package URI
 references within them, and possibly for standardizing metadata as well.

 Procedurally, I don't think it is appropriate to attempt to resolve these
 issues in the WebAPP working group, if only because a number of the affected
 groups have little additional overlap with WebAPPS. I know the W3C TAG has
 discussed the URI issues at some point.  I'm not sure if the overhead of
 starting a new W3C working group focused specifically on this topic is too
 high, but if so, an IETF activity with W3C participation might be a way of
 getting broader participation, as well as getting additional IETF
 involvement in the MIME/URI issues.


Although I agree that starting an independent group might be a good
idea, I fear that the administrative overhead of getting everything
set up is beyond the time scope for the Widgets Work (which we want to
get to LC by end of this year). To keep the work moving forward, I
propose that interested parties continue to work with WebApps, through
our public mailing list, on the problem. We could continue to push for
an independent group and then migrate whatever gets done in WebApps to
a new group or spec.

WebApps will continue to work on the problem regardless. So I again
encourage people to work with us on the problem and put forward ideas
about how we could solve this.

Kind regards,
Marcos

--
Marcos Caceres
http://datadriven.com.au