Hi Marcos,

Many thanks for all the fixes!
I have also seen the further changes in the draft.

One more comment on this passage:

"Authors need to keep path lengths below 250 bytes. Unicode code points can 
require more than one byte to encode a character, which can result in a path 
whose length is less than 250 characters."

The second sentence is actually not needed in this form.
I would drop it or change it as follows:

"Authors need to keep path lengths below 250 bytes. Unicode code points may 
require more than one byte to encode a character, which can result in a path 
whose length is less than 250 characters being represented in more than 250 
bytes."
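
To make the character/byte distinction concrete, here is a minimal Python sketch, assuming the path is encoded as UTF-8. The sample path and the helper name utf8_length are made up for illustration and do not come from the draft.

```python
def utf8_length(path: str) -> int:
    """Return the number of bytes needed to encode path as UTF-8."""
    return len(path.encode("utf-8"))


# Hypothetical path: 200 characters, each of which needs 2 bytes in UTF-8
# (U+00E9, LATIN SMALL LETTER E WITH ACUTE).
path = "\u00e9" * 200

print(len(path))          # number of characters
print(utf8_length(path))  # number of bytes after UTF-8 encoding
```

The path above stays under 250 characters but needs more than 250 bytes, which is the case the reworded sentence is meant to warn about.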

Thanks.

Kind regards,
Marcin

Marcin Hanclik
ACCESS Systems Germany GmbH
Tel: +49-208-8290-6452  |  Fax: +49-208-8290-6465
Mobile: +49-163-8290-646
E-Mail: [email protected]

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Marcos Caceres
Sent: Monday, June 01, 2009 9:20 PM
To: Marcin Hanclik
Cc: [email protected]
Subject: Re: [widgets] P&C Last Call comments, zip-rel-path ABNF

On Mon, Jun 1, 2009 at 12:44 AM, Marcin Hanclik
<[email protected]> wrote:
> Error in ABNF:
> localized-folder vs. locale-folder

Fixed.

> Error with ABNF
> utf8-chars       = safe-chars / U+0080 and beyond
> "and beyond" does not fit here

Right. What should be there is:

utf-8-chars      = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
                  / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
                  / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
                  / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
                  / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
                  / %xD0000-DFFFD / %xE1000-EFFFD

> Section 2. of RFC2279 shows that all UTF-8 characters above U+0080 are 
> encoded with byte values over 0x80.
> So utf-8 production equals to cp437 production on the byte level within the 
> context that is important for us.
>

Correct, I think.
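
The byte-level claim can be spot-checked with a short Python sketch. It only samples a handful of code points (chosen here for illustration, not taken from the draft), so it is a sanity check rather than a proof; the helper name utf8_bytes_all_high is hypothetical.

```python
def utf8_bytes_all_high(code_point: int) -> bool:
    """True if every byte of the UTF-8 encoding lies in the range 0x80-0xFF."""
    return all(b >= 0x80 for b in chr(code_point).encode("utf-8"))


# Sample code points at and above U+0080. Surrogates (U+D800-U+DFFF) are
# left out because they cannot be encoded as UTF-8.
samples = [0x80, 0xA0, 0x7FF, 0x800, 0xD7FF, 0xF900, 0xFFEF, 0x10000, 0xEFFFD]
print(all(utf8_bytes_all_high(cp) for cp in samples))

# ASCII characters encode to a single byte below 0x80.
print(utf8_bytes_all_high(ord("a")))
```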

> So both productions can be equalized and removed, since allowed-char may be 
> used.

Right.

> I think the problem is similar to this one about encoding (I just had a brief 
> look on it):
> http://lists.w3.org/Archives/Public/public-html/2009May/0643.html

Yes, they are just byte ranges.

> Error with ABNF
> cp437-chars      = safe-chars / x80-FF
> should be according to RFC2234:
> cp437-chars      = safe-chars / %x80-FF

Right, but safe chars does not cover the whole CP437 range.

> Due to many issues I would rewrite the whole ABNF as follows.
> ABNF issues, additionally to the above, are:
> 1. plural form used for just "one-of" value

Where?

> 2. the zip-rel-path may have problems with existence, since all productions 
> are optional. The below format seems equal and is
> shorter

See below.

> 3. the production of file-name is wrongly specified, since there 
> file-extension could appear up to 254 times in a file name
>

Yes, that is wrong.

> 4. I am not sure whether the file extension could be more than 3 chars or not 
> in the existing ABNF?

Yes, it's one or more characters. It's not restricted to 3, and I'm not
sure why you are saying we should restrict it to 3.

> If so, the actual file name shall match 2 rules simultaneously, e.g.:
> file-name1       = 1*allowed-char [ "." 1*allowed-char ]
> file-name2       = 1*254 ( allowed-char )
> Matching of those 2 rules is not expressible in ABNF, so prose would be 
> needed.
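
As a rough illustration of "matching two rules simultaneously", the following Python sketch checks a byte string against both quoted productions. It assumes the byte-level allowed-char definition proposed further down (safe-char / %x80-FF); the names ALLOWED, FILE_NAME1, FILE_NAME2 and matches_both are hypothetical.

```python
import re

# allowed-char on the byte level: safe-char / %x80-FF (assumption, taken
# from the proposed ABNF quoted later in this message).
ALLOWED = rb"[A-Za-z0-9 $%'\-_@~()&+,.=\[\]\x80-\xff]"

# file-name1 = 1*allowed-char [ "." 1*allowed-char ]
FILE_NAME1 = re.compile(rb"(?:%s)+(?:\.(?:%s)+)?" % (ALLOWED, ALLOWED))

# file-name2 = 1*254 ( allowed-char )
FILE_NAME2 = re.compile(rb"(?:%s){1,254}" % ALLOWED)


def matches_both(name: bytes) -> bool:
    """True only if name matches file-name1 and file-name2 at the same time."""
    return (FILE_NAME1.fullmatch(name) is not None
            and FILE_NAME2.fullmatch(name) is not None)


print(matches_both(b"index.html"))
print(matches_both(b"a" * 255))
```

A single ABNF production cannot express this conjunction, which is why the check is written as two separate matches combined with a logical "and".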
>
> New ABNF (problem of file extension length as above still remains):
> **************
> A valid Zip relative path is one that case-insensitively matches the 
> production of Zip-rel-path in the following [ABNF] that
> operates on bytes, not on characters, i.e. after any encoding (CP437 or 
> UTF-8) has been applied:
>
> zip-rel-path     = [ locale-folder ] [ *folder-name ] [ file-name ]

Everything in the above is optional too... so it's the same problem...

> locale-folder    = "locales" "/" Language-Tag "/"
> folder-name      = file-name "/"
> file-name        = base-name [ file-extension ]
> file-extension   = "." 1*3 ( allowed-char )
>
> base-name        = 1*250( allowed-char )
> allowed-char     = safe-char / %x80-FF
> safe-char        = ALPHA / DIGIT / SP / "$" / "%"
>                    / "'" / "-" / "_" / "@"
>                    / "~" / "(" / ")" / "&" / "+"
>                    / "," / "." / "=" / "[" / "]"
> **************
>

Here is another crack at it, taking the bugs you found into
consideration. I also dropped the length restriction:

zip-rel-path   = [ *folder-name ] file-name /
                 [ locale-folder ] 1*folder-name /
                 locale-folder [ *folder-name ] file-name
locale-folder  = "locales" "/" Language-Tag "/"
folder-name    = file-name "/"
file-name      = base-name [ file-extension ]
base-name      = 1*allowed-char
file-extension = "." 1*allowed-char
allowed-char   = safe-char / utf8-char
safe-char      = ALPHA / DIGIT / SP / "$" / "%"
                    / "'" / "-" / "_" / "@"
                    / "~" / "(" / ")" / "&" / "+"
                    / "," / "." / "=" / "[" / "]"
utf8-char      =  %x80-D7FF     / %xF900-FDCF   / %xFDF0-FFEF
                / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
                / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
                / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
                / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
                / %xD0000-DFFFD / %xE1000-EFFFD
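
For readers who want to experiment with the grammar above, here is a minimal Python sketch that transcribes it into a regular expression. Several things are assumptions and not part of the draft: the utf8-char ranges are treated as Unicode code points (so the input is a decoded string, not bytes), Language-Tag is replaced by a deliberately simplified pattern, matching is case-insensitive, and the function name is_valid_zip_rel_path is hypothetical.

```python
import re

# safe-char, written as the inside of a regex character class.
SAFE = r"A-Za-z0-9 $%'\-_@~()&+,.=\[\]"

# utf8-char ranges from the ABNF, interpreted as code points (assumption).
ranges = [(0x80, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
ranges += [(p << 16, (p << 16) | 0xFFFD) for p in range(0x1, 0xE)]
ranges.append((0xE1000, 0xEFFFD))
UTF8 = "".join(f"{chr(lo)}-{chr(hi)}" for lo, hi in ranges)

ALLOWED = f"[{SAFE}{UTF8}]"
FILE_NAME = f"{ALLOWED}+(?:\\.{ALLOWED}+)?"   # base-name [ file-extension ]
FOLDER = f"(?:{FILE_NAME}/)"                  # file-name "/"

# Simplified stand-in for Language-Tag (assumption, not the full grammar).
LANG_TAG = r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*"
LOCALE = f"(?:locales/{LANG_TAG}/)"

# The three alternatives of zip-rel-path, in the order given above.
ZIP_REL_PATH = re.compile(
    f"(?:{FOLDER}*{FILE_NAME}"
    f"|{LOCALE}?{FOLDER}+"
    f"|{LOCALE}{FOLDER}*{FILE_NAME})",
    re.IGNORECASE,
)


def is_valid_zip_rel_path(path: str) -> bool:
    """True if the whole path matches the zip-rel-path sketch."""
    return ZIP_REL_PATH.fullmatch(path) is not None


for candidate in ["index.html", "locales/en-us/index.html", "images/", "", "/index.html"]:
    print(repr(candidate), is_valid_zip_rel_path(candidate))
```

Because "." is itself a safe-char, the base-name and file-extension split is ambiguous in the grammar; the regex accepts the same strings either way, so the sketch does not try to resolve it.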


> Authors need to keep path lengths below 250 bytes. Unicode code points can 
> require more than one byte to encode, which can result in
> a path whose length is less than 250 characters.
> should be
> Authors need to keep path lengths below 250 bytes. Unicode code points may 
> require more than one byte to encode a character, which
> can result in a path whose length is less than 250 characters to be 
> represented in more than 250 bytes.

Fixed.

> UTF8-chars
> should be
> utf8-chars or utf8-char or something new (after the ABNF is updated) .

Fixed.



--
Marcos Caceres
http://datadriven.com.au
