There are many more valid characters for filenames within URLs than this,
though, which I think makes the whole point moot.
At the very least I know you need (I have used all of the following
fairly recently):
[a-zA-Z0-9_%.?&=-]
at which point you might as well simply define it as the characters that
are not allowed:
[^'")]
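To make the whitelist-versus-blacklist point concrete, here is a minimal sketch in Python's re module rather than .NET (the sample filenames are made up for illustration). It also shows a pitfall with whitelists of this kind: a hyphen between two characters inside brackets is read as a range, so it is safest to put it last.

```python
import re

# Whitelist: every allowed character must be listed; the hyphen goes last
# so that it is a literal rather than a range operator.
WHITELIST = re.compile(r"[a-zA-Z0-9_%.?&=-]+")

# Blacklist: anything except the characters that would end the url(...) token.
BLACKLIST = re.compile(r"""[^'")]+""")


def hyphen_in_middle_compiles():
    """Return True if a class containing the reversed range '_-%' compiles."""
    try:
        re.compile(r"[_-%]")
        return True
    except re.error:
        return False


# Hypothetical sample names; '/' and '~' are not in the whitelist.
samples = ["shadow-c.png", "img.php?id=3&x=1", "../images/a~b.png"]
for name in samples:
    print(name, bool(WHITELIST.fullmatch(name)), bool(BLACKLIST.fullmatch(name)))
print("reversed range compiles:", hyphen_in_middle_compiles())
```

The whitelist has to grow every time a new character turns up, while the blacklist only names the characters that terminate the token.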
As I understand it, though, even those three characters (', " and )) are
allowed if you follow the spec properly:
URI       url\({w}{string}{w}\)
          |url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}\)
nonascii  [^\0-\177]
escape    {unicode}|\\[^\n\r\f0-9a-f]
unicode   \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?
string    {string1}|{string2}
string1   \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2   \'([^\n\r\f\\']|\\{nl}|{escape})*\'
nl        \n|\r\n|\r|\f
w         [ \t\r\n\f]*

http://www.w3.org/TR/CSS21/syndata.html#tokenization

which means that the true regex should be:
(?:url\([ \t\r\n\f]*(?:(?:"(?<Url>(?:[^\n\r\f\\"]|\\(?:\n|\r\n|\r|\f)|(?:\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?))*)")|(?:'(?<Url>(?:[^\n\r\f\\']|\\(?:\n|\r\n|\r|\f)|(?:\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?))*)'))[ \t\r\n\f]*\))|(?:url\([ \t\r\n\f]*(?<Url>(?:[!#$%&*-~]|[^\0-\177]|(?:(?:\\[0-9a-f]{1,6}(?:\r\n|[ \n\r\t\f])?)|\\[^\n\r\f0-9a-f]))*)[ \t\r\n\f]*\))
However, I know that this is in fact wrong, because it fails to match
simple unquoted urls containing plain-text filenames. Either the spec has
a bug here or I am translating it to a regex incorrectly. I suspect the
spec is wrong: it glosses over the whole parsing section much more than
it should, and it also conflicts with the url section of the spec
(http://www.w3.org/TR/CSS21/syndata.html#uri).
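One way to reduce the risk of a transcription slip in a hand-expanded pattern is to let a small script substitute the grammar macros. The sketch below does that in Python rather than .NET, so the dialect differs slightly. The helper name expand and the group names quoted and bare are assumptions of this example, not part of the spec. Python does not allow one group name to be reused across alternatives, hence two names instead of a single Url group.

```python
import re

# Macro definitions transcribed from the CSS 2.1 tokenizer grammar.
MACROS = {
    "nonascii": r"[^\0-\177]",
    "unicode": r"\\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?",
    "escape": r"{unicode}|\\[^\n\r\f0-9a-f]",
    "nl": r"\n|\r\n|\r|\f",
    "string1": r'"([^\n\r\f\\"]|\\{nl}|{escape})*"',
    "string2": r"'([^\n\r\f\\']|\\{nl}|{escape})*'",
    "string": r"{string1}|{string2}",
    "w": r"[ \t\r\n\f]*",
}

# The URI token, with one named group per branch (names are hypothetical).
URI = (r"url\({w}(?P<quoted>{string}){w}\)"
       r"|url\({w}(?P<bare>([!#$%&*-~]|{nonascii}|{escape})*){w}\)")


def expand(pattern):
    """Replace each {name} with its definition in a non-capturing group.

    Quantifiers such as {1,6} are left alone because they do not look
    like a macro name.
    """
    return re.sub(r"\{([a-z][a-z0-9]*)\}",
                  lambda m: "(?:" + expand(MACROS[m.group(1)]) + ")",
                  pattern)


# The CSS tokenizer is case-insensitive, hence IGNORECASE.
URI_RE = re.compile(expand(URI), re.IGNORECASE)

for token in ["url(shadow-c.png)", 'url( "a b.png" )', "url('x.gif\")"]:
    m = URI_RE.fullmatch(token)
    print(token, m.groupdict() if m else None)
```

With the pattern generated this way, each branch can be tried on its own against individual test declarations, which helps separate a translation mistake from a problem in the grammar itself.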
Tim Barcz wrote:
> Sorry...need to put in context
>
> [a-z|A-Z|/|\.|-]
> (and I find I mistyped it) in the above character class [a-z|A-Z] =>
> [a-zA-Z]
>
> Sorry,
>
> Tim
>
> On Fri, Dec 18, 2009 at 5:56 AM, Ken Egozi wrote:
>
> [a-z][A-Z] => [a-zA-Z]
> Are you sure?
> The first will match two letters: the first lowercase, the second capital,
> while the second will match a single letter, either lowercase or capital.
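The difference is easy to check with a few strings. A minimal sketch in Python (the variable names are only for illustration):

```python
import re

two_letters = re.compile(r"[a-z][A-Z]")  # a lowercase letter followed by a capital
one_letter = re.compile(r"[a-zA-Z]")     # a single letter of either case

for text in ["aB", "a", "B", "ab"]:
    print(text, bool(two_letters.fullmatch(text)), bool(one_letter.fullmatch(text)))
```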
>
>
>
> On Fri, Dec 18, 2009 at 8:40 AM, Tim Barcz wrote:
>
> What about data scheme?
>
>
> On Thu, Dec 17, 2009 at 11:52 PM, Bill Barry wrote:
>
> I would use one of the following (adjusting the escaping for
> use in .NET strings, of course):
>
>
> ^.*?url\(\s*(?<quote>["']?)(?<Url>(?!https?:|/)[^"')]+?)\k<quote>\s*\).*?$
>
> options: ignorecase, multiline
>
> or
>
> url\(\s*(?<quote>["']?)(?<Url>(?!https?:|/)[^"')]+?)\k<quote>\s*\)
>
> options: ignorecase
> depending on whether you want the whole line or not; the
> second is probably better because you technically can have
> more than one url on a line:
> UL { background-image: url(shadow-c.png);
> list-style-image: url(bullet.png); } /* perfectly valid
> css rule, under the first case you would capture only one
> of the urls, but the second you could see both */
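The difference between the two patterns can be seen on that rule when it is written on a single line. This is a sketch in Python rather than .NET: Python spells named groups (?P<name>...) and backreferences (?P=name), so the patterns are translated accordingly, and the helper urls is hypothetical.

```python
import re

# Shared core of both patterns, translated to Python's named-group syntax.
CORE = r"""url\(\s*(?P<quote>["']?)(?P<Url>(?!https?:|/)[^"')]+?)(?P=quote)\s*\)"""

# First pattern: anchored to the whole line (ignorecase, multiline).
WHOLE_LINE = re.compile(r"^.*?" + CORE + r".*?$", re.IGNORECASE | re.MULTILINE)
# Second pattern: just the url(...) token (ignorecase).
URL_ONLY = re.compile(CORE, re.IGNORECASE)

css = "UL { background-image: url(shadow-c.png); list-style-image: url(bullet.png); }"


def urls(pattern, text):
    """Collect the Url group of every match of pattern in text."""
    return [m.group("Url") for m in pattern.finditer(text)]


print(urls(WHOLE_LINE, css))
print(urls(URL_ONLY, css))
```

The line-anchored pattern consumes the whole line in one match, so the second url on the line never gets a match of its own.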
>
> These regexes may still miss some valid urls and capture
> some invalid ones, because I am pretty sure there are some
> sophisticated escaping rules in play for such urls, which
> I am outright ignoring.
>
> Test cases (including ones that fail the previously posted regexes):
> background-image: url(images/default/shadow-c.png); /*valid*/
> background-image: url(shadow-c.png); /*valid*/
> background-image: url(../images/default/shadow-c.png);
> /*valid*/
> background-image: url('../images/icons/file-xslx.gif')
> !important; /*valid*/
> background-image: url("../images/icons/file-xslx.gif")
> !important; /*valid*/
> background-image: url('http-header.gif') !important; /*valid*/
> background-image: url('http_header.gif') !important; /*valid*/
> background-image: url( 'http_header.gif') !important;
> /*valid*/
> background-image: url( 'http_header.gif' ) !important;
> /*valid*/
> background-image: url ( 'http_header.gif' ) !important;
> /*space between url and ( not valid [at least according to
> firefox 3.5]*/
> background-image: url('../images/icons/file-xslx.gif")
> !important; /*non-matching quotes*/
> background-image: url(../images/icons/file-xslx.gif")
> !important; /*missing start quote, might still be valid
> depending on char escaping rules to look for the file
> 'file-xslx.gif"'*/
> background-image: url("../images/icons/file-xslx.gif)
> !important; /*missing end quote*/
> background-image: url(/images/icons/file-xslx.gif)
> !important; /*absolute url*/
> background-image: url('/images/icons/file-xslx.gif')
> !important; /*absolute url*/
> background-image: url("/images/icons/file-xslx.gif")
> !important; /*absolute url*/
> background-image:
> url('http://example.com/images/icons/file-xslx.gif')
> !important; /*absolute url*/
> background-image:
> url(http://example.com/images/icons/file-xslx.gif)
> !important; /*absolute url*/
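A representative subset of these test cases can be checked mechanically. The sketch below uses Python rather than .NET, so the second pattern is translated to (?P<name>...) and (?P=name). The expected values encode the annotations above (None means the declaration should not produce a match), and the helper capture is hypothetical.

```python
import re

URL_ONLY = re.compile(
    r"""url\(\s*(?P<quote>["']?)(?P<Url>(?!https?:|/)[^"')]+?)(?P=quote)\s*\)""",
    re.IGNORECASE,
)

# (declaration, expected Url capture or None when no match is wanted)
CASES = [
    ("background-image: url(images/default/shadow-c.png);", "images/default/shadow-c.png"),
    ("background-image: url(../images/default/shadow-c.png);", "../images/default/shadow-c.png"),
    ("background-image: url('http-header.gif') !important;", "http-header.gif"),
    ("background-image: url( 'http_header.gif' ) !important;", "http_header.gif"),
    ('background-image: url("../images/icons/file-xslx.gif") !important;',
     "../images/icons/file-xslx.gif"),
    ("background-image: url ( 'http_header.gif' ) !important;", None),    # space before (
    ("background-image: url('../images/icons/file-xslx.gif\") !important;", None),  # mixed quotes
    ('background-image: url("../images/icons/file-xslx.gif) !important;', None),    # no end quote
    ("background-image: url(/images/icons/file-xslx.gif) !important;", None),       # absolute
    ("background-image: url('http://example.com/images/icons/file-xslx.gif');", None),
]


def capture(declaration):
    """Return the captured Url for a declaration, or None if nothing matches."""
    m = URL_ONLY.search(declaration)
    return m.group("Url") if m else None


failures = [(css, capture(css), want) for css, want in CASES if capture(css) != want]
print(failures or "all cases behave as annotated")
```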
>
>
> Additional testcases I didn't bother with:
>
> background-image: url( \(.gif ) !important; /*valid,
> filename = (.gif */
> background-image: url( \).gif ) !important; /*valid,
> filename = ).gif */
> background-image: url( \'.gif ) !important; /*valid,
> filename = '.gif */
> background-image: url( \".gif ) !important; /*valid,
> filename = ".gif */
> background-image: url( \ .gif ) !important; /*valid,
> filename = " .gif" */
> background-image: url( \
> .gif ) !important; /*valid (newline is part of the
> filename) only when served with unix line endings
> (filename would be invalid in windows line endings because
> not both \r and \n are escaped here)*/
> background-image: url(
> a.gif ) !important; /*valid (newline is not part of the
> filename)*/
> background-image: url( '\(.gif' ) !important; /*valid,
> filename = \(.gif */
> background-image: url( '\).gif' ) !important; /*valid,
> filename = \).gif */
> background-image: url( '\'.gif' ) !important; /*valid,
> filename = \'.gif */
> background-image: url( '\".gif' ) !important; /*valid,
> filename = \".gif */
> background-image: url( '\ .gif' ) !important; /*valid,
> filename = "\ .gif"*/
> background-image: url( '\
> .gif' ) !important; /*valid (newline and \ are part of the
> filename, filename would be \\\n.gif if served with unix
> line endings, \\\r\n.gif with windows line endings)*/
>
>
> These are valid css rules, assuming that the filename is a
> valid URI according to http://www.ietf.org/rfc/rfc3986
> after css has taken care of the escape chars. Developing a
> correct regex for rfc 3986 is a job suited only for a
> regex engine like that of Perl 6 (it is a
> non-deterministic context-sensitive grammar which makes it
> unsuited for any regex language that has comparable
> capabilities to Perl 5).
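The step of letting css take care of the escape chars can be sketched for the unquoted form as follows, based on the escape and unicode rules of the CSS 2.1 grammar. This is Python rather than .NET, the function name css_unescape is hypothetical, and mapping a zero or out-of-range code point to U+FFFD is an assumption of this sketch.

```python
import re

# A backslash followed by 1-6 hex digits (plus one optional whitespace
# character, with CR/LF counting as one), or by any single character that
# is not a hex digit, newline, carriage return or form feed.
_ESCAPE = re.compile(
    r"\\(?:([0-9a-fA-F]{1,6})(?:\r\n|[ \n\r\t\f])?|([^\n\r\f0-9a-fA-F]))"
)


def css_unescape(text):
    """Decode CSS backslash escapes in the body of an unquoted url()."""
    def repl(m):
        if m.group(1) is not None:
            code = int(m.group(1), 16)
            # Assumption: replace unusable code points with U+FFFD.
            if code == 0 or code > 0x10FFFF:
                return "\ufffd"
            return chr(code)
        return m.group(2)
    return _ESCAPE.sub(repl, text)


for raw in [r"\(.gif", r"\).gif", r"\'.gif", r"\ .gif", r"\26 B.gif"]:
    print(repr(raw), "->", repr(css_unescape(raw)))
```

The decoded name would then still need to be checked as a URI, which is a separate problem from the css escaping.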
>
>
> James Curran wrote:
>> Oops... Sorry, insufficient test cases...
>>
>> These should match as well.
>>
>> background-image: url(images/default/shadow-c.png);
>> background-image: url(shadow-c.png);
>>
>> Also, the url itself needs to be placed into a named capture
>> called "Url".
>>
>> Also, as a style note, in your patterns, you've written
>> "[a-z|A-Z|/|\.|-]". Inside the brackets of a character class, the
>> "or" is implied.
>> That should be [a-zA-Z/\.-]. As you wrote it, it would also match a
>> literal vertical pipe character, which would be wrong.
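A quick check of the pipe behaviour, sketched in Python (character classes work the same way here as in .NET for this purpose; the variable names are only for illustration):

```python
import re

with_pipes = re.compile(r"[a-z|A-Z|/|\.|-]")  # the class as originally written
without_pipes = re.compile(r"[a-zA-Z/.-]")    # the corrected class

for ch in ["a", "Z", "/", ".", "-", "|"]:
    print(ch, bool(with_pipes.fullmatch(ch)), bool(without_pipes.fullmatch(ch)))
```

The two classes agree on everything except the pipe itself.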
>>
>> On Thu, Dec 17, 2009 at 2:45 PM, Leonardo Lima wrote:
>>
>>> Hi,
>>>
>>> I tested this in RegexBuddy and it worked only for the first 3
>>> entries. Sorry,
>>> I think that I don't understand your question...
>>>
>>
>
> --
>
> You received this message because you are subscribed to
> the Google Groups "Castle Project Development List" group.
>
> To post to this group, send email to
> [email protected]
> <mailto:[email protected]>.
> To unsubscribe from this group, send email to
> [email protected]
> <mailto:castle-project-devel%[email protected]>.
> For more options, visit this group at
> http://groups.google.com/group/castle-project-devel?hl=en.
>
>
>
>
> --
> Tim Barcz
> Microsoft C# MVP
> Microsoft ASPInsider
> http://timbarcz.devlicio.us
> http://www.twitter.com/timbarcz
>
> --
> Ken Egozi.
> http://www.kenegozi.com/blog
> http://www.delver.com
> http://www.musicglue.com
> http://www.castleproject.org
> http://www.idcc.co.il - The first community conference for .NET
> developers - come one and all
>