Matthew Woehlke wrote:
Actually, it might even make sense to implement \b as only matching
start/end and '[&?/.]'. That way matching path components (well, unless
the paths contain '.') is also "safe".
For those watching at home, we decided on IRC that this is probably a bad
idea. We will probably end up with a regex engine that already has \b
(maybe ERE from grep), and it would be confusing to have it mean
something different in the context of wget.
['d..q' versus 'd+p+q']
Micah voted for 'd-q'. I'm okay with this (still slightly partial to
'd..q' ;-), but 'd-q' solves the complaints I have with '+').
How, then, did you plan for 'fields' to be matched?
[also, do we allow 'd,q'?]
I voted for parsing fields as a match against a list of strings (as
opposed to modifying the regex, probably by 's/.*/(.*\&)?&(\&.*)?/'), as
the former seems safer, but this is an implementation detail. As such,
the former would allow 'd,q', but this seems sufficiently esoteric that
we don't feel a need to implement it unless someone has a plausible
use-case and convinces Micah that an equivalent regex is too hard to write.
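
To illustrate why the list-of-strings approach seems safer, here is a
rough Python sketch of both options. Nothing here is wget code: the
function names are made up, Python's 're' stands in for whatever engine
wget ends up with, the fields are assumed to be the '&'-separated parts
of the query string, and implicit anchors are assumed (hence fullmatch).

```python
import re

def match_fields(pattern, query):
    """List-of-strings approach: test each '&'-separated field on its own.

    Hypothetical helper; assumes implicit anchors, so a field must match
    the pattern in full.
    """
    rx = re.compile(pattern)
    return any(rx.fullmatch(field) for field in query.split("&"))

def match_rewritten(pattern, query):
    """Regex-rewrite approach, roughly 's/.*/(.*\\&)?&(\\&.*)?/'.

    The user's pattern is wrapped so that it may sit between other
    fields. The (?:...) group is an addition of this sketch.
    """
    wrapped = r"(.*&)?(?:" + pattern + r")(&.*)?"
    return re.fullmatch(wrapped, query) is not None

query = "id=1&name=x"
print(match_fields(r"name=.*", query))     # a single whole field matches
print(match_fields(r"id=.*x", query))      # no single field matches
print(match_rewritten(r"id=.*x", query))   # '.*' runs across the '&'
```

With the rewrite, a '.*' in the user's pattern can match across a field
separator, so 'id=.*x' matches the whole query even though no individual
field does. Splitting first keeps each field isolated.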
[implicit anchors?]
Left to Micah's discretion as far as I am concerned; the leaning is
toward 'yes'. Other opinions?
Actually, this is interesting w.r.t. the first point... I don't think I
would consider '--match foo' and '--no-match (?!foo)' the same. Rather,
one is an accept rule (which happens to accept anything that doesn't
match 'foo'), and one is a reject rule. This is actually useful since it
lets you accept anything that «matches [list] AND matches [expr]».
Micah says:
<quote>
"--match foo" accepts everything that has foo in it, and isn't in the
reject lists, but if it doesn't have foo, it can still be accepted by
some other --match rule.
--no-match '(?!foo)' instantly rejects anything that doesn't contain
foo, and can't be overruled.
</quote>
(Which is to say we agree on this point.)
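
A small Python sketch of that accept/reject logic, as I understand it.
This is not wget code: 'accepted' is a made-up name, Python's 're' is a
stand-in engine, patterns are searched unanchored, and accepting
everything when no --match rule is given is an assumption. Because the
search is unanchored here, "does not contain foo" is spelled
'^(?!.*foo)' rather than the shorthand '(?!foo)' used above.

```python
import re

def accepted(url, match_rules, no_match_rules):
    """Decide whether a URL is accepted (hypothetical helper).

    A hit on any reject (--no-match) rule is final. Otherwise the URL
    needs to hit at least one accept (--match) rule.
    """
    if any(re.search(p, url) for p in no_match_rules):
        return False
    if not match_rules:
        return True  # assumption: no accept rules means accept everything
    return any(re.search(p, url) for p in match_rules)

url = "http://example.com/bar"

# '--match foo --match bar': no 'foo', but 'bar' still accepts the URL.
print(accepted(url, ["foo", "bar"], []))

# Adding '--no-match' for "does not contain foo": rejected, and no
# accept rule can overrule that.
print(accepted(url, ["foo", "bar"], [r"^(?!.*foo)"]))
```

The first call accepts the URL and the second rejects it, which is the
difference between the two rule types described in the quote.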
/Probably/ we will have a flag to invert a match (equivalent of
'(?!expr)') but Micah is "not totally committed" and "[m]ight add it
after the first iteration". This also depends partly on whether
optional PCRE support is available.
As for '(?!)', this is not valid in ERE, so it would require libpcre.
We could maybe use PCRE if available and only have ERE 'built in'
(might need a flag to specify that an expression is PCRE). Consensus on
this was not reached, so this is still an open question.
--
Matthew
Please do not quote my e-mail address unobfuscated in message bodies.
--
I picked up a Magic 8-Ball the other day and it said 'Outlook not so
good.' I said 'Sure, but Microsoft still ships it.'
-- Anonymous (from cluefire.net)