It appears we have a systematic thinko affecting the filter...
We need to rename HTMLEncoder to URLEncoder, and add a real HTMLCoder
class (with encode() and decode()). And obviously change all uses
appropriately, including adding HTMLEncoder and HTMLDecoder when dealing
with URIs (any strings or just URIs?) in the filter. The first thing to
do is change HTMLEncoder to URLEncoder, put in prototypes for HTMLCoder,
and change all the uses accordingly. I'll get on with this this evening;
if you have an HTMLCoder class or something similar that is GPL and can
be slotted in, that would be very nice, otherwise I'll get around to
implementing one from the spec in a few hours' time (I have to eat, and
I'm sure there are other bugs to look at).
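
To make the HTML/URL distinction concrete, here is a rough sketch of what an HTMLCoder with encode() and decode() might look like. It is not the eventual Freenet class, just an illustration under my own assumptions: it only handles the four characters that matter most in text and attribute values, and decode() only reverses those four entities.

```java
/** Hypothetical sketch of an HTML entity coder; not the real Freenet class. */
final class HTMLCoder {
    private HTMLCoder() {}

    /** Escape the characters that are special in HTML text and attribute values. */
    static String encode(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverse encode() in a single pass; unrecognised entities are copied through. */
    static String decode(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("&amp;", i))       { sb.append('&'); i += 5; }
            else if (s.startsWith("&lt;", i))   { sb.append('<'); i += 4; }
            else if (s.startsWith("&gt;", i))   { sb.append('>'); i += 4; }
            else if (s.startsWith("&quot;", i)) { sb.append('"'); i += 6; }
            else                                { sb.append(s.charAt(i)); i++; }
        }
        return sb.toString();
    }
}
```

The single pass in decode() matters: decoding entity by entity with repeated replace calls would turn "&amp;lt;" into "<" instead of "&lt;".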
On Thu, Jan 30, 2003 at 04:33:08AM -0800, Yves Lempereur wrote:
>
> So, since a lot of authors are complaining about the filter, I decided
> to take a look at SaferFilter. It looks like it should be pretty easy
> to add support for more stuff (including XHTML). However, before I add
> anything, there are quite a few bugs that really should be fixed. I
> didn't know if I should just go ahead and attempt to fix them or if I
> should just report them to you. Please advise.
>
> In the meantime, here are a few examples you can take care of right
> away:
>
> Line 921:
> Probably should be:
> if (s.equals("rtl") || s.equals("ltr")) {
>
> Line 1270:
> Probably should be:
> hn.put("http-equiv", "content-type");
> hn.put("content", typesplit[0] + (typesplit[1] != null ? ";charset="
> + typesplit[1] : ""));
>
> Line 1370:
> White space is most often present after the semicolon...
>
> Line 1380:
> Couldn't you just use trim()?
>
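
For the two points above (whitespace after the semicolon, and trim()), something along these lines might do. It is only a sketch: ContentTypeSplit and split() are names I made up, and the returned array just mirrors the typesplit[0] / typesplit[1] layout from the Line 1270 suggestion, with a null charset when none is given.

```java
/** Hypothetical helper: split "type; charset=x" into {type, charset}. */
final class ContentTypeSplit {
    private ContentTypeSplit() {}

    /** Returns a two-element array; element 1 is null when no charset is present. */
    static String[] split(String header) {
        String[] out = new String[2];
        int semi = header.indexOf(';');
        if (semi < 0) {
            out[0] = header.trim();
            return out;
        }
        out[0] = header.substring(0, semi).trim();
        // Walk the parameters; trim() absorbs any whitespace after the semicolon.
        for (String param : header.substring(semi + 1).split(";")) {
            String p = param.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                out[1] = p.substring(8).trim();
            }
        }
        return out;
    }
}
```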
> Lines 1394, 1400 & 1406:
> Protocol names are not case sensitive
>
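
A case-insensitive protocol check could be as small as the sketch below. SchemeCheck and hasScheme() are hypothetical names, not anything in the filter; the only library call is String.regionMatches with ignoreCase set.

```java
/** Hypothetical helper for matching a URI scheme without regard to case. */
final class SchemeCheck {
    private SchemeCheck() {}

    /** True if uri starts with scheme followed by ':', ignoring case. */
    static boolean hasScheme(String uri, String scheme) {
        String prefix = scheme + ":";
        // regionMatches returns false (rather than throwing) when uri is too short.
        return uri.regionMatches(true, 0, prefix, 0, prefix.length());
    }
}
```

With this, "HTTP://..." and "Freenet:..." are treated the same as their lower-case forms.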
> Line 1418:
> Any URI starting with a / and containing a ? will return null (Greg
> Wooledge's NIM is being nuked by this one).
>
> Line 1453:
> Since the URI came from an HTML document, the token is likely to start
> with "amp;" (except for the first one, of course). In fact, in XHTML,
> it's illegal not to.
>
> Lines 1497, 1502 & 1506:
> The values might require URL encoding.
>
> Lines 1498, 1503 & 1507:
> Since it's going back into an HTML document, it really should be
> "&amp;".
>
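
The last two points combine naturally: URL-encode each name and value, then join the pairs with "&amp;" because the result is written into HTML. The sketch below assumes the pairs are available as raw (not yet encoded) strings in a map; QueryForHtml and build() are made-up names, and UTF-8 is my choice of encoding, not something the filter is known to use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical helper: build a query string that is safe to embed in HTML. */
final class QueryForHtml {
    private QueryForHtml() {}

    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append("&amp;"); // HTML-escaped separator, not a bare '&'
            }
            sb.append(urlEncode(e.getKey())).append('=').append(urlEncode(e.getValue()));
        }
        return sb.toString();
    }

    private static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }
}
```

Going the other way, HTML-decoding the attribute before splitting on '&' would avoid tokens that start with "amp;".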
> Yves
>
> P.S. You should probably start with the last one and work your way back
> up (line numbers).
> P.P.S I hope I'm not turning out to be a pain in your neck. :-)
>
> On Monday, January 27, 2003, at 08:43 PM, Matthew Toseland wrote:
>
>
> > On Mon, Jan 27, 2003 at 07:42:40PM -0800, Yves Lempereur wrote:
> >
> > >
> > > > It still doesn't work as of 550. I just re-inserted the following so
> > > > that you can check it out:
> > >
> > > > freenet:KSK@charset-test
> > >
> >
> > > Hmm. Current snapshot has a bugfix; this bugfix causes it to at least
> > > send the correct Content-Type. The first three characters on the second
> > > line work when I telnet to the port and make the request manually... all
> > > the browsers I have tried f*ck them up for some reason. Please confirm
> > > or deny that this works...
> >
> > >
> > > > Also, please note how much this 100% correct XHTML file gets messed up
> > > > by the filter...
> > >
> > > Submit a patch. But make sure it's really paranoid. Proper support for
> > > XHTML etc. is low priority at the moment.
> >
> > >
> > > Yves Lempereur
> > >
> > > > On Monday, January 27, 2003, at 06:56 PM, Matthew Toseland wrote:
> > >
> > >
> > > > On Fri, Dec 20, 2002 at 09:53:54PM -0800, Yves Lempereur wrote:
> > > >
> > > > >
> > > > > >
> > > > > > > While Fred can decode the following meta data:
> > > > > > >
> > > > > > > Info.Format=text/plain
> > > > > > > Info.Format=text/html
> > > > > > >
> > > > > > > It chokes on the following (unknown mime type warning):
> > > > > > >
> > > > > > > Info.Format=text/plain; charset=iso-8859-1
> > > > > > > Info.Format=text/html; charset=utf-8
> > > > > > >
> > > > > > > > It seems to me that it would be desirable and fairly easy
> > > > > > > > to fix.
> > > > > > >
> > > > > >
> > > > > > Remove the space, and use the development branch of Fred.
> > > > > >
> > > > >
> > > > > > It no longer triggers the warning, but it doesn't actually work.
> > > > > Here is a test file, the two lines should look alike:
> > > > >
> > > > > > freenet:KSK@charset-test
> > > > >
> > > > > > The file works as expected locally and coming from Apache, but it
> > > > > > doesn't work coming from Freenet.
> > > > >
> > > > Could somebody verify that this works now?
> > > >
> > > > >
> > > > > Yves Lempereur
> > > > >
> > > >
> > >
> > >
> > >
> >
> > --
> > Matthew Toseland
> > toad at amphibian.dyndns.org/amphibian at users.sourceforge.net
> > Full time freenet hacker.
> > http://freenetproject.org/
> > Freenet Distribution Node (temporary) at
> > http://amphibian.dyndns.org:8889/MtYsGntz~ic/
> > ICTHUS.
>
>
--
Matthew Toseland
toad at amphibian.dyndns.org/amphibian at users.sourceforge.net
Full time freenet hacker.
http://freenetproject.org/
Freenet Distribution Node (temporary) at
http://amphibian.dyndns.org:8889/xMDOZ7aKUlM/
ICTHUS.