On Thursday 30 September 2010 17:05:14 xor wrote: > On Friday 06 August 2010 10:25:32 pm Matthew Toseland wrote: > > On Tuesday 13 July 2010 21:12:27 xor wrote: > > > On Tuesday 02 February 2010 08:44:26 pm Matthew Toseland wrote: > > > > On Saturday 30 January 2010 18:11:14 xor wrote: > > > > > On Saturday 30 January 2010 17:19:49 Matthew Toseland wrote: > > > > > > I have several identities in my WoT that have invalid characters in > > > > > > their names, e.g.: <a%20href=... at eH3v1... > > > > > > <?)))))><@mvT-H... > > > > > > > > > > > > Etc. > > > > > > > > > > > > If we are not going to preemptively delete them, we need to guard > > > > > > against them - in particular, L10n.addL10nSubstitution() is *not* > > > > > > safe if the values are potentially unsafe, because they are treated > > > > > > as raw HTML. > > > > > > > > > > We disallow stuff like control characters and invalid UTF formatting. > > > > > Disallowing stuff just because HTML uses them is quite arbitrary > > > > > though because the FProxy HTML generation code should be able to > > > > > escape stuff, which it does. The test identities showed that it > > > > > works. > > > > > > > > > > L10n-substitutions in nicknames should not happen, but if you're > > > > > saying that input-strings for l10n-substitution is not safe (i.e. the > > > > > values the substituter should fill in) then something needs to be > > > > > done to make it safe. > > > > > > > > That is impossible *by definition*. L10n.addSubstitution is *INTENDED > > > > TO SUBSTITUTE HTML IN*. Hence replacing ${bold} with <b>. > > > > > > > > Whereas getString(name, keys, values) is of course safe because it > > > > generates text which is then added to HTML and escaped in the process > > > > (unless you use "%", but you shouldn't with a getString). > > > > > > > > > > I know some of these were removed recently but I'm not sure that > > > > > > all cases where we use it with a nickname have been. > > > > > > > > > > I think the way to go is to make all places which use nicknames safe > > > > > rather than forbidding HTML characters. Control-characters should be > > > > > forbidden but they already are. > > > > > > > > IMHO < should be forbidden, because it is an actual threat, whereas > > > > most of the other characters which you forbid are debatable. > > > > > > Then we would also have to forbid them in thread titles and texts... and > > > trust value comments... and and and... Makes no sense. > > > > > > We should rather make very sure that anything which reads strings from > > > the network does not do any harm... which we are doing already anyway > > > > > > Deleting this thread off my TODO list... > > > > So your solution is to HTMLEncoder.encode() every client supplied string > > before giving it to L10n.addSubstitution()? Or maybe we should make it > > only accept HTMLNode's? > > > > Yes, I guess we shouldn't be shoving any user data into addSubstitution > without sanitizing it. > > I guess I should review all calls to addSubstitution then. Will file a bug. > Are there any other dangerous l10n functions which I should check for? > > I have implemented a better, safer solution in master:
BaseL10n.addL10nSubstitution(HTMLNode, String, String[], String[]) is now deprecated. BaseL10n.addL10nSubstitution(HTMLNode, String, String[], HTMLNode[]) is the replacement. Because an HTMLNode is effectively a DOM tree node, it is not possible to represent closing a tag (e.g. </a>) as an HTMLNode. Thus you must pass in only the opening tags. The closing tags are inferred. E.g.: addL10nSubstitution(html, "TranslationLookup.string", new String[] { "link", "text" }, new HTMLNode[] { HTMLNode.link("/KSK at gpl.txt"), HTMLNode.text("blah") }) TranslationLookup.string=This is a ${link}link${/link} about ${text}. The HTMLNode's should be reused if possible (e.g. via constants), but they do get cloned if they contain anything. HTMLNode has some utility methods and constants. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part. URL: <https://emu.freenetproject.org/pipermail/devl/attachments/20101027/f3c673a6/attachment.pgp>