#845: WebComment: fix handling of links in comments
------------------------+----------------------
Reporter: jcaffaro | Owner: rajimene
Type: defect | Status: new
Priority: major | Milestone:
Component: WebComment | Version:
Keywords: |
------------------------+----------------------
Commit 9a70ab3fc77b54274924a01933e0080d6ec34f26 introduced a refined
handling of the content of comments, especially for links inside comments
(see specifically comment:4:ticket:764). However, the new behaviour does
not process links correctly in all cases.
For example, with the following input (using
source:modules/webcomment/lib/webcomment_washer.py@9a70ab3fc77b54274924a01933e0080d6ec34f26):
{{{
msg = 'http://foo.com some more text'
from invenio.webcomment_washer import EmailWasher
washer = EmailWasher()
washer.wash(msg)
Out[5]: ''
}}}
producing an empty output. When an HTML tag is used, we get a more
satisfactory result:
{{{
msg = '<a href="http://foo.com">bar</a> some more text'
from invenio.webcomment_washer import EmailWasher
washer = EmailWasher()
washer.wash(msg)
Out[5]: '<http://foo.com>bar some more text'
}}}
(though it might be cleaner to get
{{{<http://foo.com> (bar) some more text}}})
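To illustrate the cleaner output suggested above, here is a minimal sketch of how an anchor could be rendered as `<URL> (label)` for plain-text email. It is not the `EmailWasher` implementation: the class `AnchorToText` and the function `anchors_to_text` are hypothetical names, and the sketch assumes Python 3 and only its standard library.

```python
from html.parser import HTMLParser


class AnchorToText(HTMLParser):
    """Hypothetical helper: render <a href="URL">label</a> as '<URL> (label)'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []   # output fragments, joined at the end
        self.href = None  # URL of the anchor being processed, if any
        self.label = []   # text collected inside the current anchor

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.href = dict(attrs).get('href')
            self.label = []

    def handle_data(self, data):
        # Text inside an anchor is buffered; all other text passes through,
        # so bare URLs in the text are never dropped.
        if self.href is not None:
            self.label.append(data)
        else:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self.href is not None:
            label = ''.join(self.label).strip()
            out = '<%s>' % self.href
            # Avoid repeating the URL when it is also the link text.
            if label and label != self.href:
                out += ' (%s)' % label
            self.parts.append(out)
            self.href = None


def anchors_to_text(html):
    """Return `html` with anchors rewritten for a non-HTML context."""
    parser = AnchorToText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.parts)
```

With the input from this ticket, `anchors_to_text('<a href="http://foo.com">bar</a> some more text')` yields `<http://foo.com> (bar) some more text`, while a bare `http://foo.com some more text` is returned unchanged.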
The same happens with the updated "generic" HTML washer
(source:modules/miscutil/lib/htmlutils.py@9a70ab3fc77b54274924a01933e0080d6ec34f26):
{{{
msg = 'http://foo.com some more text'
from invenio.htmlutils import HTMLWasher
washer = HTMLWasher()
washer.wash(msg)
Out[5]: ''
}}}
(not ok, but:)
{{{
msg = '<a href="http://foo.com">bar</a> some more text'
from invenio.htmlutils import HTMLWasher
washer = HTMLWasher()
washer.wash(msg)
Out[5]: '<a href="http://foo.com">bar</a> some more text'
}}}
(ok, as we are not trying to make the link nicely viewable in a non-HTML
context)
Note that WebComment uses:
* {{{EmailWasher}}} for sending email to people subscribed to a discussion
* {{{EmailWasher}}} for sending email to admins/moderators
* {{{HTMLWasher}}} to display comments/reviews on the discussion pages
(through function {{{webmessage_mailutils.email_quoted_txt2html(...)}}})
* {{{HTMLWasher}}} to display a comment in the rich text editor when
replying to a comment (through function
{{{webmessage_mailutils.email_quoted_txt2html(...)}}})
Note that {{{EmailWasher}}} is a subclass of {{{HTMLWasher}}}, inheriting
some of its behaviour. Also note that other modules make use of
{{{HTMLWasher}}}, so any change should be carefully checked against
them: BibFormat, WebMessage, WebJournal (+ others?)
Several options/actions:
* Fix the handling of URLs so that they do not disappear from the output
(when not in href="" context).
* Maybe add automatic transformation of bare {{{http://foo.com}}} URLs into
{{{<a href="http://foo.com">}}} links in HTMLWasher, with an option to turn
the feature on/off.
* Carefully check any change with all the modules using these classes.
* Add unit/regression tests
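
As a rough sketch of the second option above (automatic linking of bare URLs, with a switch to turn it off), something like the following could serve as a starting point. The function `linkify_bare_urls` and its `enabled` parameter are hypothetical and not part of `HTMLWasher`; the sketch assumes Python 3, uses a deliberately simple URL pattern, and leaves existing tags and anchors untouched.

```python
import re

# One alternation, tried left to right at each position: existing markup is
# matched first so that URLs inside it are never rewritten.
_URL_OR_MARKUP = re.compile(
    r'(<a\b[^>]*>.*?</a>)'        # an existing anchor element: keep as-is
    r'|(<[^>]+>)'                 # any other tag (URLs in attributes): keep
    r'|(https?://[^\s<>"\']+)',   # a bare URL in running text
    re.IGNORECASE | re.DOTALL)


def linkify_bare_urls(text, enabled=True):
    """Wrap bare http(s) URLs in <a href="..."> unless `enabled` is False."""
    if not enabled:
        return text

    def _replace(match):
        url = match.group(3)
        if url is None:
            # Matched existing markup: return it unchanged.
            return match.group(0)
        # Keep trailing punctuation outside the generated link.
        stripped = url.rstrip('.,;:!?)')
        trailing = url[len(stripped):]
        return '<a href="%s">%s</a>%s' % (stripped, stripped, trailing)

    return _URL_OR_MARKUP.sub(_replace, text)
```

The same inputs used in this ticket make natural regression cases: a bare URL must survive (and be linked when the option is on), and an existing `<a href="...">` must come back unchanged.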
--
Ticket URL: <http://invenio-software.org/ticket/845>
Invenio <http://invenio-software.org>