#845: WebComment: fix handling of links in comments
------------------------+----------------------
 Reporter:  jcaffaro    |      Owner:  rajimene
     Type:  defect      |     Status:  new
 Priority:  major       |  Milestone:
Component:  WebComment  |    Version:
 Keywords:              |
------------------------+----------------------
 Commit 9a70ab3fc77b54274924a01933e0080d6ec34f26 introduced a refined
 handling of the content of comments, especially for links inside comments
 (see specifically comment:4:ticket:764). However, the new behaviour does
 not process links correctly in all cases.

 For example, with the following input (using
 source:modules/webcomment/lib/webcomment_washer.py@9a70ab3fc77b54274924a01933e0080d6ec34f26):

 {{{
 msg = 'http://foo.com some more text'
 from invenio.webcomment_washer import EmailWasher
 washer = EmailWasher()
 washer.wash(msg)
 Out[5]: ''
 }}}

 The output is empty: both the URL and the text following it are dropped.
 When the link is written as an HTML tag, the result is more satisfactory:

 {{{
 msg = '<a href="http://foo.com">bar</a> some more text'
 from invenio.webcomment_washer import EmailWasher
 washer = EmailWasher()
 washer.wash(msg)
 Out[5]: '<http://foo.com>bar some more text'
 }}}

 (though it might be cleaner to get '{{{<http://foo.com> (bar) some more
 text}}}')
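
 As an illustration of the cleaner email rendering suggested above, here is
 a minimal standalone sketch. It is not the Invenio {{{EmailWasher}}}: the
 class {{{AnchorToTextRenderer}}} and its {{{render}}} method are
 hypothetical names, it only handles {{{<a>}}} tags, and it assumes Python 3
 with the standard {{{html.parser}}} module.

```python
from html.parser import HTMLParser


class AnchorToTextRenderer(HTMLParser):
    """Hypothetical helper: render <a href="URL">label</a> as '<URL> (label)'."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._out = []      # pieces of the plain-text result
        self._href = None   # href of the anchor currently open, if any
        self._label = []    # text collected inside the open anchor

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._label = []

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            label = ''.join(self._label).strip()
            self._out.append('<%s>' % self._href)
            # Skip the label when it is empty or merely repeats the URL.
            if label and label != self._href:
                self._out.append(' (%s)' % label)
            self._href = None

    def handle_data(self, data):
        # Text inside an anchor becomes its label; other text is kept as-is.
        if self._href is not None:
            self._label.append(data)
        else:
            self._out.append(data)

    def render(self, markup):
        self.feed(markup)
        self.close()
        return ''.join(self._out)
```

 With this sketch, a bare URL in the input is passed through unchanged
 rather than dropped, and tags other than {{{<a>}}} are simply stripped
 while their text is kept.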

 The same happens with the updated "generic" HTML washer
 (source:modules/miscutil/lib/htmlutils.py@9a70ab3fc77b54274924a01933e0080d6ec34f26):
 {{{
 msg = 'http://foo.com some more text'
 from invenio.htmlutils import HTMLWasher
 washer = HTMLWasher()
 washer.wash(msg)
 Out[5]: ''
 }}}

 (not ok, but:)

 {{{
 msg = '<a href="http://foo.com">bar</a> some more text'
 from invenio.htmlutils import HTMLWasher
 washer = HTMLWasher()
 washer.wash(msg)
 Out[5]: '<a href="http://foo.com">bar</a> some more text'
 }}}

 (ok as we are not trying to make the link nicely viewable in non-HTML
 context)

 Note that WebComment uses:
  * {{{EmailWasher}}} for sending email to people subscribed to the
    discussion
  * {{{EmailWasher}}} for sending email to admins/moderators
  * {{{HTMLWasher}}} to display comments/reviews on the discussion pages
    (through function {{{webmessage_mailutils.email_quoted_txt2html(...)}}})
  * {{{HTMLWasher}}} to display a comment in the rich text editor when
    replying to it (through function
    {{{webmessage_mailutils.email_quoted_txt2html(...)}}})

 Note that {{{EmailWasher}}} is a subclass of {{{HTMLWasher}}}, inheriting
 some of its behaviour. Also note that other modules make use of
 {{{HTMLWasher}}}, so any change should be carefully checked against
 them: BibFormat, WebMessage, WebJournal (+ others?)

 Several options/actions:
  * Fix the handling of URLs so that they do not disappear from the output
 (when not in href="" context).
  * Maybe add automatic transformation of bare {{{http://foo.com}}} URLs to
    {{{<a href="http://foo.com">}}} links in HTMLWasher, with an option to
    turn the feature on/off.
  * Carefully check any change with all the modules using these classes.
  * Add unit/regression tests
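
 To make the first two options concrete, here is a minimal standalone
 sketch of keeping bare URLs in the output and optionally turning them into
 links. It is not a patch against {{{HTMLWasher}}}: the function
 {{{linkify_plain_text}}}, its {{{autolink}}} switch and the URL pattern are
 assumptions made for illustration, and the sketch assumes Python 3 and
 plain-text input (any markup already present would be escaped, not
 washed).

```python
import html
import re

# Hypothetical, deliberately simple pattern for bare http(s) URLs.
_URL_RE = re.compile(r'https?://[^\s<>"]+')


def linkify_plain_text(text, autolink=True):
    """Escape plain text for HTML; bare URLs are kept and optionally linked."""
    parts = []
    last = 0
    for match in _URL_RE.finditer(text):
        # Escape the text between the previous URL and this one.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        if autolink:
            parts.append('<a href="%s">%s</a>' % (url, url))
        else:
            # Feature turned off: the URL still survives, just unlinked.
            parts.append(url)
        last = match.end()
    parts.append(html.escape(text[last:]))
    return ''.join(parts)
```

 The assertions below could serve as a starting point for the regression
 tests mentioned above; they check that the URL never disappears, whether
 the feature is on or off:

```python
assert linkify_plain_text('http://foo.com some more text') == \
    '<a href="http://foo.com">http://foo.com</a> some more text'
assert linkify_plain_text('http://foo.com some more text', autolink=False) == \
    'http://foo.com some more text'
```

 One known limitation of this pattern is that punctuation directly after a
 URL (for example a sentence-ending full stop) is treated as part of the
 URL.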

-- 
Ticket URL: <http://invenio-software.org/ticket/845>
Invenio <http://invenio-software.org>
