#895: MiscUtil: remove_html_markup() function should have an option to not
remove escaped characters
--------------------------+----------------------
  Reporter:  nkasioum     |      Owner:  nkasioum
      Type:  enhancement  |     Status:  in_merge
  Priority:  minor        |  Milestone:  v1.0
 Component:  MiscUtil     |    Version:
Resolution:               |   Keywords:
--------------------------+----------------------

Comment (by jcaffaro):

 In the context of BibIndex, where this function is used, it might be
 useful not only to keep entity references (&copy;, &euro;) but also to
 convert them to the corresponding characters (©, €). The same would
 apply to numeric character references (&#931; to Σ, or &#8747; to ∫).

 A parameter to switch the conversion on or off might be useful to add,
 though this kind of use case might be more appropriately handled in a
 dedicated, more specialized piece of code. See, e.g., how this is
 achieved using module {{{htmlentitydefs}}} in the
 {{{handle_charref(..)}}} and {{{handle_entityref(..)}}} functions of
 source:modules/webcomment/lib/webcomment_washer.py@9a70ab3fc77b5#L43 or
 source:modules/webalert/lib/htmlparser.py@bd6f70ff7c2dee#L136.
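
 As a rough illustration of the switchable conversion suggested here, the
 sketch below strips tags and optionally turns entity and numeric
 character references into characters. It is not the MiscUtil
 implementation: the names strip_markup, convert_refs, TAG_RE and REF_RE
 are hypothetical, the tag pattern is deliberately naive, and it uses
 html.entities, the Python 3 name of htmlentitydefs.

```python
import re
from html.entities import name2codepoint  # Python 3 name of htmlentitydefs

# Hypothetical patterns, for illustration only.
TAG_RE = re.compile(r'<[^>]*>')  # naive: anything between < and >
REF_RE = re.compile(r'&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);')


def _convert_ref(match):
    """Return the character for one reference, or the reference unchanged."""
    ref = match.group(1)
    try:
        if ref[:2] in ('#x', '#X'):
            return chr(int(ref[2:], 16))   # hexadecimal, e.g. &#x3A3;
        if ref.startswith('#'):
            return chr(int(ref[1:]))       # decimal, e.g. &#931;
        return chr(name2codepoint[ref])    # named, e.g. &copy;
    except (KeyError, ValueError, OverflowError):
        # Unknown name or out-of-range code point: leave the text as-is.
        return match.group(0)


def strip_markup(text, convert_refs=False):
    """Remove tags; keep references, or convert them if convert_refs is set."""
    text = TAG_RE.sub('', text)
    if convert_refs:
        text = REF_RE.sub(_convert_ref, text)
    return text
```

 With convert_refs left at its default, references such as &copy; pass
 through untouched; with convert_refs=True they become ©. In Python 3,
 html.unescape() from the standard library offers a ready-made
 conversion, so the hand-written substitution above is mainly useful for
 showing where the switch would sit.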

-- 
Ticket URL: <http://invenio-software.org/ticket/895#comment:2>
Invenio <http://invenio-software.org>
