Using mu from git master @ ab5830 (0.9.9.6), I would like to filter out
unwanted characters from mu's rendering of HTML email. Without resorting to
running an external HTML-to-text process for each message view buffer [1] [2],
is there a way for mu to a) be more aggressive with its own filter, and
b) filter and/or map unwanted characters to an accepted set for display?

Many but not all HTML-format emails from correspondents carry these visual
artifacts. The most common occurrence is in their citation (reply, forward)
header block, and footer blocks. An example, presuming email preserves hat-H
(^H) and other characters:

---

F^HFr^Hro^Hom^Hm:^H: First Last
S^HSe^Hen^Hnt^Ht:^H: Thursday, July 03, 2014 7:46 AM
T^HTo^Ho:^H: First Last
C^HCc^Hc:^H: First Last
S^HSu^Hub^Hbj^Hje^Hec^Hct^Ht:^H: The subject

The body text of most HTML messages shows relatively cleanly in mu with the
stock configuration. However, some messages, or parts thereof, are made
unreadable by control characters and Unicode punctuation:

_^HA_^Hd_^Hd_^H _^Ha_^Hl_^Hl_^H _^HG_^Ho_^Ha_^Hl_^Hs(wrap)
_^H _^Ht_^Ho_^H _^HC_^Ha_^Hl_^He_^Hn_^Hd_^Ha_^Hr
By clicking above, the following goals will be added to your calendar.
========================================================================
_^HI_^Hm_^Hp_^Hl_^He_^Hm_^He_^Hn_^Ht_^H _^HC_^Ho_^Hn_^Hn_^He_^Hc_^H(wrap)
t_^Ho_^Hr_^H _^H-_^H _^HQ_^H3
Due Date: S^HSe^Hep^Hp,^H, 3^H30^H0 2^H20^H01^H14^H4
========================================================================
_^HL_^Ha_^Hu_^Hn_^Hc_^Hh_^H _^HE_^Hx_^Hc_^Hh_^Ha_^Hn_^Hg_^He.
Due Date: S^HSe^Hep^Hp,^H, 3^H30^H0 2^H20^H01^H14^H4
========================================================================
_^HF_^Hi_^Hn_^Ha_^Hl_^Hi_^Hz_^He_^H _^HP_^Hl_^Ha_^Hn_^H _^Hf_^Ho_^Hr(wrap)
_^H ^H _^HS_^Hi_^Ht_^He_^Hs
Due Date: S^HSe^Hep^Hp,^H, 3^H30^H0 2^H20^H01^H14^H4
_^HU_^Hn_^Hs_^Hu_^Hb_^Hs_^Hc_^Hr_^Hi_^Hb_^He_^H _^Hf_^Hr_^Ho_^Hm_^H (wrap)
_^Ha_^Hl_^Hl_^H _^Hc_^Ha_^Hl_^He_^Hn_^Hd_^Ha_^Hr_^H _^Hi_^Hn_^Hv_^H(wrap)
i_^Ht_^He_^Hs | _^Hu_^Hp_^Hd_^Ha_^Ht_^He_^H _^Hn_^Ho_^Ht_^Hi_^Hf_^H(wrap)
i_^Hc_^Ha_^Ht_^Hi_^Ho_^Hn_^H _^Hp_^Hr_^He_^Hf_^He_^Hr_^He_^Hn_^Hc_^He_^Hs

Unicode punctuation characters in a footer block:

T: +1 123 456 78900\302^H^H\240\302^H^H\240\302^H^H\240M: +1 123 456 67890

---
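
For illustration only, the kind of filter asked about in b) could be sketched
in Python as below. This is not mu code and not an existing mu option;
`clean_overstrike` is a hypothetical name. It assumes the text is already
decoded and that each artifact is a character followed by a backspace (^H),
as in the header example above. It also maps no-break spaces to plain spaces.
The split-byte sequence in the footer example (\302^H^H\240) would need
byte-level handling that is not shown here.

```python
import re

# One overstruck cell: any non-backspace character followed by a backspace.
# The character after the backspace is what should remain visible, e.g.
# "F\bF" (bold) or "_\bA" (underline).
_OVERSTRIKE = re.compile("[^\x08]\x08")
_NBSP = "\u00a0"


def clean_overstrike(text: str) -> str:
    """Hypothetical helper: remove backspace overstrikes, map NBSP to space."""
    # Drop every "character + backspace" pair, keeping the character that
    # was printed on top of it.
    text = _OVERSTRIKE.sub("", text)
    # Remove any backspaces left over from runs such as "\b\b".
    text = text.replace("\x08", "")
    # Map no-break spaces to ordinary spaces for display.
    return text.replace(_NBSP, " ")


if __name__ == "__main__":
    print(clean_overstrike("F\bFr\bro\bom\bm:\b: First Last"))
```

The same substitution is what `col -b` does for a terminal stream; the sketch
just applies it to a string so it could run wherever the rendered text is
available.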

Thanks,
Jeff

[1] http://www.djcbsoftware.nl/code/mu/mu4e/Displaying-rich_002dtext-messages.html

[2] Given sufficient time to configure a solution, it might be nice to run
html2text over my maildir to append a text version to HTML-only messages,
leaving the HTML original intact. Has anyone described a method for doing this
non-destructively, i.e. conservatively, with a good bailout on poorly formed
email?
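
As a rough sketch of what [2] might look like, assuming Python's standard
`email` package is acceptable: the function below takes one raw message and
returns new bytes with a text/plain alternative placed ahead of the HTML part,
or None whenever the message is not a simple, defect-free HTML-only message.
The names `html_to_text` and `with_text_alternative` are hypothetical, and the
tiny HTMLParser subclass is only a stand-in for html2text. The HTML content is
preserved, but it is decoded and re-encoded, so the bytes on disk may differ
from the original; writing the result to a separate file would keep the
original untouched.

```python
from email import message_from_bytes, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Crude stand-in for html2text: collects text nodes and ignores tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Return the text nodes of `html`, with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def with_text_alternative(raw: bytes):
    """Return message bytes with a text/plain part added, or None to bail out."""
    try:
        msg = message_from_bytes(raw, policy=policy.default)
        # Bail out on parser-reported defects and on anything that is not a
        # single text/html body (multipart, plain text, attachments, ...).
        if msg.defects or msg.get_content_type() != "text/html":
            return None
        html = msg.get_content()
        text = html_to_text(html)
        if not text:
            return None
        # Replace the body with the text version, then re-attach the HTML;
        # add_alternative() turns the message into multipart/alternative.
        msg.set_content(text)
        msg.add_alternative(html, subtype="html")
        return msg.as_bytes()
    except Exception:
        # Conservative: any failure means the message is left alone.
        return None
```

A caller would loop over the maildir files, skip every message for which the
function returns None, and write the returned bytes to a new location rather
than over the original file.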
