Benoit Tellier created JAMES-4061:
-------------------------------------

             Summary: Html Text extractor needs to handle blockquote
                 Key: JAMES-4061
                 URL: https://issues.apache.org/jira/browse/JAMES-4061
             Project: James Server
          Issue Type: Bug
          Components: JMAP
    Affects Versions: master
            Reporter: Benoit Tellier
            Assignee: Antoine Duprat
         Attachments: image-2024-08-22-14-54-37-915.png, image-2024-08-22-14-54-51-684.png, image-2024-08-22-14-55-01-317.png

Following recent mailing list exchanges, Wojtek contacted me privately to point out the bad indentation of my inlined answers.

The exchange: https://www.mail-archive.com/server-dev@james.apache.org/msg74362.html

Set up: I used the Twake mail client throughout the discussion. It produces HTML and relies on the James server JMAP code to generate the text/plain part. Wojtek favors reading text/plain when available.

The full diagnostic is taken from a private conversation:

h3. Diagnostic

I bet it is the plain text projection of the email that got screwed up.

The HTML version looks fine

!image-2024-08-22-14-54-37-915.png!

This matches the output I see in my sent mails in Twake mail

!image-2024-08-22-14-54-51-684.png!

However, the text/plain version is indeed missing one quote level

!image-2024-08-22-14-55-01-317.png!

What we have

>> Your initial concern
> My initial answer
Your answer
My answer to your answer

What we should have

>>> Your initial concern
>> My initial answer
> Your answer
My answer to your answer

Where it gets annoying is that our Webmail ( https://github.com/apache/james-project ) generates an HTML output (WYSIWYG), and the backend then extracts the text from the HTML in order to present a text/plain view of the message. The <blockquote> tags are currently ignored during that extraction.

The component converting HTML to text needs to account for these blockquotes: it should keep track of the blockquote nesting depth of the current context and, at each line break, start the new line with the matching number of '>' quote markers.

<blockquote><p>abc</p><p>def<br/>ghi</p><blockquote><p>jkl</p><p>mno<br/></p></blockquote><p>pqr</p></blockquote><p>stu</p>

Should be rendered as

> abc
> def
> ghi
>> jkl
>> mno
> pqr
stu

The involved component is a JMAP utility of Apache James: org.apache.james.jmap.utils.JsoupHtmlTextExtractor
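
To make the expected behaviour concrete, here is a minimal sketch of the depth-tracking idea described above. It is not the JsoupHtmlTextExtractor code and it does not use jsoup: the class name QuoteAwareTextSketch, the method toPlainText and the regex tokenizer are all hypothetical stand-ins chosen to keep the example self-contained. It only understands <blockquote>, <p> and <br>, drops blank lines, does not decode HTML entities, and assumes Java 11+ for String.repeat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a regex tokenizer stands in for a real HTML parser.
final class QuoteAwareTextSketch {

    // Group 1: "/" for a closing tag. Group 2: tag name. Group 3: a run of text.
    private static final Pattern TOKEN =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)");

    static String toPlainText(String html) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of <blockquote> elements currently open

        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            if (m.group(3) != null) {
                // Plain text: accumulate it on the current line.
                current.append(m.group(3));
                continue;
            }
            String tag = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            if (tag.equals("blockquote")) {
                // Emit pending text at the old depth before changing it.
                flush(lines, current, depth);
                depth = closing ? Math.max(0, depth - 1) : depth + 1;
            } else if (tag.equals("p") || tag.equals("br")) {
                // Paragraph boundaries and <br> both end the current line.
                flush(lines, current, depth);
            }
            // Any other tag is ignored; its text content is kept.
        }
        flush(lines, current, depth);
        return String.join("\n", lines);
    }

    // Ends the current line, prefixing it with one '>' per open blockquote.
    private static void flush(List<String> lines, StringBuilder current, int depth) {
        String text = current.toString().trim();
        current.setLength(0);
        if (text.isEmpty()) {
            return; // simplification: blank lines are dropped
        }
        String prefix = depth == 0 ? "" : ">".repeat(depth) + " ";
        lines.add(prefix + text);
    }

    public static void main(String[] args) {
        String html = "<blockquote><p>abc</p><p>def<br/>ghi</p>"
            + "<blockquote><p>jkl</p><p>mno<br/></p></blockquote>"
            + "<p>pqr</p></blockquote><p>stu</p>";
        System.out.println(toPlainText(html));
    }
}
```

Run on the example from this ticket, the sketch yields the seven expected lines, from "> abc" down to the unquoted "stu". In the real extractor the same depth counter would presumably be maintained while walking the parsed document rather than while scanning tokens.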