https://bugs.documentfoundation.org/show_bug.cgi?id=144050
--- Comment #7 from Bernard Moreton ---

The uploaded example file is a pared-down extract from a much longer PDF report, with most of the actual text replaced character-for-character, for obvious reasons of discretion.

The PDF was reduced to text using

    pdftotext -layout $src    # $src being the PDF file

A standard RTF header block is then written, with the mandatory {\rtf1\ansi followed by a brief FONTTBL, COLORTBL (probably redundant), and a single style in the STYLESHEET.

I now follow that with

    {\*\generator LibreOffice/7.1.5.2$Linux_X86_64 LibreOffice_project/10$Build-2}

to stop the unwanted behaviour of appending the strange characters to multi-space strings.

Then come the lines defining the paper size, margins, and orientation for the document and the section (the latter again probably redundant), and finally the "\pard\plain \s7" to start the body of the text.

The text is then copied from the text file, adding a "\line" at each line-end. Finally the RTF ending, "}", is added.

I'd upload the BASH executable, but the source RTF already uploaded shows the process more clearly than the BASH script could do!

I've been using this sort of method for many years for reporting from 4GL, whether simply to LO (and OOo before that), or using LO to create a PDF from the command line - though in 4GL reporting most of the formatting is done by defining tabs.

When processing pre-formatted text, however, especially the output of PDFTOTEXT, multiple spaces are unavoidable; but strange characters should *never* be added to them, as the LO FILEOPEN for RTF obviously does.
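
For readers who want to reproduce the steps described in this comment, they can be sketched roughly as the shell function below. This is not the reporter's script, which was not uploaded. The function name text_to_rtf, the font, the style definition, and the page dimensions are all illustrative assumptions; only the {\rtf1\ansi opening, the generator line, "\pard\plain \s7", the trailing "\line", and the closing "}" come from the description. The section-level page settings are left out, and the escaping of "\", "{" and "}" in the body is an added safeguard the comment does not mention.

```shell
# Sketch only: wrap plain text (e.g. from `pdftotext -layout`) in a minimal RTF shell.
# Font, style, and page values are illustrative assumptions, not the reporter's.

text_to_rtf() {
    # Header: mandatory opening, then font table, colour table, one style (\s7).
    printf '%s\n' '{\rtf1\ansi'
    printf '%s\n' '{\fonttbl{\f0\fmodern Courier New;}}'
    printf '%s\n' '{\colortbl;\red0\green0\blue0;}'
    printf '%s\n' '{\stylesheet{\s7\f0\fs16 Preformatted;}}'
    # Generator line quoted from the comment; single quotes keep "$" literal.
    printf '%s\n' '{\*\generator LibreOffice/7.1.5.2$Linux_X86_64 LibreOffice_project/10$Build-2}'
    # Assumed page setup: A4 landscape, 1 cm margins, all in twips.
    printf '%s\n' '\paperw16838\paperh11906\landscape\margl567\margr567\margt567\margb567'
    # Start of the body text, using the single style defined above.
    printf '%s\n' '\pard\plain \s7\f0\fs16'
    # Body from stdin: escape RTF metacharacters (added safeguard),
    # then append \line at each line-end.
    sed -e 's/[\\{}]/\\&/g' -e 's/$/\\line/'
    # RTF ending.
    printf '%s\n' '}'
}
```

Assuming $src holds the PDF path, a possible invocation is `pdftotext -layout "$src" - | text_to_rtf > report.rtf`, where the "-" argument makes pdftotext write to standard output. Runs of spaces in the input pass through unchanged, so the resulting file should be usable for checking how the RTF import treats them.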
