https://bugs.documentfoundation.org/show_bug.cgi?id=141187
Tex2002ans <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Tex2002ans+LibreOffice@gmai
                   |                            |l.com

--- Comment #6 from Tex2002ans <[email protected]> ---
Yes, this is still an issue in:

Version: 24.2.1.2 (X86_64) / LibreOffice Community
Build ID: db4def46b0453cc22e2d0305797cf981b68ef5ac
CPU threads: 8; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded

- - -

I tested using BogdanB's attachment 173984 in comment 3.

0. Open file.
1. Add a random word or two inside the text.
2. File > Export As > Export as EPUB.
3. Press OK.
4. Unzip the EPUB and look inside the HTML.
   - Or use an EPUB editing program like Sigil or Calibre.

You'll see extra <span>s in the EPUB:

> Like lightning he darted off to the left and disappeared between the two
> warehouses almost falling over the trash can lying în </span><span
> class="span2">my</span><span class="span2"> the middle of the sidewalk. He
> tried to nervously tap his way along in </span><span class="span2">my
> </span><span class="span2">the inky darkness and suddenly stiffened:

with this blank class in the EPUB's CSS:

> .span2 {
> }

= = = = = = = = = = = =

I believe part of the root cause is spurious:

- officeooo:rsid

inside the ODT file, which gets carried over into the HTML/EPUB export.

(I believe these RSIDs are "Random Session IDs", used to know when a certain text was edited for Comparison / Tracked Changes reasons.)

- - -

If you take the ODT and:

- File > Save As
- Dropdown for "Save as Type:"
   - Choose "Flat XML ODF Text Document"

You can open the FODT up in a text editor and see code along these lines:

> <text:p text:style-name="P1">He heard quiet steps behind him. [...] almost
> falling over the trash can lying în <text:span
> text:style-name="T1">my</text:span> the middle of the sidewalk.
> He tried to nervously tap his way along in <text:span text:style-name="T2">my
> </text:span>the inky darkness and suddenly stiffened: it was a dead-end, [...]

where extra <text:span>s appear around everything you insert/edit.

Higher in the FODT document, you can see what "T1" and "T2" were equivalent to:

> <style:style style:name="T1" style:family="text">
> <style:text-properties officeooo:rsid="00019890"/>
> </style:style>
> <style:style style:name="T2" style:family="text">
> <style:text-properties officeooo:rsid="0003570a"/>
> </style:style>

The only thing these <text:span>s were there for was:

- officeooo:rsid

they didn't supply any other info.

- - -

There was a similar issue with "single URLs" getting split into "multiple identical ones" here:

- Bug #112429 : "officeooo:rsid multiplies the links"
- Bug #148198 : "Editing single hyperlink breaks it into smaller ones"
   - Which got fixed in 7.5.0 and 7.4.0.2.

Mike Kaganski then came up with a patch to "merge identical hyperlinks of adjacent text ranges on ODF export":

- https://bugs.documentfoundation.org/show_bug.cgi?id=148198#c19

= = = = = = = = = = = =

So, on EPUB Export, I would probably do some logic along these lines:

Case 1: Before

- If ODT's "text:span text:style-name" only has "officeooo:rsid":
   - Do not export this <span> to EPUB at all.
- If 2 "text:spans" are right next to each other and the only difference is "officeooo:rsid".
   - Merge them together before HTML/EPUB export.
   - Similar to Bug 148198 above!

Case 2: After

You could have a pass that says:

- If the CSS class is empty/blank on the other end:
   - Delete that <span> out of the HTML/CSS/EPUB export completely.
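
To make the "Case 2" pass concrete, here is a rough Python sketch of the idea, run as a post-processing step on the exported files. It is not how LibreOffice's EPUB exporter works internally. The function names (empty_css_classes, strip_empty_spans, and the helpers) are made up for illustration, and it assumes the chapter is well-formed XHTML, that each class is defined by a single simple ".name { ... }" rule, and that a <span> is only safe to drop when "class" is its sole attribute.

```python
import re
import xml.etree.ElementTree as ET


def empty_css_classes(css):
    """Return class names whose rule body is blank, e.g. '.span2 { }'."""
    return set(re.findall(r"\.([\w-]+)\s*\{\s*\}", css))


def is_blank_span(elem, empty):
    """True for a <span> whose only attribute is a class with no CSS."""
    name = elem.tag.rsplit("}", 1)[-1]  # drop any '{namespace}' prefix
    if name != "span" or set(elem.attrib) != {"class"}:
        return False
    return set(elem.get("class").split()) <= empty


def _append_before(parent, index, text):
    """Attach text just before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def unwrap(parent, child):
    """Replace `child` with its own text and children, keeping its tail."""
    index = list(parent).index(child)
    parent.remove(child)
    _append_before(parent, index, child.text)
    for offset, kid in enumerate(list(child)):
        parent.insert(index + offset, kid)
    _append_before(parent, index + len(child), child.tail)


def strip_empty_spans(elem, empty):
    """Depth-first: unwrap every blank-class <span> below `elem`."""
    for child in list(elem):
        strip_empty_spans(child, empty)
        if is_blank_span(child, empty):
            unwrap(elem, child)


if __name__ == "__main__":
    css = ".span2 {\n}\n.bold { font-weight: bold; }"
    xhtml = ('<p>lying in <span class="span2">my</span>'
             '<span class="span2"> the middle</span> of the sidewalk.</p>')
    root = ET.fromstring(xhtml)
    strip_empty_spans(root, empty_css_classes(css))
    print(ET.tostring(root, encoding="unicode"))
    # prints: <p>lying in my the middle of the sidewalk.</p>
```

Unwrapping (instead of deleting) keeps the text the user typed and only drops the useless wrapper. The blank ".span2" rule itself would still need to be removed from the stylesheet in a separate step, and a real implementation would want a proper CSS parser rather than a regex.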
= = = = = = = = = = = =

Note 1: Calibre's EPUB Editor has a fantastic feature called:

- "Remove Unused CSS"
   - https://manual.calibre-ebook.com/edit.html#removing-unused-css-rules

which can do this type of thing in one button push:

- Tools > "Remove unused CSS"

It:

- Finds and purges all CSS and related HTML tags that are blank / not in use

making the leftover HTML *much* easier to work with.

- - -

Note 2: I've also written many topics about this type of HTML+CSS cleanup over the years. Most recently:

2023: "Nested span, clean"

- https://www.mobileread.com/forums/showthread.php?p=4342160#post4342160

2023: "removing excessive <class> and other formatting horrors on epub"

- https://www.mobileread.com/forums/showthread.php?p=4312194#post4312194

2022: "Convert text formating from CSS to HTML"

- https://www.mobileread.com/forums/showthread.php?p=4188132#post4188132

-- 
You are receiving this mail because:
You are the assignee for the bug.
