Re: Document encoding

Keisuke Miyako via 4D_Tech Mon, 13 Jan 2020 20:00:42 -0800

Alternative solutions for guessing the encoding of plain text:

https://opensource.google/projects/ced


also

https://github.com/miyako/4d-plugin-text-convert


$err:=CP Get good encodings ($euc;$codepages)


$err:=ICU Get good encodings ($euc;$encodings;$languages;$confidences)
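
To illustrate why this is only ever a guess, here is a minimal Python sketch (not 4D, and not how the plugin, ICU or CED work internally). It tries to decode the bytes with each candidate encoding and keeps the ones that do not fail. The function name good_encodings and the candidate list are made up for this illustration.

```python
from typing import List, Sequence

# Hypothetical candidate list for this sketch; all three are standard Python codecs.
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")


def good_encodings(data: bytes, candidates: Sequence[str] = CANDIDATES) -> List[str]:
    """Return the candidates under which `data` decodes without an error.

    Trial decoding only rules encodings out. It cannot tell which of the
    survivors the author of the text actually used.
    """
    good = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        good.append(name)
    return good


if __name__ == "__main__":
    sample = "\u65e5\u672c\u8a9e".encode("euc_jp")  # "日本語" stored as EUC-JP
    print(good_encodings(sample))
    print(good_encodings(b"hello"))
```

Plain ASCII such as b"hello" passes under all three candidates, which is one way a guess can go wrong. A real detector has to go further and rank the survivors, which is presumably what the confidence values returned above are for.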

But I agree with Koen: requesting UTF-8 “encoding” for a PDF seems like a 
misunderstanding on the end user's part.

I know Adobe Acrobat phrases it that way:

https://helpx.adobe.com/acrobat/using/file-format-options-pdf-export.html

But in reality, PDF has an embedded font mapping system,
which makes “text encoding” largely irrelevant for rendering.
You might have seen PDFs that display fine but produce garbage when you copy 
and paste from them or search them.
I think that is the scenario the user wants to avoid.

On 2020/01/14 at 2:35, Koen Van Hooreweghe via 4D_Tech 
<[email protected]> wrote:
FWIW, BBEdit also guesses what the text file encoding could be. It does a good 
job, but it can be fooled.



**********************************************************************
4D Internet Users Group (4D iNUG)
Archive:  http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:[email protected]
**********************************************************************
