I think you should first consider what encoding means. In general, "file encoding" refers to the character encoding of text files. If you run the code below on PDF files, you will reliably destroy them, because PDF files are binary files. The same goes for nearly all picture formats, and for Word and Excel files. Binary files can contain sections of text, though, and that text has to be encoded in a particular way. To find out which sections are affected, you have to consult the documentation of the file format. Usually, however, the tools that generate such files offer an option that lets the user choose this setting.
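To make the point about binary files concrete, here is a minimal sketch in Python (not 4D, purely as an illustration). The sample bytes are made up to resemble the start of a PDF, and the function name `reencode_as_utf8` is hypothetical. It performs the same two steps as the code in the question: interpret the bytes as Mac Roman text, then write that text out as UTF-8.

```python
def reencode_as_utf8(data: bytes, source_encoding: str = "mac_roman") -> bytes:
    """Decode raw bytes as text in source_encoding, then encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Made-up sample: a PDF-like header followed by a few arbitrary binary bytes.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n" + bytes([0x00, 0x9C, 0xFF, 0x80])

reencoded = reencode_as_utf8(sample)

# Every byte >= 0x80 turns into a multi-byte UTF-8 sequence, so the
# output is longer than the input and no longer the same file.
print(len(sample), len(reencoded))
print(reencoded == sample)
```

Mac Roman assigns a character to every byte value, so the decode step never fails, which is why the conversion appears to work. The damage happens silently: offsets and binary streams inside the file no longer match what a PDF reader expects.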
Regarding the second part of the question, how to detect the current encoding: this can be a bit cumbersome, since plain text files don't carry a marker that says which encoding is used. Sometimes a Unicode-encoded file has a so-called BOM (byte order mark) in its first bytes, but you can't rely on it. 4D can help a bit: if you try to read a text file with the wrong encoding (and it contains bytes that can't be decoded), you will get an empty result. But it is also possible that the content decodes without error and is still wrong, so a non-empty result does not prove that you picked the right encoding.

Regards
Lutz

-----Original Message-----
From: 4D_Tech, on behalf of Two Way Communications via 4D_Tech
Subject: Document encoding

Hi All,

An important customer of mine has requested that all documents sent to him are UTF-8 encoded. This concerns PDF files, text files, Word, Excel, and picture files.

I did some tests, but can’t figure out how to do that. If, e.g., I look at a PDF file in BBEdit, it says ‘Mac Roman’. Then I tried to open that file in 4D (v17, UTF-8) with DOCUMENT TO BLOB, then:

    DOCUMENT TO BLOB(document;$blob)
    $DocBlobtxt:=Convert to text($blob;2027)  // 2027 = MacOS Roman
    TEXT TO BLOB($DocBlobtxt;$docblobUTF8;UTF8 text without length)

It seems to do that correctly, but then this file cannot be opened in Preview (it opens, but the content is blank).

The other thing is that I need to know the encoding of the file before using ‘Convert to text’. That is not always possible.

Is this request feasible to start with? Any ideas how to accomplish that?

Regards,
Rudy Mortier
Two Way Communications bvba
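As an illustration of the detection advice in the reply (look for a BOM, then try a decode and see whether it fails), here is a minimal sketch in Python rather than 4D. The function name `guess_encoding` and the returned labels are assumptions made for this example. It can only confirm a BOM or valid UTF-8; it cannot tell one legacy single-byte encoding from another.

```python
import codecs

# Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes):
    """Return a Python codec name if a BOM or valid UTF-8 is found, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # Not UTF-8; the real encoding remains unknown.
    return "utf-8"


print(guess_encoding(codecs.BOM_UTF8 + b"abc"))
print(guess_encoding("Grüße".encode("utf-8")))
print(guess_encoding("Grüße".encode("mac_roman")))
```

A `None` result only means "not UTF-8 and no BOM". A "utf-8" result for pure ASCII content is also weak evidence, since the same bytes are valid in most other encodings, which is the caveat Lutz raises about content that decodes without being right.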

