>>>>> "SW" == Steve White <[email protected]> writes:
SW> So far, the results are erratic, but

SW> * nothing created on Linux copies Hindi text correctly from PDF files.

The proper solution for text extraction from PDF files, especially for
complex and/or r2l scripts, is for the PDF creator to include ActualText
objects in the PDF.  Nothing else can work for all scripts.

Cf §10.8.3 Replacement Text in PDFReference17.pdf; the same is §14.9.4
in PDF32000_2008.pdf.

-JimC
-- 
James Cloos <[email protected]>         OpenPGP: 1024D/ED7DAEA6
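
For illustration only, here is a rough sketch of what an ActualText
entry might look like when a PDF creator writes it into a content
stream as a /Span marked-content sequence, as described in the
Replacement Text section cited above. The function name
actual_text_span and the glyph operators passed to it are hypothetical
placeholders, not taken from any particular PDF library; a real
producer would emit its own glyph IDs for the shaped text.

```python
def actual_text_span(text: str, glyph_ops: str) -> str:
    """Wrap glyph-drawing operators in a /Span sequence carrying /ActualText.

    `glyph_ops` is a hypothetical placeholder for whatever text-showing
    operators the producer emits for the shaped glyphs.
    """
    # Encode the replacement text as UTF-16BE with a leading U+FEFF
    # byte order mark, written as a PDF hex string.
    hex_text = ("\ufeff" + text).encode("utf-16-be").hex().upper()
    # BDC opens the marked-content sequence, EMC closes it.
    return f"/Span <</ActualText <{hex_text}>>> BDC\n{glyph_ops}\nEMC"


if __name__ == "__main__":
    # The glyph IDs here are made up; only the ActualText part matters.
    print(actual_text_span("हि", "<00A5004B> Tj"))
```

An extractor that honours ActualText can then return the original
Unicode sequence for the span, regardless of how the glyphs were
reordered or substituted during shaping.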
