>>>>> "SW" == Steve White <stevan.wh...@gmail.com> writes:

SW> So far, the results are erratic, but
SW> * nothing created on Linux copies Hindi text correctly from PDF files.

The proper solution for text extraction from PDF files, especially for
complex and/or right-to-left scripts, is for the PDF creator to include
ActualText entries in the PDF.

Nothing else can work for all scripts.

Cf. §10.8.3 "Replacement Text" in PDFReference17.pdf; the same material
is §14.9.4 in PDF32000_2008.pdf.
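
For illustration, here is a rough sketch of what a PDF creator would
emit: a /Span marked-content sequence whose property list carries
/ActualText as a UTF-16BE hex string with a leading byte order mark.
The helper name actual_text_span and the glyph operators passed to it
are made up for the example; a real producer would also handle font
selection and positioning, and might use a different tag than /Span.

```python
def actual_text_span(text: str, glyph_ops: str) -> str:
    """Wrap content-stream operators in a marked-content sequence
    that carries the replacement text in /ActualText.

    Hypothetical helper for illustration only, not any library's API.
    """
    # PDF text strings may be UTF-16BE, introduced by the BOM U+FEFF.
    encoded = ("\ufeff" + text).encode("utf-16-be").hex().upper()
    # BDC opens the marked-content sequence, EMC closes it.
    return f"/Span << /ActualText <{encoded}> >> BDC\n{glyph_ops}\nEMC"


if __name__ == "__main__":
    # The glyph string here is a placeholder; in a real file it would be
    # whatever glyph IDs the shaper produced for the Hindi word.
    print(actual_text_span("हिन्दी", "<00120034> Tj"))
```

A viewer that honours ActualText can then copy the original character
sequence no matter how the glyphs were reordered or substituted during
shaping.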

-JimC
-- 
James Cloos <cl...@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
