Sirs,

I had already considered this graphical approach to reconstructing the
words. I dropped it because I'm a bit sceptical about the reliability of
such a method; I can't help thinking that it will never be 100% reliable.
I do understand why CAD software would produce such output, though
(thank you for "boustrophedonic", a word I didn't know; it explains the
result well).

Supposing that the characters appear in a totally arbitrary order,
detecting that they're on the same line is more or less a piece of cake
(unless I need to introduce a tolerance, which makes things more
difficult), but grouping the characters into words according to their X
position is not an easy task at all.
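
To make the grouping step concrete, here is a minimal sketch in Python (it
does not use the library's API). Glyph, group_into_lines, split_into_words
and both threshold values are hypothetical names and numbers I am assuming
for illustration. It groups glyphs into lines by baseline with a tolerance,
then sorts each line by X and starts a new word whenever the horizontal gap
is large compared with the previous glyph's width.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical stand-in for one positioned character."""
    char: str
    x: float      # left edge of the glyph
    y: float      # baseline (PDF coordinates: larger y is higher on the page)
    width: float  # advance width of the glyph


def group_into_lines(glyphs, y_tolerance=1.0):
    """Cluster glyphs whose baselines differ by at most y_tolerance."""
    lines = []
    # Highest baseline first, so lines come out in top-to-bottom order.
    for g in sorted(glyphs, key=lambda g: g.y, reverse=True):
        # Compare with the last glyph placed on the current line. Note that
        # a chain of small drifts can merge lines that are close together.
        if lines and abs(g.y - lines[-1][-1].y) <= y_tolerance:
            lines[-1].append(g)
        else:
            lines.append([g])
    return lines


def split_into_words(line, gap_factor=0.3):
    """Sort one line by X and cut a new word at every large gap."""
    words, current, prev = [], [], None
    for g in sorted(line, key=lambda g: g.x):
        # Gap between this glyph and the right edge of the previous one.
        if prev is not None and g.x - (prev.x + prev.width) > gap_factor * prev.width:
            words.append("".join(current))
            current = []
        current.append(g.char)
        prev = g
    if current:
        words.append("".join(current))
    return words


def reconstruct(glyphs, y_tolerance=1.0, gap_factor=0.3):
    """Return a list of lines, each line being a list of words."""
    return [split_into_words(line, gap_factor)
            for line in group_into_lines(glyphs, y_tolerance)]
```

Both y_tolerance and gap_factor are guesses that depend on the font and the
producer of the file, which is exactly why I doubt such a method can be
fully reliable.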

But this is not the issue; my real concern is that this method may not be
100% reliable. What do you think?

As for the technical part (overriding processText), that's fine; thanks
for the advice.

Best regards

Julien



2014-03-06 18:39 GMT+01:00 HQS <[email protected]>:

> Hello all,
>
> 1.
> Have you ever seen PDFs with this kind of (pseudo-)structure:
>
> BT
> <character>
> Tj
> ET
>
> ?
>
> In other words, the strings are split into characters and there is one
> text block per character?
> It seems ill-formed, doesn't it?
>
> 2. A reminder from my first mail: which PDF versions does the library
> comply with? 1.3 to 1.7?
>
>
> Thanks and regards
>
> Julien
>
>
