Hi all,
I'm reading the poppler code and tinkering with it here and there
because I'm going to implement the ATK interface for Evince, and I need
to know how to get the text of a PDF file from glib.
I want to get the text in the order you would read it. I saw that
pdftotext gets the text in the right order with the "-raw" option. I
looked at the code and saw that it uses TextOutputDev with
rawOrder = true.
It's easy to dump the text to a file using the first argument of the
TextOutputDev constructor, but I want to get the text as a char *.
However, when rawOrder is set, TextOutputDev's getText method can't be
used; it always returns an empty GooString:
...
  s = new GooString();

  if (rawOrder) {
    return s;
  }
...
So here is my question: is that a bug or an unimplemented feature, or
is it like that for a reason?
If you think it's a bug, I could file a bug report and upload a patch
to "solve" it using the TextWordList.
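
To make the idea a bit more concrete, here is a rough, self-contained sketch of what I have in mind: walk the words in the order they were collected and join them into one buffer that can be handed out as a char *. Note that RawWord, joinRawWords and rawWordsToCString are hypothetical names made up for this example; they are not poppler's TextWord/TextWordList API, and a real patch would have to read the text and spacing from poppler's own classes.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one extracted word (NOT poppler's TextWord).
struct RawWord {
    std::string text;  // UTF-8 text of the word
    bool spaceAfter;   // a space follows this word on the same line
    bool endOfLine;    // this word is the last one on its line
};

// Join the words in the order given, i.e. the order in which they were
// collected, without any layout-based reordering.
std::string joinRawWords(const std::vector<RawWord> &words) {
    std::string out;
    for (const RawWord &w : words) {
        out += w.text;
        if (w.endOfLine) {
            out += '\n';
        } else if (w.spaceAfter) {
            out += ' ';
        }
    }
    return out;
}

// Copy the joined text into a malloc'd C string.
// The caller owns the result and must release it with free().
char *rawWordsToCString(const std::vector<RawWord> &words) {
    const std::string joined = joinRawWords(words);
    char *buf = static_cast<char *>(std::malloc(joined.size() + 1));
    if (buf != nullptr) {
        std::memcpy(buf, joined.c_str(), joined.size() + 1);
    }
    return buf;
}
```

The result is allocated with malloc so that a C caller can release it with free(); whether that matches the allocation conventions on the glib side is an assumption I would still need to check.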