I recently came across this paper, which states that its algorithm can be applied to PDF. I thought it might be worth a look to see whether any of the work could be used to improve poppler's PDF-to-text output.
http://www.cs.waikato.ac.nz/~ihw/papers/98NM-Reed-IHW-Extract-Text.pdf
