On Saturday 14 February 2015 02:20 PM, Vaidheeswaran wrote:
> Specifically, in the pdftotext case above, I believe the best action would be to M-x flush-lines that match ^L so that page headers are stripped.
I was writing from memory. I should have said this instead: the best action would be to flush the page headers 'surrounding' each ^L and to 'splice' the paragraph lines that are split apart at the page breaks. Essentially, for a correct repair, human intervention is the rule rather than the exception.
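To make the idea concrete, here is a rough sketch in Python of what such a repair might look like. It is only an illustration under assumptions of mine: the function name repair_pagebreaks and its header_lines/footer_lines parameters are made up, each page after the first is assumed to begin with a fixed number of header lines, and a paragraph is assumed to continue across the page break whenever the last line before the break does not end in sentence punctuation. That last heuristic will guess wrong on real documents, which is exactly why the result still needs a human to look it over.

```python
import re

# Heuristic: a line "ends a sentence" if it finishes with . ! ? or :
# optionally followed by closing quotes or brackets.
SENTENCE_END = re.compile(r"[.!?:][\"')\]]*$")


def repair_pagebreaks(text, header_lines=1, footer_lines=0):
    """Drop page headers/footers around form feeds and splice split paragraphs.

    Hypothetical helper: header_lines and footer_lines are assumptions about
    the layout, not something pdftotext reports.
    """
    pages = text.split("\f")  # ^L is the form feed character
    out = []
    for i, page in enumerate(pages):
        body = page.strip("\n")
        lines = body.split("\n") if body else []
        if i > 0:
            # Lines right after a ^L are assumed to be the page header.
            lines = lines[header_lines:]
        if footer_lines and i < len(pages) - 1:
            # Lines right before a ^L are assumed to be the page footer.
            lines = lines[:max(len(lines) - footer_lines, 0)]
        # Drop blank lines left at the page edges.
        while lines and not lines[0].strip():
            lines.pop(0)
        while lines and not lines[-1].strip():
            lines.pop()
        if not lines:
            continue
        if out and SENTENCE_END.search(out[-1].rstrip()):
            # Previous page ended a sentence: treat as a paragraph break.
            out.append("")
        # Otherwise the paragraph continues, so its lines stay contiguous.
        out.extend(lines)
    return "\n".join(out) + "\n"


sample = (
    "First paragraph starts here and\n"
    "continues onto the\n"
    "\fReport Title  2\n\n"
    "next page. Done.\n"
    "\fReport Title  3\n\n"
    "New paragraph.\n\f"
)
print(repair_pagebreaks(sample), end="")
```

In the sample, the first page break falls in the middle of a sentence, so the lines on either side are joined into one paragraph, while the second break follows a full stop and is kept as a paragraph boundary.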