David #19: you say "it perhaps recognizes column stuff from the display layout instead of the internal representation."
In PDF, the internal representation *is* just the display layout. Internally, poppler tries to divide the text into blocks (roughly paragraphs), which are then grouped into columns based on spacing and, independently, into 'flows' (roughly, sequences of similar blocks in reading order), all based on a bunch of heuristics. This is already tricky, and it is made more complicated by text rotation and by different writing systems (vertical, right-to-left, etc.).

Acrobat and Apple's Preview use different heuristics, so they group text differently and make a mess of things on different documents, but they still make a mess of things.

I'm just explaining what's going on here; this isn't to say that text selection can't be improved. I'm slowly putting together a patch based on the reading order sort described in http://pubs.iupr.org/#2003-breuel-sdiut, which seems to be fixing some of the problems with the attachment in #7. However, as I said to Andres, I have no idea when or if my patches would be accepted.

--
Evince doesn't handle columns properly
https://bugs.launchpad.net/bugs/33288
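
For illustration, here is a rough sketch of a reading-order sort of the kind the paper referenced above describes, as far as it can be summarized here: derive "comes before" relations between text lines from their bounding boxes, then topologically sort them. This is not poppler's code and not the patch itself. The Box type, the function names, and the (top, left) tie-breaking are all made up for the example, and real layouts need far more care (rotation, vertical and right-to-left text).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box of one text line; y grows downward."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def x_overlap(a, b):
    """True if the horizontal extents of a and b overlap."""
    return a.x0 < b.x1 and b.x0 < a.x1


def comes_before(a, b, boxes):
    # Rule 1: a and b share a column (x-ranges overlap) and a is above b.
    if x_overlap(a, b):
        return a.y0 < b.y0
    # Rule 2: a lies entirely to the left of b, and no box that spans
    # both of them horizontally sits vertically between them.
    if a.x1 <= b.x0:
        lo, hi = sorted((a.y0, b.y0))
        return not any(
            lo < c.y0 < hi and x_overlap(c, a) and x_overlap(c, b)
            for c in boxes
        )
    return False


def reading_order(boxes):
    """Topologically sort boxes by the pairwise comes_before relation."""
    boxes = list(boxes)
    n = len(boxes)
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for i in range(n):
        for j in range(n):
            if i != j and comes_before(boxes[i], boxes[j], boxes):
                succ[i].append(j)
                indeg[j] += 1
    remaining = set(range(n))
    order = []
    while remaining:
        ready = [i for i in remaining if indeg[i] == 0]
        # If the heuristics produced a cycle, fall back to all remaining
        # boxes so that the loop always makes progress.
        pool = ready or list(remaining)
        nxt = min(pool, key=lambda i: (boxes[i].y0, boxes[i].x0))
        remaining.remove(nxt)
        order.append(boxes[nxt])
        for j in succ[nxt]:
            indeg[j] -= 1
    return order


if __name__ == "__main__":
    # A full-width title above two columns, given in scrambled order.
    page = [
        Box("R2", 55, 40, 100, 50),
        Box("L2", 0, 40, 45, 50),
        Box("title", 0, 0, 100, 10),
        Box("R1", 55, 20, 100, 30),
        Box("L1", 0, 20, 45, 30),
    ]
    print([b.text for b in reading_order(page)])
    # prints ['title', 'L1', 'L2', 'R1', 'R2']
```

A plain top-to-bottom, left-to-right sort would interleave the two columns here (L1, R1, L2, R2). The second rule is what lets the left column finish before the right one starts, unless a box spanning both columns separates them.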