Skim copies the text using Apple’s PDFKit, so we do not have any control over this.
Christiaan

> On 9 Jun 2022, at 17:09, Alan Harper (lists) <l...@alanharper.com> wrote:
>
> I have five apps that I regularly use to read and manipulate pdfs. They are:
> Skim, PdfPen, PDF Expert, Acrobat Reader and Preview. Of these, I like the
> user interface of Skim the best, but the app that most reliably gives me
> usable text when I copy from the OCR layer is PdfPen. I don’t know if the
> copying is controlled by the MacOS Toolbox or is implemented separately by
> each app.
>
> I wish that Skim could copy text as reliably as PdfPen. It is nice that Skim
> normally wraps line-endings (though hyphens, not shown here, befuddle it).
> But I have macros that I use to remove the line-endings, making the result
> from PdfPen most easily usable.
>
> For each pdf reader I get the following results when copying from a scanned
> dissertation and pasting into a text editor (TextMate). Note that Skim and
> Preview give identical results, which I suspect means that they are using the
> same code. I OCR’d this document using PdfPen. I tried to enclose the page
> from the dissertation as an attachment, but the message was rejected by the
> list serve.
>
> Copied using PdfPen:
> The Cañon Santo Tomás contains the lower 16 miles of the
> arroyo, traversing the rounded coastal hills of the flanking Rosario
> formation. Sills of the outcropping Alisitos formation constrict the
> canyon at several spots, forcing ground water to surface for segments
> of a mile or more, before the water percolates into stretches of
> uncompacted alluvium that mark the arroyo channel. Finally, a fresh
> water marsh terminates the lower 3,000 meters of the arroyo, caused
>
> Copied using Skim:
> The Cañon Santo Tomás contains the lower 16 miles o f the arroyo, traversing
> the rounded coastal hills of the flanking Rosario formation. S i l l s of the
> outcropping Alisitos formation constrict the canyon at several spots, forcing
> ground water to surface for segments of a mile or more, before the water
> percolates into stretches of uncompacted alluvium that mark the arroyo
> channel. F i n a l l y , a fresh
> water marsh terminates the lower 3,000 meters of the arroyo, caused
>
> Copied using Acrobat Reader:
> The Cañon S a n t o Tomás c o n t a i n s t h e l o w e r 1 6 m i l e s o f t
> h e
> arroyo, t r a v e r s i n g t h e rounded c o a s t a l h i l l s o f t h e f
> l a n k i n g R o s a r i o
> formation. S i l l s o f t h e o u t c r o p p i n g A l i s i t o s f o r m
> a t i o n c o n s t r i c t t h e
> canyon a t s e v e r a l s p o t s , f o r c i n g ground w a t e r t o s u r
> f a c e f o r segments
> of a m i l e o r m o r e , b e f o r e t h e w a t e r p e r c o l a t e s i
> n t o s t r e t c h e s o f
> uncompacted a l l u v i u m t h a t mark t h e a r r o y o c h a n n e l . F
> i n a l l y , a f r e s h
> water marsh t e r m i n a t e s t h e l o w e r 3 , 0 0 0 m e t e r s o f t h
> e a r r o y o , c a u s e d
>
> Copied using PDF Expert:
> The Cañon S a n t o Tomás c o n t a i n s t h e l o w e r 1 6 m i l e s o f t
> h e arroyo, t r a v e r s i n g t h e r o u n d e d c o a s t a l h i l l s o
> f t h e f l a n k i n g R o s a r i o formation. S i l l s o f t h e o u t c
> r o p p i n g A l i s i t o s f o r m a t i o n c o n s t r i c t t h e
> canyon a t s e v e r a l s p o t s , f o r c i n g g r o u n d w a t e r t o
> s u r f a c e f o r segments of a m i l e o r m o r e , b e f o r e t h e w a
> t e r p e r c o l a t e s i n t o s t r e t c h e s o f uncompacted a l l u v
> i u m t h a t m a r k t h e a r r o y o c h a n n e l . F i n a l l y , a f r
> e s h water marsh t e r m i n a t e s t h e l o w e r 3 , 0 0 0 m e t e r s o
> f t h e a r r o y o , c a u s e d
>
> Copied using Preview:
> The Cañon Santo Tomás contains the lower 16 miles o f the arroyo, traversing
> the rounded coastal hills of the flanking Rosario formation. S i l l s of the
> outcropping Alisitos formation constrict the canyon at several spots, forcing
> ground water to surface for segments of a mile or more, before the water
> percolates into stretches of uncompacted alluvium that mark the arroyo
> channel. F i n a l l y , a fresh
> water marsh terminates the lower 3,000 meters of the arroyo, caused
>
> Alan Harper
> a...@alanharper.com <mailto:a...@alanharper.com> ← for people
> l...@alanharper.com <mailto:l...@alanharper.com> ← for machines
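
For readers who want to do the line-ending cleanup mentioned in the quoted message outside the PDF app, here is a minimal sketch in Python. It is not what Skim, PDFKit, or Alan's macros actually do: the function name `unwrap` and the hyphen rule are assumptions made for illustration. It joins hard-wrapped lines into paragraphs and, as a rough heuristic, rejoins a word split by a hyphen at the end of a line.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Paragraphs are assumed to be separated by blank lines.
    (Hypothetical helper, for illustration only.)
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    result = []
    for para in paragraphs:
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        joined = ""
        for line in lines:
            ends_with_split_word = (
                len(joined) > 1
                and joined.endswith("-")
                and joined[-2].isalpha()
                and line[:1].islower()
            )
            if ends_with_split_word:
                # Heuristic: "con-" followed by "strict" becomes "constrict".
                # This can wrongly merge genuinely hyphenated compounds.
                joined = joined[:-1] + line
            elif joined:
                joined += " " + line
            else:
                joined = line
        result.append(joined)
    return "\n\n".join(result)
```

Pasting the PdfPen output through `unwrap` would give one line per paragraph. The spaced-out letters in the Acrobat Reader and PDF Expert output are a different problem that this sketch does not attempt to fix, since the word boundaries are ambiguous once every letter is separated.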
_______________________________________________
Skim-app-users mailing list
Skim-app-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/skim-app-users