Sorry, that comment was added to the wrong thread. It was about the text of 
(highlight) notes.
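
For anyone who wants to clean up note text that has already been exported, here is a minimal Python sketch of the two ideas discussed in this thread. It is not Skim's code; the function names and the constant are made up for illustration. The first function drops a soft hyphen (U+00AD) together with the space after it, as Mark suggests below. The second joins laid-out lines and merges a word that was split by a hyphen at a line end. Note that the second one also treats a regular hyphen (U+002D) at a line end as a break, so it will wrongly merge a real compound word that happens to wrap there.

```python
from typing import List

SOFT_HYPHEN = "\u00ad"  # U+00AD, often left at line breaks in OCR'd PDFs


def strip_soft_hyphen_breaks(text: str) -> str:
    """Remove every soft hyphen that is immediately followed by a space."""
    return text.replace(SOFT_HYPHEN + " ", "")


def join_hyphenated_lines(lines: List[str]) -> str:
    """Join lines with spaces, merging words hyphenated at a line end."""
    pieces = []
    for i, line in enumerate(lines):
        line = line.strip()
        is_last = i == len(lines) - 1
        if not is_last and line.endswith((SOFT_HYPHEN, "-")):
            # Drop the trailing hyphen and glue directly to the next line.
            pieces.append(line[:-1])
        else:
            pieces.append(line if is_last else line + " ")
    return "".join(pieces)


if __name__ == "__main__":
    print(strip_soft_hyphen_breaks("a hy\u00ad phenated word"))
    print(join_hyphenated_lines(["a hy-", "phenated word"]))
```

Regular hyphens inside a line are left alone by both functions, so a compound such as "well-known" survives as long as it does not wrap.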

Christiaan

> On 19 Mar 2022, at 03:29, Mark Roberts <mroberts1...@gmail.com> wrote:
> 
> Just chiming in here: is this something that could be fixed for notes export, 
> too?
> 
> Using Skim 1.6.9, if I create a note for a passage in the PDF that includes a 
> hyphen at the line break, it still includes a soft hyphen (U+00AD) and a 
> space, and I have to trim each of these by hand. FWIW, the PDF was OCR'd from 
> scanned pages using the latest version of Adobe Acrobat.
> 
> I checked some other PDF readers and found that Preview does the same thing 
> (because it also uses PDFKit?), while Adobe Acrobat Pro, Foxit Reader, and 
> PDF Expert all trim out the soft hyphen, as well as the space.
> 
> It seems that if there are any soft hyphens (U+00AD) followed by a space in a 
> string copied from a PDF, these two characters can safely be trimmed out. 
> Regular hyphens in the PDFs I've checked are represented by U+002D, so there 
> should be no danger of losing them if Skim were to perform this operation on 
> strings.
> 
> I did a web search and found a fair amount of discussion on the interwebs 
> about OCR'd text and soft hyphens, with many people asking how they can fix 
> this problem with various apps.
> 
> Any thoughts about this?
> 
> Thanks again,
> 
> M.
> 
> On Sat, Mar 19, 2022 at 7:53 AM Christiaan Hofman <cmhof...@gmail.com> wrote:
> 
> 
>> On 18 Mar 2022, at 23:09, Christiaan Hofman <cmhof...@gmail.com> wrote:
>> 
>> 
>> 
>>> On 18 Mar 2022, at 22:45, Jan David Hauck via Skim-app-users 
>>> <skim-app-users@lists.sourceforge.net> wrote:
>>> 
>>> Hi all, 
>>> Is there a way to do a search in a PDF with an AND operator? 
>>> In the search field, when checking “whole words only” it returns all pages 
>>> with any of the words in the search field. 
>>> With “whole words only” unchecked, it tries to find the exact phrase in the 
>>> search field. 
>>> I’m trying to find a way to search for pages that contain word A and word B 
>>>  (not either word A or word B). 
>>> Help much appreciated.
>>> Jan
>> 
>> No, that is not supported. You should realize that it searches for strings, 
>> not for pages.
>> 
>> Christiaan
>> 
> 
> 
> Looking at our code, I realized that it still attempts to combine hyphenated 
> words. It just fails, because PDFKit seems to insert spaces between the 
> lines, rather than newlines, so we did not see the hyphens at the end of the 
> lines. I have replaced this by looking at the lines in the laid-out text, 
> rather than just the strings, and checking for hyphens at the end of each 
> line. That seems to be working well.
> 
> Christiaan
> 
> _______________________________________________
> Skim-app-users mailing list
> Skim-app-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/skim-app-users

