On Thu, May 21, 2015 at 7:40 PM, Jan David Hauck <jan.d.ha...@ucla.edu>
wrote:
> > Any ideas on what makes a particular PDF more likely to display such
> > behavior?
>
> I've come across the problem with scanned PDFs as well as with LaTeX
> generated PDFs.
> When I modified them in Preview (e.g., deleting a page, rearranging
> pages, extracting pages, etc.), the character mapping would get lost,
> and there is no way to recover from that (other than regenerating the
> PDF). That's why I only use Acrobat if I have to edit PDFs (although it
> sucks, because all the drag and drop functions in Preview are much easier).
>
For me it happens when a PDF combines non-standard embedded fonts with
non-English characters, such as umlauts.
I can only speculate, but I wonder whether you, Jan, see it because you use
something akin to Adobe's ClearScan (
http://blogs.adobe.com/acrolaw/2009/05/better_pdf_ocr_clearscan_is_smal/)
when creating OCR'd PDFs, which would result in non-standard fonts being
embedded. I think some scanners might even do that automatically (wasn't
there one that would randomly confuse characters?). I just tried the
ClearScan OCR setting in Adobe Acrobat for the first time, on a text in
French (lots of accents, which PDFKit can't handle at all; try copying and
pasting them!), and sure enough I got my first ever scanned PDF that PDFKit
corrupted. With all my other scans, I have always left the picture layer
intact and just let Acrobat or OCRKit add a text layer. PDFKit did not
corrupt those PDFs.
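
If anyone wants a quick way to judge whether text copied out of a PDF has
lost its character mapping, here is a rough sketch in Python. It is only my
own heuristic, not anything Skim or PDFKit provides: the function names and
the 0.2 threshold are made up for illustration. It treats a paste that is
empty or whitespace-only (like Josh's "tab-like space") as damaged, and
otherwise counts private-use, unassigned, control, and replacement
characters.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Return the share of characters that hint at a broken character mapping."""
    if not text:
        return 1.0  # nothing extractable at all counts as fully suspicious
    bad = 0
    for ch in text:
        category = unicodedata.category(ch)
        is_control = category == "Cc" and ch not in "\n\r\t"
        # Co = private use, Cn = unassigned, U+FFFD = replacement character
        if ch == "\ufffd" or category in ("Co", "Cn") or is_control:
            bad += 1
    return bad / len(text)


def looks_garbled(text: str, threshold: float = 0.2) -> bool:
    """Guess whether pasted text is damaged (threshold is an arbitrary assumption)."""
    stripped = text.strip()
    # A paste that is empty or only whitespace is treated as damaged.
    return not stripped or suspicious_ratio(stripped) > threshold
```

To use it, paste a paragraph copied from the PDF into a string and call
looks_garbled() on it. Accented text that survived intact should pass,
since ordinary letters with accents are not counted as suspicious.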
JJ
> On Thu, May 21, 2015 at 6:43 AM, Josh Carney <joshcar...@hotmail.com>
> wrote:
>
>> Thanks for your reply, Jan.
>>
>> When I wrote "unsearchable" I was simply referring to the interface
>> practice that I use most frequently in dealing with such PDFs. That is,
>> using the command+f option to search for a particular sequence of
>> characters, or in the case of DevonThink, searching the text of the
>> document alongside that of others in aggregate form.
>>
>> As you allude to, the problem is actually greater than this. Though I can
>> select and even highlight text in this file for annotation, attempts to
>> copy it end up in some sort of strange tab-like space being entered in the
>> paste area. This can only be deleted by removing textual elements on either
>> side of it. (That is, it can't simply be backed-over and deleted--in such
>> attempts, the cursor just won't move.) No characters display in the copied
>> "text," just a space that might be about 20 characters (an estimate) long.
>>
>> If this is a frequent problem with the PDFKit I guess I should count
>> myself lucky. Any ideas on what makes a particular PDF more likely to
>> display such behavior? (This particular error is very repeatable with this
>> particular file.)
>>
>> Best wishes,
>>
>> Josh
>>
>> > From: skim-app-users-requ...@lists.sourceforge.net
>> > Subject: Skim-app-users Digest, Vol 96, Issue 15
>> > To: skim-app-users@lists.sourceforge.net
>> > Date: Thu, 21 May 2015 12:02:11 +0000
>> >
>> > Today's Topics:
>> >
>> > 1. Re: search disables after export save (Jan David Hauck)
>> >
>> >
>> > ----------------------------------------------------------------------
>> >
>> > Message: 1
>> > Date: Wed, 20 May 2015 10:28:10 -0700
>> > From: Jan David Hauck <jan.d.ha...@ucla.edu>
>> > Subject: Re: [Skim-app-users] search disables after export save
>> > To: For general discussion about using Skim
>> > <skim-app-users@lists.sourceforge.net>
>> > Message-ID:
>> > <CAE4y4F=cjewz4uoh-w-ldtce9x73f0od55ivgtru4k9ojrt...@mail.gmail.com>
>> > Content-Type: text/plain; charset="iso-8859-1"
>> >
>> > What do you mean by "unsearchable"?
>> > Can you select text from the file? Can you copy?
>> > If you copy text from the file into a text document, does it display
>> > correctly or are the characters scrambled?
>> >
>> > I've come across the latter when modifying scanned and OCR'ed PDFs in
>> > Preview. For some reason PDFKit frequently screws up the character
>> > mapping of scanned PDFs. This has been a problem for probably a decade
>> > now and Apple has been unable to fix it.
>> >
>> > Jan
>> >
>> >
>> >
>> > On Wed, May 20, 2015 at 3:40 AM, Josh Carney <joshcar...@hotmail.com> wrote:
>> >
>> > > Thanks for your reply, Christiaan. In answer to your question, yes,
>> > > that's just what I meant. In terms of the PDFKit it looks like you're
>> > > right: I replicated the issue with Preview and I don't get the issue
>> > > with Acrobat Pro. I'm very glad to hear this is an exception, though
>> > > I'm still curious to know if others have experienced a similar
>> > > problem. I'm wondering if there's a way to anticipate what PDFs might
>> > > be prone to this sort of behavior.
>> > >
>> > > Josh
>> > >
>> > > > Hello,
>> > > >
>> > > > My apologies in advance if this has been covered before. I didn't
>> > > > find it discussed in my forum search under the terms I selected.
>> > > >
>> > > > I have at least one PDF (an article downloaded from Project MUSE)
>> > > > that becomes unsearchable after I save with the "export - as PDF -
>> > > > with embedded notes" option. That is, after I attempt to shift from
>> > > > Skim format to a format in which my annotations are available to
>> > > > other PDF applications. For those who might be interested and have
>> > > > access, the article is available at this link:
>> > > >
>> > > > https://muse.jhu.edu/journals/journal_of_middle_east_womens_studies/v005/5.2.barzilai-lumbroso.pdf
>> > > >
>> > > > The issue is repeatable with this item, but is not occurring with
>> > > > some of the other articles I recently worked with. That said, it has
>> > > > me concerned, as I only became aware of the phenomenon when searching
>> > > > this particular article. I generally annotate, export as PDF, and
>> > > > store my articles in the DevonThink Pro Office database, where I
>> > > > often search them in aggregate rather than on an article-by-article
>> > > > basis. So I might not know if there are other instances of similar
>> > > > behavior.
>> > > >
>> > > > Is this behavior familiar and, if so, do you have any suggestions
>> > > > for dealing with it? For the time being I will re-annotate this
>> > > > particular PDF in Acrobat Pro or another program, since I need to
>> > > > have the annotations visible across a variety of platforms, but I
>> > > > would prefer to stick to the Skim-based workflow that I have.
>> > > >
>> > > > Thanks in advance for any thoughts you might be able to offer.
>> > > >
>> > > > Josh
>> > >
>> > > Do you mean that when you open a file that was exported with embedded
>> > > notes it becomes unsearchable? I have not heard about that. But it is
>> > > also something we cannot do anything about. It's really Apple's
>> > > problem. That is because the PDF is saved by Apple's PDFKit, which for
>> > > us is a black box: we have no influence over how that is done, and it
>> > > regularly has problems (that's an important reason why we don't use it
>> > > by default). For sure, this must be an exception. Also, it would be
>> > > true for other PDFKit-based apps, like Preview.
>> > >
>> > > Christiaan
>> > >
>> > >
>> > >
>> ------------------------------------------------------------------------------
>> > > One dashboard for servers and applications across
>> Physical-Virtual-Cloud
>> > > Widest out-of-the-box monitoring support with 50+ applications
>> > > Performance metrics, stats and reports that give you Actionable
>> Insights
>> > > Deep dive visibility with transaction tracing using APM Insight.
>> > > http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>> > > _______________________________________________
>> > > Skim-app-users mailing list
>> > > Skim-app-users@lists.sourceforge.net
>> > > https://lists.sourceforge.net/lists/listinfo/skim-app-users
>> > >
>> > >