On Sun, Jan 25, 2009 at 12:58 AM, Christiaan Hofman <[email protected]> wrote:

>
> On 25 Jan 2009, at 11:17 AM, Hydro Meteor wrote:
>
>
> On Wed, Jan 14, 2009 at 4:43 AM, Christiaan Hofman <[email protected]> wrote:
>
>>
>> On 14 Jan 2009, at 3:04 PM, Adam M. Goldstein wrote:
>>
>> > On Jan 14, 2009, at 7:12 AM, Christiaan Hofman wrote:
>> >
>> >> Tesseract is an example of what I was calling "it won't be good
>> >> enough". It's source code for a command line tool, not a program, and
>> >> it does only text analysis, not layout analysis. The latter is also
>> >> crucial to be able to select. And it certainly does not output PDF.
>> >> So you're still (very) far from having selectable PDFs, as Noam is
>> >> asking for. Unfortunately.
>> >
>> > A layout tool called "ocropus" integrates tesseract to give better
>> > quality results than with tesseract alone. At the google pages about
>> > this (http://sites.google.com/site/ocropus/platforms/os-x) it is
>> > claimed that it has been successfully compiled on OSX, although Linux
>> > seems to be the main target platform. Google claims that this
>> > combination works as well as commercially available OCR software. They
>> > seem to have a vested interest in this because they want to get the
>> > text from all of the scanned images of library books in their google
>> > library project.
>> >
>>
>> I also saw that project. It indeed takes the next step, but it is still
>> far from sufficient.
>>
>> > Anyhow, I don't know how you'd manipulate the scanned text to match
>> > the PDF so text can be selected.
>>
>> As I mentioned in the RFE about this, it really is a big show stopper
>> for integration in Skim, because we simply have no access to the
>> PDFKit internals to patch.
>
>
> Is PDFKit a moving target (meaning, it's closed up by Apple, thus no access
> to source code)? What about in the context of GnuStep? Since Skim can be
> compiled (at least in theory) to run on GnuStep,
>
>
> Who says that? It most definitely cannot (if not just because GnuStep
> doesn't have PDFKit yet).
>

Oops, sorry, I thought that was possible (I thought something was possible
with regard to Skim and GnuStep -- I need to re-check my notes from weeks
ago).


>
> BTW, AFAIC, even IF it would be possible, I certainly wouldn't spend the
> (enormous amount of) time to implement it.
>

I understand -- there might not be enough interest / uptake / use to warrant
the effort.


>
> So anyone who thinks it's possible is welcome to join and implement it.
> Keeping on asking us is pointless.
>

Agreed -- questions ad nauseam won't change what is possible or what is
probable!


>
> Christiaan
>
> would it be possible to combine Skim + ocropus + tesseract under the
> context of GnuStep? That would be a potentially rocking solution. I'd love
> to see it. It just so happens that I'm in the market for buying a scanner
> and I want a sheetfeeder (probably will get a Fujitsu ScanSnap). I'm looking
> at SANE for open source scanning capability. To be able to add open source
> OCR with SANE-backed scanning and then to top it off with Skim would be
> nirvana. I can well imagine even running this on GnuStep on a Linux desktop
> that itself runs in a virtual machine (VMware or Parallels) hosted on OS X.
>
> Cheers!
>
> [SNIP]
>
>
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by:
> SourceForge Community
> SourceForge wants to tell your story.
> http://p.sf.net/sfu/sf-spreadtheword
> _______________________________________________
> Skim-app-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/skim-app-users
>
>