Gudrun,

            Thanks for the kind words.  You are in a uniquely thorny position
as far as scanning goes because you're trying to work with material that is in
two separate languages.  OCR has gotten extremely good, even on pretty sketchy
copy, for material in a single language, but I don't know whether the same can
be said for things in two languages.  As you know from our private exchanges,
I've tried scanning one of the books you refer to with the OCR language set to
Swedish and then to English, and neither setting gives satisfactory results
for both languages at once.

            This is an instance where you could be the avant garde for your own 
needs and those of others, too.  I know you've given Tracker Software's
PDF-XChange Viewer a try, and what they provide for free is pretty remarkable, but
not enough to get what you need.  I do not know whether one of their paid 
products might work, but it would be worth getting in touch with them to ask 
about that.  If they claim that one would, I think it would be entirely
reasonable to ask them to process a single file using that software and
return the result to you so that you can actually evaluate it before
considering a purchase.  I can't believe that the need for bilingual OCR is
frequent, but it certainly is something that's going to occur for someone other 
than yourself.  I would be shocked if someone has not developed something that 
supports this, but I'd have to do the same digging as you will to determine who.
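
            One place such digging could start is the open-source Tesseract
OCR engine, whose command line accepts several language codes joined with a
plus sign.  The following is only a minimal sketch, assuming Tesseract and its
Swedish and English language data are installed; the helper function and the
file names are hypothetical, and whether the results are good enough on mixed
Swedish/English pages would still have to be judged on a sample page.

```python
import shlex


def build_tesseract_command(image_path, output_base, languages):
    """Return the argument list for one Tesseract run over several languages.

    Tesseract takes language codes joined by '+', for example 'swe+eng'.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


# Hypothetical file names, for illustration only.
command = build_tesseract_command("page_001.png", "page_001", ["swe", "eng"])
print(shlex.join(command))
```

Passing the returned list to subprocess.run(command, check=True) would perform
the actual recognition and write the text next to the given output base name.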

            As to finding the right assistant, no matter how much we love our
loved ones and they love us in return, they're generally not the best option 
for several reasons.  This is particularly so if someone is easily frustrated 
with technology that doesn't function perfectly all the time.  A big part of it 
all is dealing with the inevitable issues that arise when you are putting 
software layer upon software layer and expecting them all to seamlessly 
communicate with each other.  While we've come a long way in that department, 
you know better than most that we're not "there yet" as far as true 
seamlessness goes.

            These days you can pretty much configure monolingual OCR to handle
most of the formatting eventualities you typically see in print.  Most packages
can easily be set up to handle newspaper-style columns or tables.  There are
still problems, though, when a table is laid out in columns but with no rules
or borders around the table itself.  Then the software has to guess whether
it's looking at "newspaper columnar" format or "table row and columnar" format,
and that's not easy to do absent clear delimiters that suggest one versus the
other.
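
            To make the nature of that guess concrete, here is a toy
heuristic, not the method of any particular OCR product.  It assumes, purely
for illustration, that lines of running newspaper text mostly fill their
column while table cells are short.  The function name, the input shape and
both thresholds are hypothetical.

```python
def guess_layout(columns, fill_threshold=0.85, majority=0.6):
    """Guess 'newspaper' or 'table' for a block of side-by-side columns.

    columns: list of (column_width, [line_width, ...]) tuples, one per column.
    A line counts as 'full' when it spans at least fill_threshold of its
    column width.  Both thresholds are illustrative assumptions.
    """
    full = 0
    total = 0
    for column_width, line_widths in columns:
        for width in line_widths:
            total += 1
            if width >= fill_threshold * column_width:
                full += 1
    if total == 0:
        return "unknown"
    # Mostly full lines suggest running text; mostly short ones suggest cells.
    return "newspaper" if full / total >= majority else "table"


# Two columns of running text: most lines nearly fill the 200-unit column.
article = [(200, [198, 200, 195, 120]), (200, [199, 197, 200, 90])]
# Two columns of short entries, as in a table without borders.
listing = [(200, [40, 55, 38]), (200, [60, 22, 71])]
print(guess_layout(article), guess_layout(listing))
```

A rule like this is easily fooled, for instance by a table with long entries
or a column of very short paragraphs, which is exactly why the guess is hard
without clear delimiters.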

Brian
