Re: [transcode-users] Re: New subtitle ripper

Christian . Wasserthal Thu, 24 Aug 2006 10:28:42 -0700

Quoting T <[EMAIL PROTECTED]>:
> *snip*
>
> Could you shed more light onto it please? Ie.
>
> - What kind of OCR mechanism are you using?
> - I haven't tried it, but I see that you're using the Java Swing interface.
>   Does it require much user interaction? Any way to automate the process?
> - Does your sub2text depend on some dictionary to do the auto correction?
> - Any error report for mis-spelled words, etc?
> - Would it work for other languages than English?
>
> > You can
> > download it from sourceforge: http://sourceforge.net/projects/sub2text/ It
> > is however the first time I release or even work on a public project.
>
> thanks a lot.
>
> tong


Hello Tong.
Thank you for replying to my mail : ). And sorry for not checking my mail
for such a long time.

1. The mechanism I use to recognize characters is the following:
   I look for connected areas (4-neighborhood) in the 2-color image. One
   or more areas form a 'shape', which is stored in a small database and
   associated with a Unicode letter.
   A 'shape' is represented by a list of pixel positions relative to
   the seed point (the leftmost of the topmost pixels in the shape).

2. When the database is empty, every letter found has to be completed
   (that is, all areas that belong to it have to be selected) and its
   Unicode representation has to be entered.
   User interaction is also needed to resolve ambiguous characters (in
   a lot of DVD fonts the uppercase 'I' and the lowercase 'l' look
   exactly the same).
   But the interface is very fast, and you don't need to touch the mouse
   often. The arrow keys and Return do most of the job.
   (I just thought of using gocr for guessing the letter...)

3. No. There is also no 'auto' detection (in the sense of software
   intuition). But aspell can be used to resolve the ambiguities.

4. No. Only words containing ambiguous characters are checked in this
   way.

5. It _should_ be ready for all Unicode eventualities.
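
To make point 1 a bit more concrete, here is a rough sketch of the idea
in Python. It is not the sub2text code (which is Java); the function
names connected_areas and shape_key and the dictionary used as the
'database' are made up for illustration only.

```python
from collections import deque


def connected_areas(image):
    """Return the 4-connected areas of foreground pixels.

    `image` is a list of equal-length rows; a truthy value is a
    foreground pixel. Each area is a list of (row, col) positions.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    seen = [[False] * width for _ in range(height)]
    areas = []
    for r in range(height):
        for c in range(width):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = []
            while queue:
                y, x = queue.popleft()
                area.append((y, x))
                # 4-neighborhood: up, down, left, right (no diagonals).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def shape_key(pixels):
    """Express pixels relative to the seed point.

    The seed is the leftmost of the topmost pixels, which is simply the
    minimum (row, col) tuple.
    """
    seed_row, seed_col = min(pixels)
    return frozenset((y - seed_row, x - seed_col) for y, x in pixels)


# A tiny 'i' (dot plus stem, two separate areas) next to an unrelated bar.
image = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
]
areas = connected_areas(image)
# Hypothetical database: the dot and the stem together form the shape 'i'.
database = {shape_key(areas[0] + areas[2]): "i"}
```

Because the positions are stored relative to the seed point, the same
letter gives the same key wherever it appears in the image, so the
lookup in the database is a plain exact match.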

I hope I can make a screenshot tutorial soon.

-- 
Chris
