On Thu, Apr 12, 2012 at 7:52 AM, Brian Matherly <[email protected]> wrote:
>> Thanks for what you are doing - I am more of an mlt user than a dev, but
>> I am a coder and completely appreciate the value of unit tests.
>>
>> I have a lengthy and tangled wad of python code that tests a bunch of
>> stuff.  In its current state it isn't a reliable test - when it fails
>> I have to go track down what failed, and 1/2 the time it's essentially
>> a bug in the test.
>>
>> I would be up for making it fit for human consumption, but there is a
>> blocker: the debian pocketsphinx-utils package is deprecated - there
>> currently is no maintainer and what's currently packaged is pretty out
>> of date.  I took a crack at packaging it and failed, so I am waiting
>> for someone else.
>>
>> Here is what my script does:
>>
>> It renders the text ABCDE into .dv files, uses melt to mux them with an
>> audio file of a voice saying "go forward ten meters" and encode, then
>> demuxes the audio back out to a file and pulls a frame out.  It runs the
>> frame through gocr and looks for the string ABCDE, and runs the audio
>> through sphinx.  If all is well, the text that comes out is "go forward
>> ten meters" (don't be surprised, the input file is a sample from the
>> sphinx codebase).
>>
>> So it tests that melt can render text, encode and end up with a result
>> that looks and sounds acceptable.  (basically.  I am sure there are
>> cases where it may look/sound terrible but the automated test will be
>> smart enough to figure out the strings anyway.)
>>
>> It also tests that my django code can connect to the currently
>> configured database, and a bunch of other stuff that has nothing to do
>> with testing melt.  Again, not fit for human consumption.
>>
>> https://github.com/CarlFK/veyepar/blob/master/dj/scripts/run_tests.py
>>
>> Because I can no longer apt-get install sphinx (or whatever the
>> package is called, it no longer exists in the repo) I disabled the
>> audio tests from my script.
>>
>> So if someone can package current sphinx (or any other speech
>> recognition system) I'll rip my code apart into components and submit
>> them to your collection.
>
> Thanks Carl, I'll have a look at your scripts. I'm sure there are some golden 
> nuggets in there. The OCR angle is particularly interesting to me - it hadn't 
> occurred to me before. Hopefully, when this is up and running, it will be 
> trivial for anyone to submit a new test script (however trite it may be) and 
> automatically add it to the mix.

I seriously recommend against looking at my code, unless you want to
get involved in that project.   I can extract way easier than you can,
and I'll also be trying to make sure the result is something I can
hook into so that my tests don't need to include a hacked copy.

but if you insist.. :)

here is the extract/ocr thing:

  cmd = "mplayer \
-ss 12 \
-vf framestep=20 \
-ao null \
-vo pnm:outdir=%(tmp_dir)s \
%(blip_file)s" % parms
  print cmd
  self.run_cmd(cmd.split())

  test_file = os.path.join(tmp_dir, "00000006.ppm" )
  gocr_outs = self.run_cmd(['gocr', test_file], True )
  text = gocr_outs['sout']
  print text

  # not sure what is tacking on the \n, but it is there, so it is here.
  result = (text in ["ABCDEFG\n","_BCDEFG\n"])

the A/_ thing is because gocr doesn't always hit the A, but I see an A
that looks fine, so as long as it gets the rest, I am satisfied.
Remember, the OCR recognition isn't the same as what our eyes and
brains are doing, so we have to allow for some slop.
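
If the list of accepted strings ever gets unwieldy, one way to allow for slop is a similarity threshold instead. This is only a sketch of that idea, not what run_tests.py does: the name ocr_matches and the min_ratio default are made up for illustration, and the threshold would need tuning against real gocr output.

```python
import difflib


def ocr_matches(ocr_text, expected, min_ratio=0.8):
    """Return True if the OCR output is close enough to the expected text.

    ocr_matches and min_ratio are hypothetical names for this sketch;
    0.8 is an assumed threshold, not a measured one.
    """
    # gocr output arrives with a trailing newline, so strip it first.
    cleaned = ocr_text.strip()
    # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
    ratio = difflib.SequenceMatcher(None, cleaned, expected).ratio()
    return ratio >= min_ratio


if __name__ == "__main__":
    for sample in ["ABCDEFG\n", "_BCDEFG\n", "XYZ\n"]:
        print(repr(sample), ocr_matches(sample, "ABCDEFG"))
```

With the threshold above, a single misread character in a seven-character string still passes, while unrelated text does not. The trade-off is that the test no longer says exactly which misreads are tolerated, which the explicit list does.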

-- 
Carl K

_______________________________________________
Mlt-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mlt-devel