On Tue, 5 Nov 2013, Nick White wrote:
Hi Patrick,
1) Is it normal for tesseract to give different results on different operating
systems?
2) If so, what sort of things account for the differences?
3) Is it possible to get more consistent results through configuration?
There are examples of it behaving differently according to the
compiler and platform on the TestingTesseract wiki page, too:
http://code.google.com/p/tesseract-ocr/wiki/TestingTesseract
I very much doubt that you could make configuration changes to get
rid of the differences. I suppose it's largely due to different
choices the compilers make when presented with code that could be
interpreted and optimised in more than one way.
In the past, I worked on trying to get consistent behaviour
in Monte Carlo physics simulations across platforms. There
were some differences in floating point behaviour across
architectures, but such effects are small and shouldn't
have too large an effect on the results here. If differences
in the 6th or 7th significant digit are causing totally
different results, then it's an indication that the
code is not well thought out in places.
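
As a minimal sketch of how such floating-point differences arise
(a hypothetical example, not taken from the Tesseract sources; the
function names are made up for illustration): single-precision
addition is not associative, so the same values summed in a
different order can give a different total. A compiler or platform
that reorders or widens the arithmetic can have the same effect.

```cpp
#include <vector>

// Sum the values front to back, accumulating in single precision.
float sum_forward(const std::vector<float>& values) {
  float total = 0.0f;
  for (float v : values) {
    total += v;
  }
  return total;
}

// Sum the same values back to front. Mathematically this is the same
// sum, but each intermediate result is rounded to about 7 significant
// digits, so the rounding errors are different.
float sum_backward(const std::vector<float>& values) {
  float total = 0.0f;
  for (auto it = values.rbegin(); it != values.rend(); ++it) {
    total += *it;
  }
  return total;
}
```

Assuming IEEE 754 single-precision arithmetic with no extended
intermediate precision, the input {1e8f, -1e8f, 1.0f} gives 1 from
sum_forward and 0 from sum_backward, because 1.0f is lost when it is
added to -1e8f. Code that branches on such a value needs a tolerance
rather than an exact comparison.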
The largest effects were caused by variable initialization.
Again, this is more of an issue with sloppy code. The
flag for warning about uninitialized variables should be
set, and any warnings should be hunted down and fixed.
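
A small sketch of the kind of bug meant here (again a hypothetical
example, not Tesseract code; best_index is an invented name). With
GCC or Clang, -Wall includes -Wuninitialized, and GCC tends to catch
more cases when optimisation is enabled, e.g. "g++ -Wall -Wextra -O2".

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Buggy pattern (shown only as a comment, since reading an
// uninitialized variable is undefined behaviour):
//
//   int best;          // never assigned if `scores` is empty
//   float best_score;  // first comparison reads an indeterminate value
//
// Whatever happens to be in memory decides the result, so it can
// change with the compiler, the optimisation level, or the platform.

// Fixed version: every variable has a defined starting value.
// Returns the index of the largest score, or -1 for an empty input.
int best_index(const std::vector<float>& scores) {
  int best = -1;
  float best_score = -std::numeric_limits<float>::infinity();
  for (std::size_t i = 0; i < scores.size(); ++i) {
    if (scores[i] > best_score) {
      best_score = scores[i];
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

The fixed version returns the same answer on every platform,
including the empty-input case that the buggy pattern leaves to
chance.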
From that wiki page, it looks like the code was audited
for this back in the 2.0 days, but enough code has been
added since then that new bugs could have crept in again.
So, I would say that totally different results from code
compiled with different options or different compilers
are a sign of bad code. I don't think it should just be
shrugged off as unavoidable.
Cheers,
Rob Komar
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en