This is somewhat of a side issue, but when you say "fail", do you mean that no useful text is extracted (gibberish or blank)? Or does extraction fail completely?
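
In case it helps while checking: a rough Python sketch of the fail-on-error change discussed in the quoted messages below, i.e. setting fail-on-error to "false" on the extract-text operation instead of deleting it. The SAMPLE XML is a trimmed stand-in I made up, not a real workflow file, and relax_text_extraction and local_name are hypothetical names. I am assuming real workflow files may declare an XML namespace, so the sketch matches on local tag names only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for a workflow definition file.
SAMPLE = """<definition>
  <operations>
    <operation id="extract-text" fail-on-error="true"
               exception-handler-workflow="error"
               description="Extracting text from presentation segments"/>
    <operation id="publish" fail-on-error="true"/>
  </operations>
</definition>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def relax_text_extraction(xml_text):
    """Set fail-on-error="false" on every extract-text operation.

    Returns the modified XML as a string and the number of operations changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for elem in root.iter():
        is_operation = local_name(elem.tag) == "operation"
        if is_operation and elem.get("id") == "extract-text":
            if elem.get("fail-on-error") != "false":
                elem.set("fail-on-error", "false")
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, count = relax_text_extraction(SAMPLE)
    print(f"operations changed: {count}")
    print(new_xml)
```

Note that ElementTree drops XML comments and the XML declaration when it re-serialises, so I would treat the output as something to compare against the file rather than a drop-in replacement, and keep a backup of the original.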
G

Kristof Keppens <[email protected]> wrote:
>You can also check if the fail-on-error is set to true for the text
>extraction. That would cause the problem you describe ( we've
>encountered the same issue ). Setting the fail-on-error to false will
>have the same effect as removing these lines but you will have some text
>extraction, although in our experience the text extraction only rarely
>succeeds. So you might want to consider skipping it completely.
>
>Kristof
>
>On 2011-12-08 12:18, Dr Leslaw Zieleznik wrote:
>> Reza,
>>
>> You probably need to create your own dictionary?
>>
>> Below is instruction of how to disable text extraction, recently published
>> on the list:
>>
>> **************************************************
>>
>> Non-authorative answer:
>>
>> in $FELIX_HOME/conf/workflow/*.xml
>> remove the following lines:
>> ------------------------------------------
>> <!-- Run text analysis -->
>> <operation
>>   id="extract-text"
>>   fail-on-error="false"
>>   exception-handler-workflow="error"
>>   description="Extracting text from presentation segments">
>>   <configurations>
>>     <configuration key="source-flavor">presentation/trimmed</configuration>
>>     <configuration key="source-tags"></configuration>
>>     <configuration key="target-tags">engage</configuration>
>>   </configurations>
>> </operation>
>> -------------------------------------------
>>
>> Leslaw
>>
>> On 8 Dec 2011, at 11:08, VISIONAIRE-Reza Toghraee wrote:
>>
>>> Dear Dr Leslaw
>>>
>>>
>>> Thank you very much for your reply.
>>> I had enabled the Audio and attached a mic as well. But today I realized
>>> that the Audio mode was set on Line. I changed it to MIC and now at least
>>> on preview CGI of MCA, I can hear the voice as well.
>>>
>>> Currently Im suffering from the OCR. Whenever Im making a recording it is
>>> failing during the Text Extraction. Is there any way to disable the Text
>>> Extraction from the workflow?
>>>
>>> Thank you
>>> Reza
>>>
>>>
>>>
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of Dr
>>> Leslaw Zieleznik
>>> Sent: Thursday, December 08, 2011 1:00 PM
>>> To: Matterhorn Users
>>> Subject: Re: [Matterhorn-users] MattherHorn 1.2 and Epiphan Capture
>>> Appliance Updates and Issues -- New User
>>>
>>> Hi Reza,
>>>
>>> 0- till now, after recording more than 10 times, I couldn't manage to have
>>> the complete result of recording (Video + VGA + Audio) in Engage.
>>>
>>> You need to enable the audio and connect the microphone, it will then
>>> start recording all three streams.
>>>
>>> 1- sometimes when recording has to be finished, the server still shows that
>>> the MCA is in Capturing state. I don't understand why the status is not
>>> being updated.
>>>
>>> This is know bug, MCA sits in the capture state (the workflow is showing
>>> PAUSE) for about 35min and then will do the ingestion.
>>>
>>> In summary the device is capturing nicely if you take the above into
>>> account - see also below.
>>> There is more information about the device behave on the Matterhorn Users
>>> list.
>>>
>>> 3- almost most of the times, when the MCA ingests the file to the Core and
>>> Core starts digesting, usually it fails in 'Text Extraction" Phase. Is there
>>> any way to skip the text extraction phase while digesting?
>>>
>>> I have no problem with this, only the text OCR is not very accurate.
>>>
>>> Calendar Polling Interval: 1 minuet
>>> Agent State Push Interval: 60 seconds
>>> Ingest Interval : 2 minutes (I changed it from 60 minutes to 2 minute to
>>> speedup the ingest)
>>>
>>> I am using a similar setup: 2, 60, 5.
>>>
>>> Best,
>>> Leslaw
>>>
>>
>> _______________________________________________
>> Matterhorn-users mailing list
>> [email protected]
>> http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
