Thanks, Pascal. I don't want to analyze them -- I simply want to ignore them and not have the process die if it encounters a file type (or contents) that isn't some form of text.
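
For what it's worth, here is a minimal sketch of the kind of pre-filter I have in mind. It is not part of the UIMA samples: the class and method names (TextFileFilter, looksLikeText, isXmlChar) and the 8192-byte sample size are my own hypothetical choices. It looks at the start of each file and rejects it if it finds a byte, such as 0x0, that is not a legal XML 1.0 character. It assumes an ASCII-compatible encoding, so UTF-16 text would be rejected too.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a UIMA class: decides whether a file is safe to
// treat as text before it is handed to the rest of the pipeline.
final class TextFileFilter {

    // Assumed sample size: only the first 8 KB of each file is inspected.
    static final int SAMPLE_SIZE = 8192;

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Heuristic: false if any ASCII-range byte is not a legal XML character. */
    static boolean looksLikeText(byte[] data) {
        for (byte raw : data) {
            int b = raw & 0xFF;           // Java bytes are signed
            if (b < 0x80 && !isXmlChar(b)) {
                return false;             // e.g. 0x0 inside a JPEG or Word file
            }
        }
        return true;
    }

    /** Reads up to SAMPLE_SIZE bytes of the file (Java 11+) and applies the heuristic. */
    static boolean looksLikeText(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return looksLikeText(in.readNBytes(SAMPLE_SIZE));
        }
    }
}
```

The directory loop would then call TextFileFilter.looksLikeText(path) and skip any file for which it returns false, instead of letting the run die.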

Pascal Coupet wrote:
Hi Dennis,

You will not be able to directly analyze documents in binary format
using the UIMA samples. There is no default converter included in the
package. So you should first convert them into a text file (using "save
as", for example) or implement an external converter within your
application.

Pascal

-----Original Message-----
From: Dennis Geller [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 06, 2008 10:10 AM
To: [email protected]
Subject: Re: Bewildered

Sorry that I was unclear. The bad characters appeared when I took the compiled tutorial and pointed it at a directory of mine, rather than the one that came with the tutorial (no data problems in there!).

Could be that there was a jpeg in the directory, or an embedded image in a word document. I'll follow up on that tutorial reference. Thanks.

I just almost had a successful run. However, it coughed because a file had a "non-XML character, 0x0."

Where was this character? If it was in your XML descriptors, then that needs to be corrected. It is possible to analyze arbitrary data, including "byte" data containing any characters, in UIMA; see
http://incubator.apache.org/uima/downloads/releaseDocs/2.2.1-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aas.sofa_data_formats

This also happened when I was running the unmodified tutorial example.

Can you say where this character occurred in the unmodified tutorial example?

-Marshall


--
***********************************
Dennis Geller, Ph.D. Computer and Communication Science
Senior Software Developer
Direct Dial: 781.496.2461
Main Number: 781.935.3966 ext. 261
Fax Number: 781.496-2498
E-mail:  [EMAIL PROTECTED]
Aptima, Inc.
12 Gill Street, Suite 1400
Woburn, MA 01801 USA
http://www.aptima.com
************************************


