In the UIMA documentation I read this:

Accessing Sofa Data using a Java Stream

The framework provides a consistent method for accessing the Sofa data,
independent of it being stored locally, or accessed remotely using the URI.
Get a Java InputStream instance from the Sofa data using:

InputStream inputStream = aCas.getSofaDataStream();


   - If the data is local, this method returns a ByteArrayInputStream. This
     stream provides bytes.

      - If the Sofa data was set using setDocumentText or setSofaDataString,
        the String is converted to bytes by using the UTF-8 encoding.

      - If the Sofa data was set as a DataArray, the bytes in the data array
        are serialized, high-byte first.

   - If the Sofa data was specified as a URI, this method returns the handle
     from url.openStream(). Java offers built-in support for several URI schemes
     including “FILE:”, “HTTP:”, “FTP:” and has an extensible mechanism,
     URLStreamHandlerFactory, for customizing access to an arbitrary URI. See
     more details at
     http://java.sun.com/j2se/1.4.2/docs/api/java/net/URLStreamHandlerFactory.html.
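
To make the two cases in that excerpt concrete for myself, here is a minimal
sketch that uses only JDK classes and no UIMA types. The class name
SofaStreamSketch and the helpers toReader, fromText and fromUrl are names I
made up for illustration; they are not part of UIMA. I am assuming that inside
an annotator the InputStream would instead come from aCas.getSofaDataStream(),
as the excerpt says.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; not part of UIMA.
public class SofaStreamSketch {

    // Wrap any byte stream in a BufferedReader, decoding the bytes as UTF-8.
    static BufferedReader toReader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Local case: the String is converted to UTF-8 bytes and served from memory.
    static BufferedReader fromText(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return toReader(new ByteArrayInputStream(bytes));
    }

    // URI case: the stream is whatever url.openStream() returns (file:, http:, ...).
    static BufferedReader fromUrl(URL url) throws IOException {
        return toReader(url.openStream());
    }

    public static void main(String[] args) throws IOException {
        // Create a small temporary file so the file: URL has something to read.
        Path tmp = Files.createTempFile("sofa", ".txt");
        Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        try (BufferedReader reader = fromUrl(tmp.toUri().toURL())) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Since file: is one of the built-in schemes, a plain file URL does not seem to
need a custom URLStreamHandlerFactory in this sketch; the factory would only
matter for a scheme the JDK does not already handle.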


Do you know how to use URLStreamHandlerFactory so that, instead of calling
getDocumentText(), I can read directly from a file? Is there any simple
example code?

2009/7/9 Radwen ANIBA <[email protected]>

> What do you mean by "wrap"? Do you mean "cast"?
>
> I can't find the getSofaDataStream() method. I have getSofaDataURI(); is it
> the same thing?
>
>
> Radwen
>
> 2009/7/9 Ramon Ziai <[email protected]>
>
>> Hi Radwen,
>>
>> I assume you want to get the document text as a stream? There are two
>> pretty straightforward solutions:
>>
>> 1) Use aJCas.getSofaDataStream() and wrap the InputStream in an
>> InputStreamReader
>>
>> 2) Use aJCas.getDocumentText() and wrap the String in a StringReader
>>
>> Then wrap either of them in a BufferedReader.
>>
>> Best,
>> Ramon
>>
>> Radwen ANIBA wrote:
>> > Hello everyone,
>> >
>> > How do I get the path of the document to be analysed in UIMA?
>> >
>> > Instead of having String txt = aJCas.getDocumentText(); I need to use it
>> > as a BufferedReader for some processing. Is it possible to modify
>> > getDocumentText()?
>> >
>> > Radwen
>> >
>>
>>
>
