In the UIMA documentation I read this:
Accessing Sofa Data using a Java Stream
The framework provides a consistent method for accessing the Sofa data,
independent of it being stored locally, or accessed remotely using the URI.
Get a Java InputStream instance from the Sofa data using:
InputStream inputStream = aCas.getSofaDataStream();
- If the data is local, this method returns a ByteArrayInputStream. This
  stream provides bytes.
- If the Sofa data was set using setDocumentText or setSofaDataString,
  the String is converted to bytes by using the UTF-8 encoding.
- If the Sofa data was set as a DataArray, the bytes in the data array
  are serialized, high-byte first.
- If the Sofa data was specified as a URI, this method returns the handle
  from url.openStream(). Java offers built-in support for several URI schemes
  including “FILE:”, “HTTP:”, “FTP:” and has an extensible mechanism,
  URLStreamHandlerFactory, for customizing access to an arbitrary URI. See
  more details at
  http://java.sun.com/j2se/1.4.2/docs/api/java/net/URLStreamHandlerFactory.html.
Do you know how to use URLStreamHandlerFactory to read directly from a file
instead of calling getDocumentText()? Is there any simple example code?
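
For the plain file case, here is a minimal sketch of what the excerpt above seems to describe, using only JDK classes. Since the excerpt says "FILE:" is supported out of the box, a custom URLStreamHandlerFactory should not be needed for it. The class and method names (SofaUriReader, open, readAll) are made up for illustration, the URI string stands in for whatever getSofaDataURI() returns, and UTF-8 is an assumed encoding for the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of UIMA.
class SofaUriReader {

    // Opens the resource behind a URI string via URL.openStream(), the call
    // the excerpt mentions, and wraps it for line-by-line reading.
    // Assumption: the content is UTF-8 text.
    static BufferedReader open(String sofaUri) throws IOException {
        InputStream in = URI.create(sofaUri).toURL().openStream();
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Reads every line and joins them with '\n' (one '\n' after each line).
    static String readAll(String sofaUri) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = open(sofaUri)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the document to be analysed.
        Path tmp = Files.createTempFile("sofa-demo", ".txt");
        try {
            Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
            System.out.print(readAll(tmp.toUri().toString()));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Inside an annotator, the argument would presumably be the string returned by getSofaDataURI(); that link to UIMA is an assumption and is not exercised by this sketch.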
2009/7/9 Radwen ANIBA <[email protected]>
> What do you mean by "wrap"? Do you mean "cast"?
>
> I can't find the getSofaDataStream() method. I have getSofaDataURI(); is it the
> same thing?
>
>
> Radwen
>
> 2009/7/9 Ramon Ziai <[email protected]>
>
> Hi Radwen,
>>
>> I assume you want to get the document text as a stream? There are two
>> pretty straightforward solutions:
>>
>> 1) Use aJCas.getSofaDataStream() and wrap the InputStream in an
>> InputStreamReader
>>
>> 2) Use aJCas.getDocumentText() and wrap the String in a StringReader
>>
>> Then wrap either of them in a BufferedReader.
>>
>> Best,
>> Ramon
>>
>> Radwen ANIBA wrote:
>> > Hello everyone,
>> >
>> > How can I get the path of the document being analysed in UIMA?
>> >
>> > Instead of having String txt = aJCas.getDocumentText(); I need to use it
>> > as a BufferedReader for some processing. Is it possible to modify
>> > getDocumentText()?
>> >
>> > Radwen
>> >
>>
>>
>
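
The two wrapping options suggested in the quoted reply can be sketched as follows. "Wrap" here means passing one object to the constructor of another, not casting. This is only an illustration with plain JDK classes: SofaReaders, fromStream and fromText are hypothetical names, a ByteArrayInputStream stands in for the stream returned by getSofaDataStream(), and a literal string stands in for getDocumentText(). Decoding as UTF-8 follows the documentation excerpt, which says strings are converted to bytes using UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of UIMA.
class SofaReaders {

    // Option 1: InputStream -> InputStreamReader -> BufferedReader.
    // The stream would come from getSofaDataStream() in a real annotator.
    static BufferedReader fromStream(InputStream sofaStream) {
        return new BufferedReader(
                new InputStreamReader(sofaStream, StandardCharsets.UTF_8));
    }

    // Option 2: String -> StringReader -> BufferedReader.
    // The string would come from getDocumentText() in a real annotator.
    static BufferedReader fromText(String documentText) {
        return new BufferedReader(new StringReader(documentText));
    }

    public static void main(String[] args) throws IOException {
        String text = "line one\nline two";
        // Stand-in for the Sofa data stream: the same text as UTF-8 bytes.
        InputStream standIn =
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));

        try (BufferedReader viaStream = fromStream(standIn);
             BufferedReader viaText = fromText(text)) {
            System.out.println(viaStream.readLine());
            System.out.println(viaText.readLine());
        }
    }
}
```

Both readers yield the same lines; the second option is simpler when the document text is already held in memory as a String.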