javadocs now updated. -M
On 8/4/2016 11:08 AM, Marshall Schor wrote:
> I'm taking a try at the general documentation for this class; here's what I
> have (written from the point of view of being useful to new users of this
> class).
>
> CasIOUtils is a collection of static methods aimed at making it easy to
>   - save and load CASes, and to
>   - optionally include their Type Systems and index definitions based on
>     those type systems (abbreviated TSI).
>
> Several serialization formats are supported; these are listed in the Java
> enum SerialFormat, together with their preferred file extension names.
>
> The APIs for loading attempt to automatically use the appropriate
> deserializer, based on the input data format.  To select the right
> deserializer, the file extension name (if available) is used first:
>   - xmi:  XMI format
>   - xcas: XCAS format
>   - xml:  XCAS format
>
> If none of these apply, then the first few bytes of the input are examined
> to determine the format.
>
> For loading, the inputs may be supplied as URLs or as InputStreams.  You can
> use Files or Paths by converting these to URLs:
>     URL url = a_path.toUri().toURL();
>     URL url = a_file.toURI().toURL();
>
> When loading, an optional lenient boolean flag may be specified; if true,
> then types and/or features being deserialized which don't exist in the
> receiving CAS are silently ignored.
>
> When TSI is saved, it is either saved in the same destination (e.g. file or
> stream), or in a separate one.
>   - Two serialization formats support saving the TSI in the same destination:
>     -- SERIALIZED_TS and
>     -- COMPRESSED_FILTERED_TS.
> Other formats require the TSI to be saved to a separate OutputStream.
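
On converting Files and Paths to URLs: the JDK spells the method differently on the two classes (Path.toUri() versus File.toURI()), which is easy to trip over. A minimal sketch using only JDK classes; the class name ToUrlSketch and the file name example.xmi are made up for illustration, and the checked exception is wrapped only to keep the helpers short:

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of CasIOUtils.
final class ToUrlSketch {

    // java.nio.file.Path exposes toUri() (lower-case "ri").
    static URL fromPath(Path path) {
        try {
            return path.toUri().toURL();
        } catch (MalformedURLException e) {
            throw new UncheckedIOException(e);
        }
    }

    // java.io.File exposes toURI() (upper-case "URI"); it has no toUri() method.
    static URL fromFile(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "example.xmi" is a placeholder; the file does not need to exist.
        System.out.println(fromPath(Paths.get("example.xmi")));
        System.out.println(fromFile(new File("example.xmi")));
    }
}
```

Either resulting URL could then be handed to a loader that accepts a URL.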
>
> Summary of the APIs for saving:
>
>   save(CAS, OutputStream, SerialFormat)
>   save(CAS, OutputStream, OutputStream, SerialFormat) - the extra
>       OutputStream is for saving the TSI
>
> Summary of the APIs for loading:
>
>   load(URL, CAS)
>   load(InputStream, CAS)
>
>   load(URL, URL, CAS, lenient_flag) - the second URL is for loading a
>       separately-stored TSI
>   load(InputStream, InputStream, CAS, lenient_flag)
>
> You may specify the lenient_flag without the TSI input by setting the 2nd
> argument to null.
> ===============================================================================
>
> To make this documentation correct, the impl needs some slight adjustments:
>
> The method that reads the first few bytes of input to determine the format
> should look for XCAS format explicitly (e.g., load the first 10,000 bytes and
> search for <CAS> as the first XML element?) and maybe handle it.
>
> Make the load with non-null TSI input work for all formats (currently
> silently ignored for xmi, xcas).
>
> WDYT?
>
> -Marshall
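
For discussion, the detection order described above (extension first, then a look at the leading bytes) could be sketched roughly as follows. This is not the CasIOUtils implementation: the class FormatGuessSketch, the enum GuessedFormat, and the crude substring checks are all assumptions for illustration, and only the two XML formats are covered. The 10,000-byte limit is the figure suggested in the quoted note.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch; none of these names exist in UIMA.
final class FormatGuessSketch {

    enum GuessedFormat { XMI, XCAS, UNKNOWN }

    static final int SNIFF_LIMIT = 10_000;

    // Step 1: map the file extension as listed in the documentation draft.
    static GuessedFormat fromExtension(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".xmi")) return GuessedFormat.XMI;
        if (lower.endsWith(".xcas") || lower.endsWith(".xml")) return GuessedFormat.XCAS;
        return GuessedFormat.UNKNOWN;
    }

    // Step 2: crude substring search in the leading bytes. Assumes XMI data
    // has an <xmi:XMI ...> root and XCAS data a <CAS ...> root; a real check
    // would verify that the match is the first XML element.
    static GuessedFormat sniff(byte[] head, int length) {
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        if (text.contains("<xmi:XMI")) return GuessedFormat.XMI;
        if (text.contains("<CAS")) return GuessedFormat.XCAS;
        return GuessedFormat.UNKNOWN;
    }

    // Extension first (if a name is available), then the leading bytes.
    // mark/reset leaves the stream positioned at the start for the real reader.
    static GuessedFormat guess(String nameOrNull, BufferedInputStream in) throws IOException {
        if (nameOrNull != null) {
            GuessedFormat byName = fromExtension(nameOrNull);
            if (byName != GuessedFormat.UNKNOWN) return byName;
        }
        in.mark(SNIFF_LIMIT);
        byte[] head = new byte[SNIFF_LIMIT];
        int total = 0;
        int n;
        while (total < head.length && (n = in.read(head, total, head.length - total)) > 0) {
            total += n;
        }
        in.reset();
        return sniff(head, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<?xml version=\"1.0\"?><CAS version=\"2\"></CAS>"
                .getBytes(StandardCharsets.UTF_8);
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        // No file name available here, so the guess falls through to the bytes.
        System.out.println(guess(null, in));
    }
}
```

Keeping the byte check as a separate pure function makes it easy to unit-test the XCAS case without touching any streams.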
