An alternative could be serializing the XML content into a String and
saving it in a database or a fast key-value store.

I have used code like this:

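            // Assumes 'cas' is a JCas. Needs java.io.ByteArrayOutputStream,
            // org.apache.uima.cas.impl.XmiCasSerializer and org.apache.uima.util.XMLSerializer.
            ByteArrayOutputStream out = new ByteArrayOutputStream(1024);
            XmiCasSerializer ser = new XmiCasSerializer(cas.getTypeSystem());
            // 'false' disables pretty-printed (formatted) output
            ser.serialize(cas.getCas(), new XMLSerializer(out, false).getContentHandler());
            out.close();
            // XMLSerializer writes UTF-8 by default, so decode with that charset
            // rather than the platform default
            String xmlContent = out.toString("UTF-8");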
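
To get everything into a single file instead of a database, one option is to append each XML string as a length-prefixed record. The following is only a minimal sketch using plain java.io: the class name XmiBundle and its methods are hypothetical, not part of UIMA, and it assumes each serialized CAS fits comfortably in memory as a String.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a UIMA class): many XMI strings in one file. */
final class XmiBundle {

    /** Writes each document as a 4-byte length followed by its UTF-8 bytes. */
    static void writeAll(Path file, List<String> xmiDocs) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            for (String xmi : xmiDocs) {
                byte[] bytes = xmi.getBytes(StandardCharsets.UTF_8);
                out.writeInt(bytes.length);
                out.write(bytes);
            }
        }
    }

    /** Reads the records back in the order they were written. */
    static List<String> readAll(Path file) throws IOException {
        List<String> docs = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // end of file reached: no more records
                }
                byte[] bytes = new byte[length];
                in.readFully(bytes);
                docs.add(new String(bytes, StandardCharsets.UTF_8));
            }
        }
        return docs;
    }
}
```

Each string read back this way could then be deserialized into a CAS again, for example with XmiCasDeserializer. With around a million pages you would probably write records one at a time as they are produced rather than collecting them in a list first.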

Best,
Massimo



On Sun, Feb 2, 2014 at 4:22 AM, Samudra Banerjee <[email protected]> wrote:

> Hi Experts,
>
> I have a scenario where processing a Wikipedia XML dump generates a huge
> number of JCas objects (~1 million), one per page. I want to serialize
> these JCas objects for later use, but generating 1 million different files
> will take a toll on the system. So I was wondering if there was a way to
> serialize multiple JCas objects to a single file for later retrieval. Any
> idea if this can be achieved?
>
> Thanks and Regards,
> Samudra
> --
>
> *Samudra Banerjee*
> First Year Graduate Student
> Department of Computer Science
> State University of New York
> Stony Brook, NY 11790
> 631-496-6939
>
>
