On 14-02-01 08:22 PM, Samudra Banerjee wrote:
Hi Experts,
I have a scenario where processing a Wikipedia XML dump generates a
huge number of JCas objects (~1 million), one per page. I want to
serialize these JCas objects for later use, but generating 1 million
different files will take a toll on the system. So I was wondering if
there was a way to serialize multiple JCas objects to a single file
for later retrieval. Any idea if this can be achieved?
The JDK provides classes to read and write zip files (see
http://docs.oracle.com/javase/7/docs/api/java/util/zip/package-summary.html).
You could serialize each JCas into its own entry of a single zip file.
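
Here is a minimal sketch of that idea, using only java.util.zip. The class
name ZipArchiveSketch, the EntryWriter callback and the entry names are
hypothetical and only there for illustration. With UIMA, the callback would
presumably call something like XmiCasSerializer.serialize(jcas.getCas(), out);
I am assuming that call leaves the stream open, which you should verify.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipArchiveSketch {

    /** Hypothetical callback: writes one serialized document to the stream. */
    interface EntryWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Writes every document as a separate entry of one zip file. */
    static void writeAll(Path zipFile, Map<String, EntryWriter> docs) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            for (Map.Entry<String, EntryWriter> doc : docs.entrySet()) {
                zip.putNextEntry(new ZipEntry(doc.getKey())); // entry name must be unique
                doc.getValue().writeTo(zip);                  // must not close the stream
                zip.closeEntry();
            }
        }
    }

    /** Reads all entries back into memory (fine for a demo, not for 1M pages). */
    static Map<String, byte[]> readAll(Path zipFile) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(zipFile))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zip.read(buffer)) != -1) {
                    bytes.write(buffer, 0, n);
                }
                result.put(entry.getName(), bytes.toByteArray());
            }
        }
        return result;
    }
}
```

For a dump of this size you would not collect everything in a map as readAll
does. Instead, deserialize and process each entry inside the getNextEntry()
loop so that only one page is in memory at a time.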
Best,
Alexandre
--
Alexandre Patry, Ph.D
Chercheur / Researcher
http://KeaText.com