On 14-02-01 08:22 PM, Samudra Banerjee wrote:
Hi Experts,

I have a scenario where processing a Wikipedia XML dump generates a huge number of JCas objects (~1 million), one per page. I want to serialize these JCas objects for later use, but generating 1 million different files will take a toll on the system. So I was wondering if there is a way to serialize multiple JCas objects to a single file for later retrieval. Any idea if this can be achieved?
The JDK provides classes to read and write zip files (see http://docs.oracle.com/javase/7/docs/api/java/util/zip/package-summary.html). You could serialize each JCas as an entry of a single zip file.
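
As a rough sketch of the idea, the class below writes several payloads into one zip file, one entry per page, and reads them back. It uses only java.util.zip. The class name ZipBundle, the entry names, and the use of plain byte arrays in place of serialized JCas objects are assumptions made for illustration. In a real pipeline you would probably write each CAS straight into the open entry, for example with UIMA's XmiCasSerializer.serialize(jcas.getCas(), zipOut), but I have not tested that here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: bundles many serialized documents into one zip file.
final class ZipBundle {

    // Writes each (entry name -> bytes) pair as its own entry of zipFile.
    static void write(Path zipFile, Map<String, byte[]> pages) throws IOException {
        try (OutputStream fileOut = Files.newOutputStream(zipFile);
             ZipOutputStream zipOut = new ZipOutputStream(fileOut)) {
            for (Map.Entry<String, byte[]> page : pages.entrySet()) {
                zipOut.putNextEntry(new ZipEntry(page.getKey()));
                // Assumption: a real JCas would be serialized into zipOut here.
                zipOut.write(page.getValue());
                zipOut.closeEntry();
            }
        }
    }

    // Reads every entry of zipFile back into memory, in archive order.
    static Map<String, byte[]> read(Path zipFile) throws IOException {
        Map<String, byte[]> pages = new LinkedHashMap<>();
        try (InputStream fileIn = Files.newInputStream(zipFile);
             ZipInputStream zipIn = new ZipInputStream(fileIn)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry.
                while ((n = zipIn.read(buffer)) != -1) {
                    bytes.write(buffer, 0, n);
                }
                pages.put(entry.getName(), bytes.toByteArray());
            }
        }
        return pages;
    }

    public static void main(String[] args) throws IOException {
        Path zipFile = Files.createTempFile("pages", ".zip");
        Map<String, byte[]> pages = new LinkedHashMap<>();
        pages.put("page-1.xmi", "first page".getBytes(StandardCharsets.UTF_8));
        pages.put("page-2.xmi", "second page".getBytes(StandardCharsets.UTF_8));
        write(zipFile, pages);
        for (String name : read(zipFile).keySet()) {
            System.out.println(name);
        }
        Files.delete(zipFile);
    }
}
```

With around a million pages the archive would go past the 65,535-entry limit of the classic zip format, so it would depend on the ZIP64 extension, which java.util.zip has handled since Java 7 as far as I know. Reading all entries into a map, as the sketch does, is only reasonable for small archives; for the full dump you would process each entry as it is read.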

Best,

Alexandre

--
Alexandre Patry, Ph.D
Chercheur / Researcher
http://KeaText.com
