Thanks Alexandre. This looks like a good idea. Let me try this out!

*Samudra Banerjee*
First Year Graduate Student
Department of Computer Science
State University of New York
Stony Brook, NY 11790
631-496-6939

On 2/1/2014 8:30 PM, Alexandre Patry wrote:
On 14-02-01 08:22 PM, Samudra Banerjee wrote:
Hi Experts,

I have a scenario where processing a Wikipedia XML dump generates a huge number of JCas objects (~1 million), one per page. I want to serialize these JCas objects for later use, but generating 1 million separate files would take a toll on the system. So I was wondering whether there is a way to serialize multiple JCas objects to a single file for later retrieval. Any idea if this can be achieved?
The JDK provides classes to read and write zip files (see http://docs.oracle.com/javase/7/docs/api/java/util/zip/package-summary.html). You could serialize each JCas into its own entry of a single zip file.
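
As a rough illustration of the zip-entry idea, here is a minimal sketch that uses only java.util.zip. It assumes each page has already been serialized to a byte array by whatever CAS serializer is in use; that step is not shown. The class name CasZipBundle, the method names writeAll and readAll, and the entry names used below are hypothetical and chosen only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: bundles many serialized documents into one zip file.
final class CasZipBundle {

    // Writes each (entry name -> serialized bytes) pair as one zip entry.
    // Entry names must be unique, otherwise putNextEntry throws a ZipException.
    static void writeAll(File zipFile, Map<String, byte[]> serializedPages) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(zipFile))) {
            for (Map.Entry<String, byte[]> page : serializedPages.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue());
                zip.closeEntry();
            }
        }
    }

    // Reads every entry back into memory, keyed by entry name, in archive order.
    static Map<String, byte[]> readAll(File zipFile) throws IOException {
        Map<String, byte[]> pages = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new FileInputStream(zipFile))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry.
                while ((n = zip.read(buffer)) != -1) {
                    bytes.write(buffer, 0, n);
                }
                pages.put(entry.getName(), bytes.toByteArray());
            }
        }
        return pages;
    }
}
```

With around a million pages, it would probably be better to write each entry as soon as its page is processed and, when reading, to handle each entry inside the loop rather than collecting everything in a map as readAll does here.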

Best,

Alexandre

