Thanks Alexandre. This looks like a good idea. Let me try this out!
*Samudra Banerjee*
First Year Graduate Student
Department of Computer Science
State University of New York
Stony Brook, NY 11790
631-496-6939
On 2/1/2014 8:30 PM, Alexandre Patry wrote:
On 14-02-01 08:22 PM, Samudra Banerjee wrote:
Hi Experts,
I have a scenario where processing a Wikipedia XML dump generates a
huge number of JCas objects (~1 million), one per page. I want to
serialize these JCas objects for later use, but generating 1 million
different files will take a toll on the system. So I was wondering if
there was a way to serialize multiple JCas objects to a single file
for later retrieval. Any idea if this can be achieved?
The JDK provides classes to read and write zip files (see
http://docs.oracle.com/javase/7/docs/api/java/util/zip/package-summary.html).
You could serialize each JCas in an entry of a zip file.
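As a rough sketch of that idea, something like the following could work. It uses only java.util.zip and java.nio.file from the JDK. The class name CasZipArchive, the method names writeAll/readAll, and the entry names are made up for illustration, and the byte arrays stand in for whatever serialized form of each JCas you choose (the UIMA serialization step itself is not shown here).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: stores many serialized pages as entries of one zip file.
public class CasZipArchive {

    // Writes each (entry name -> serialized bytes) pair as one zip entry.
    // Assumption: the bytes are an already-serialized JCas, e.g. one per page.
    public static void writeAll(Path zipFile, Map<String, byte[]> serializedPages)
            throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> page : serializedPages.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue());
                zip.closeEntry();
            }
        }
    }

    // Reads every entry back into memory, keyed by entry name, in archive order.
    public static Map<String, byte[]> readAll(Path zipFile) throws IOException {
        Map<String, byte[]> pages = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(zipFile);
             ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zip.read(buffer)) != -1) {
                    bytes.write(buffer, 0, n);
                }
                pages.put(entry.getName(), bytes.toByteArray());
            }
        }
        return pages;
    }
}
```

The Map is only there to keep the sketch short. With around a million pages you would more likely keep the ZipOutputStream open and write each entry as soon as its page has been processed, rather than holding everything in memory first.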
Best,
Alexandre