Yasunori,

It should be possible to pass the InputStream for the tar entry contents directly to RDFParserBuilder.source; there is no need to convert it to a string first.

IIRC TarArchiveInputStream is a bit weird - it signals "end of file" at the end of each tar archive entry. The app then moves to the next entry, at which point the input stream is positioned on that entry and can be passed to a new RDFParserBuilder call.

An RDFParser does not close an InputStream it is passed.

It will need a new RDFParser for each entry.
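
A minimal sketch of the approach described above, assuming Apache Commons Compress and Jena RIOT are on the classpath. The class name TarRdfStreaming and the method parseTarGz are hypothetical names used only for illustration, and skipping entries whose extension is not recognised is a choice made for this sketch. It relies on the parser leaving the tar stream open after each entry, as stated above.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFLanguages;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.ErrorHandlerFactory;
import org.apache.jena.riot.system.StreamRDF;

// Hypothetical helper class, not part of Jena or Commons Compress.
public class TarRdfStreaming {

    /**
     * Parses every recognisable RDF entry of a gzipped tar straight from the
     * archive stream, without first copying the entry into a String.
     * Returns the number of entries handed to a parser.
     * Closing the tar stream at the end also closes the stream passed in.
     */
    public static int parseTarGz(InputStream gzippedTar, StreamRDF dest) throws IOException {
        int parsed = 0;
        try (TarArchiveInputStream tarInput =
                 new TarArchiveInputStream(new GzipCompressorInputStream(gzippedTar))) {
            TarArchiveEntry entry;
            while ((entry = tarInput.getNextTarEntry()) != null) {
                if (!entry.isFile()) {
                    continue; // directories and other non-file entries carry no RDF
                }
                Lang lang = RDFLanguages.filenameToLang(entry.getName());
                if (lang == null) {
                    continue; // assumption of this sketch: skip unknown extensions
                }
                // A new parser per entry; tarInput reports end-of-file at the
                // end of the current entry, so each parse stops there.
                RDFParser.create()
                         .source(tarInput)
                         .lang(lang)
                         .errorHandler(ErrorHandlerFactory.errorHandlerDetailed())
                         .build()
                         .parse(dest);
                parsed++;
            }
        }
        return parsed;
    }
}
```

It could be called as TarRdfStreaming.parseTarGz(new FileInputStream(filename), dest), where dest is any StreamRDF, for example one obtained from StreamRDFLib.graph(graph).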

If that is not what is happening, please let us know.

    Andy


On 06/08/2019 23:31, Yasunori Yamamoto wrote:
The files in the tar are in RDF/XML or Turtle.

Yasunori

On 2019/08/07 3:11, ajs6f <aj...@apache.org> wrote:

In what format are these RDF files?

ajs6f

On Aug 6, 2019, at 10:05 AM, Yasunori Yamamoto <y...@dbcls.rois.ac.jp> wrote:

Hello, I'm trying to learn how to parse RDF data archived in a tar.gz
file (e.g., rdfdatasets.tar.gz that contains a set of RDF data files)
within my Java program.
The following code works properly, but it is inefficient because
it reads the entire RDF data of each entry in the given tar.gz file
into main memory before parsing.
Could you please let me know a better way that uses less memory?

TarArchiveInputStream tarInput = new TarArchiveInputStream(
    new GzipCompressorInputStream(new FileInputStream(filename)));
TarArchiveEntry currentEntry;
PipedRDFIterator<Triple> iter =
    new PipedRDFIterator<Triple>(buffersize, false, pollTimeout, maxPolls);
final PipedRDFStream<Triple> inputStream = new PipedTriplesStream(iter);

while ((currentEntry = tarInput.getNextTarEntry()) != null) {
    String currentFile = currentEntry.getName();
    Lang lang = RDFLanguages.filenameToLang(currentFile);
    parser_object = RDFParserBuilder
        .create()
        .errorHandler(ErrorHandlerFactory.errorHandlerDetailed())
        .source(new StringReader(CharStreams.toString(new InputStreamReader(tarInput))))
        .checking(checking)
        .lang(lang)
        .build();
    parser_object.parse(inputStream);
}
tarInput.close();

Sincerely yours,
Yasunori Yamamoto
