Both are technically possible. Please bear in mind that the resulting output may be exceptionally verbose, because generating either of those formats in a streaming fashion will prevent you from using many of the available syntactic sugars.

In the case of JSON-LD we don't maintain the core functionality ourselves, so you would need to contribute upstream to the third-party library we use. In the RDF/XML case you can contribute directly to Jena.
Contributions are always welcome.

As an aside, what is the value of producing such large data sets in those formats? There is a reason that the community has standardised on other, more compact formats for large-scale data exchange.

Rob

On 15/06/2017 22:58, "Erman Korkut (BLOOMBERG/ 120 PARK)" <[email protected]> wrote:

Hi all,

We are using riot to convert large nt files into turtle, and so far it works great thanks to the streaming support for turtle. For a 500 million triple file, it finishes in a matter of minutes without running into any memory issues.

We are interested in converting to rdf/xml and jsonld in a similar streaming fashion. These formats do not seem to support streaming at the moment. I saw Andy's response to a question on Stack Overflow saying that "JSON-LD is not a streaming output language (the writer needs all the data available before calling the jsonld-java code)" (https://stackoverflow.com/questions/26287432/json-ld-in-jena-riot)

Is it conceptually impossible to support rdf/xml and json-ld in riot with streaming? It looks like it could be made to work, particularly when the input file is turtle, where the predicates and objects are already grouped under each subject.

We are willing to work on this patch and contribute it back to Jena, but wanted to check with you first. Is it really impossible, or would it just take very significant effort in the current codebase?

Please let me know what you think of this patch idea.

Thanks,
Erman Korkut
Bloomberg L.P.
[email protected]
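
For readers following the thread, the subject-grouping idea raised in the question can be sketched roughly as below. This is only an illustration in plain Java and is not Jena or RIOT code: the class `SubjectGroupingSketch` and its `convert` methods are hypothetical names, and the parsing assumes well-formed N-Triples with one triple per line and no trailing comments. It keeps only the current subject in memory and starts a new block whenever the subject changes. If the input is not grouped by subject, the same subject simply appears in several blocks, which is still valid Turtle but more verbose, in line with the caveat in the reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical illustration only; not part of Jena or RIOT. */
final class SubjectGroupingSketch {

    /**
     * Reads N-Triples line by line and writes Turtle-style blocks,
     * remembering nothing but the subject of the previous triple.
     */
    static void convert(BufferedReader in, Writer out) throws IOException {
        String currentSubject = null;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and whole-line comments
            }
            // Subjects and predicates never contain whitespace in N-Triples,
            // so the first two whitespace runs delimit them; the rest is the
            // object followed by the terminating dot.
            String[] parts = line.split("\\s+", 3);
            if (parts.length < 3 || !parts[2].endsWith(".")) {
                throw new IOException("Not a well-formed N-Triples line: " + line);
            }
            String subject = parts[0];
            String predicate = parts[1];
            String object = parts[2].substring(0, parts[2].length() - 1).trim();

            if (subject.equals(currentSubject)) {
                // Same subject as the previous triple: continue the block.
                out.write(" ;\n    " + predicate + " " + object);
            } else {
                // New subject: close the previous block, if any, and open one.
                if (currentSubject != null) {
                    out.write(" .\n");
                }
                out.write(subject + " " + predicate + " " + object);
                currentSubject = subject;
            }
        }
        if (currentSubject != null) {
            out.write(" .\n"); // close the final block
        }
    }

    /** Convenience wrapper for small in-memory inputs. */
    static String convert(String nTriples) {
        StringWriter out = new StringWriter();
        try {
            convert(new BufferedReader(new StringReader(nTriples)), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

For a real file, the reader/writer overload would be called with a `BufferedReader` over the input and a buffered `Writer` over the output. A streaming RDF/XML writer could in principle follow the same shape, opening one description element per run of triples with the same subject, but that is an assumption about how such a patch might look and not something the thread confirms.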
