The RIOT writers divide into three kinds: "pretty", "blocks" and
"flat", where the concepts are applicable.
"Pretty" is the expensive pretty-printing form, which is also
non-streamable. For both RDF/XML and TTL, it is unsuitable for
applications that need predictable performance. Several parts of pretty
printing are a search over the data, and it takes multiple passes.
"Flat" is one triple per line -- for Turtle that means N-Triples plus
prefixes and literal short forms, each triple on one line.
"Block" groups triples with the same subject together where they occur
adjacent in the input stream. It is a streaming process and has
predictable performance. Because stored RDF is going to be streamed out
by subject (all the storage layers happen to do that), it equates to
blocks of triples with the same subject.
http://jena.apache.org/documentation/io/rdf-output.html#streamed-block-formats
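To make the flat/blocks distinction concrete, here is a minimal sketch in plain Java. It does not use Jena: the class BlocksSketch, the Triple holder and the Turtle-like output are assumptions made purely for illustration, and terms are treated as already-formatted strings (no prefix handling, literals or escaping).

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: not Jena code. Shows "flat" vs "blocks" output for a triple stream. */
final class BlocksSketch {

    /** Minimal stand-in for an RDF triple; terms are already-formatted strings. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** "Flat": one complete triple per line. */
    static String flat(Iterable<Triple> stream) {
        StringBuilder sb = new StringBuilder();
        for (Triple t : stream) {
            sb.append(t.s).append(' ').append(t.p).append(' ').append(t.o).append(" .\n");
        }
        return sb.toString();
    }

    /**
     * "Blocks": a new block starts whenever the subject differs from the
     * previous triple's subject. Only the previous subject is remembered, so
     * this is a single pass; non-adjacent repeats of a subject are not merged.
     */
    static String blocks(Iterable<Triple> stream) {
        StringBuilder sb = new StringBuilder();
        String current = null;
        for (Triple t : stream) {
            if (t.s.equals(current)) {
                // Same subject as the previous triple: continue the block.
                sb.append(" ;\n    ").append(t.p).append(' ').append(t.o);
            } else {
                // Subject changed: close the previous block (if any), open a new one.
                if (current != null) sb.append(" .\n");
                sb.append(t.s).append(' ').append(t.p).append(' ').append(t.o);
                current = t.s;
            }
        }
        if (current != null) sb.append(" .\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Triple> data = Arrays.asList(
            new Triple(":a", "skos:broader", ":b"),
            new Triple(":a", "skos:broader", ":c"),
            new Triple(":b", "skos:broader", ":d"));
        System.out.print(flat(data));
        System.out.print(blocks(data));
    }
}
```

The point of the sketch is that the blocks form needs no lookahead and no search over the data, which is why its cost stays proportional to the number of triples.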
RDFXML_PRETTY is RDFXML-ABBREV.
RDFXML_FLAT is RDFXML.
There isn't an RDFXML_BLOCKS; the machinery to write one is there, though.
The RDFXML-ABBREV writer might be tunable to do this but, to ensure
streaming and hence predictable performance, it is probably just as easy
to write one, as the machinery and framework already exist:
The Turtle one is "WriterStreamRDFBlocks"
https://github.com/apache/jena/blob/master/jena-arq%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fjena%2Friot%2Fwriter%2FWriterStreamRDFBlocks.java
which is extending "WriterStreamRDFBatched"
https://github.com/apache/jena/blob/master/jena-arq%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fjena%2Friot%2Fwriter%2FWriterStreamRDFBatched.java
so one file, not huge.
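As a rough idea of what a blocks-style RDF/XML writer could look like, here is a sketch of the batching pattern in plain Java. It is not the Jena API: the classes RdfXmlBlocksSketch, Batched and Writer, and the printBatch hook, are hypothetical names chosen for illustration. The sketch assumes predicates arrive as ready-made QNames and objects are IRIs, and it omits the rdf:RDF wrapper and namespace declarations.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical classes, not the Jena API. */
final class RdfXmlBlocksSketch {

    /** Collects adjacent same-subject triples and hands each batch to a hook. */
    static abstract class Batched {
        private String subject = null;
        private final List<String[]> batch = new ArrayList<>(); // {predicate QName, object IRI}

        /** Receive one triple from the stream. */
        final void triple(String s, String p, String o) {
            if (subject != null && !subject.equals(s)) flush();
            subject = s;
            batch.add(new String[] { p, o });
        }

        /** Call at end of stream so the last batch is written. */
        final void finish() { flush(); }

        private void flush() {
            if (subject == null) return;
            printBatch(subject, batch);
            batch.clear();
            subject = null;
        }

        /** Hook: write one block of triples sharing a subject. */
        abstract void printBatch(String subject, List<String[]> predicateObjects);
    }

    /** Emits one rdf:Description element per batch, with no nested expansion. */
    static final class Writer extends Batched {
        final StringBuilder out = new StringBuilder();

        @Override
        void printBatch(String subject, List<String[]> predicateObjects) {
            out.append("<rdf:Description rdf:about=\"").append(esc(subject)).append("\">\n");
            for (String[] po : predicateObjects) {
                out.append("  <").append(po[0]).append(" rdf:resource=\"")
                   .append(esc(po[1])).append("\"/>\n");
            }
            out.append("</rdf:Description>\n");
        }
    }

    /** Minimal XML attribute-value escaping. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Writer w = new Writer();
        w.triple("http://example/a", "skos:broader", "http://example/b");
        w.triple("http://example/a", "skos:broader", "http://example/c");
        w.triple("http://example/b", "skos:broader", "http://example/d");
        w.finish();
        System.out.print(w.out);
    }
}
```

Memory use in this sketch is bounded by the largest run of same-subject triples, not by the size of the graph.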
Andy
Tony - the correct way to reference Jena or any of the components is
with the trademarks "Apache Jena" or "Apache Jena <module>".
On 17/11/14 16:19, Dave Reynolds wrote:
Hi Tony,
Yes, I understood this was a different question. My point is that your
example of what output you need seems to be pretty close to the basic
writer with no expansion of linked resources at all and the only obvious
difference being the in-lining of types. Taking the basic writer and
adding type in-lining would be a much easier job than extending the
ABBREV writer.
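To illustrate what "type in-lining" on top of a basic writer amounts to, here is a small sketch in plain Java. It is not taken from the Jena writers: the class TypedNodeSketch, its methods and the example namespace http://example.org/ns# are all hypothetical. It assumes predicates are already QNames, objects are IRIs, and it does not check that the local name is a legal XML name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: not Jena code. Writes one subject block as a typed node element. */
final class TypedNodeSketch {
    static final String RDF_TYPE = "rdf:type";

    /** Abbreviate an IRI to prefix:local using the first matching namespace, else null. */
    static String qname(String iri, Map<String, String> prefixes) {
        for (Map.Entry<String, String> e : prefixes.entrySet()) {
            String ns = e.getValue();
            if (iri.startsWith(ns) && iri.length() > ns.length()) {
                return e.getKey() + ":" + iri.substring(ns.length());
            }
        }
        return null;
    }

    /**
     * Write one block. The first rdf:type whose object can be abbreviated becomes
     * the element name; every other triple is a flat property element. Objects
     * are never expanded inline, so the nesting depth is always one.
     */
    static String writeBlock(String subject, List<String[]> po, Map<String, String> prefixes) {
        String element = "rdf:Description"; // fallback when no usable type is found
        int inlined = -1;
        for (int i = 0; i < po.size(); i++) {
            if (RDF_TYPE.equals(po.get(i)[0])) {
                String q = qname(po.get(i)[1], prefixes);
                if (q != null) { element = q; inlined = i; break; }
            }
        }
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(element).append(" rdf:about=\"").append(esc(subject)).append("\">\n");
        for (int i = 0; i < po.size(); i++) {
            if (i == inlined) continue; // already expressed by the element name
            sb.append("  <").append(po.get(i)[0]).append(" rdf:resource=\"")
              .append(esc(po.get(i)[1])).append("\"/>\n");
        }
        sb.append("</").append(element).append(">\n");
        return sb.toString();
    }

    /** Minimal XML attribute-value escaping. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new LinkedHashMap<>();
        prefixes.put("ex", "http://example.org/ns#"); // hypothetical namespace
        List<String[]> po = Arrays.asList(
            new String[] { RDF_TYPE, "http://example.org/ns#Subject" },
            new String[] { "skos:broader", "http://example.org/subjects/b" });
        System.out.print(writeBlock("http://example.org/subjects/a", po, prefixes));
    }
}
```

Which type to prefer when a node has several is a policy choice; the sketch simply takes the first one it can abbreviate.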
For controlling the ABBREV writer expansions I think there is only
the blockRules (can't check, site is down for me from here) and it
sounded like you had tried all of those already.
Understood that you have found some thrashing behaviour which may be a
bug. It is certainly very possible to get exponential slow down when
dealing with these very circular structures. I'm not aware of that being
reported before for ABBREV but then I hardly ever come across RDF/XML
these days :)
Dave
On 17/11/14 15:36, Hammond, Tony wrote:
Hi Dave:
Actually this is a different problem from my initial query (which was
indeed about multiple typed nodes and selecting a preferred form - we'll
take that one away).
This is a completely separate issue and relates to the way that the
RDF/XML-ABBREV serializer works.
As noted, we have encountered a problem: because object nodes are
expanded inline on first sight, when we have both skos:broader AND
skos:narrower terms we seem to experience some kind of thrashing
behaviour. The serialization does not complete in any acceptable time
(with 90 mins being on the steep side).
So, it seems (to us, at least) that the serializer is broken.
Removing the skos:narrower terms we do have acceptable serialization
times - but we still don't know if it's possible to control the node
expansions.
Am assuming not.
Cheers,
Tony
On 17/11/2014 15:21, "Dave Reynolds" <[email protected]> wrote:
Hi Tony,
If what you actually need is the basic writer plus in-lining of
(selected) RDF types then you could consider developing your own.
If I recall correctly the RDF/XML basic (non-ABBREV) writer is actually
very simple and easy to specialize by inheritance. Then you could
contribute your extended writer back to the project.
Dave
On 17/11/14 15:03, Hammond, Tony wrote:
Hi Martynas:
We want the RDF/XML-ABBREV format for RDF typed nodes.
We'd prefer not getting inline expansion as the XML is then more regular
and easier to query with XML tooling. (For the use case see our ISWC 2014
presentation here [1]. We're modelling in RDF and querying in XML. We're
chasing single-digit millisecond structured responses.)
As also noted in my post, for *some* reason we have a MAJOR problem with
the RDF/XML-ABBREV format when we have both skos:broader and
skos:narrower terms.
Yes, we could resort to XSLT afterwards but this means a lot of
additional work which we would prefer to avoid. There would also be a
maintenance issue here.
The Jena RDF/XML-ABBREV output is good - but not good enough. We don't
know why we can't control it better.
Cheers,
Tony
[1] http://www.slideshare.net/tonyh/iswc-2014hammondpasinpresentationfinal
On 17/11/2014 14:45, "Martynas Jusevičius" <[email protected]> wrote:
So what is the difference between Jena's RDF/XML and Rapper's RDF/XML?
The only thing I can see is that Rapper inlines the RDF type and Jena
uses rdf:Description. That can be easily fixed using XML tools such as
XSLT.
The question is, why do you need such specific output?
On Mon, Nov 17, 2014 at 3:39 PM, Hammond, Tony <[email protected]> wrote:
Hi Martynas:
As said:
The basic RDF/XML output from Jena is not what we need. We just need a
better, more regular, RDF/XML-ABBREV such as the rapper output.
Cheers,
Tony
On 17/11/2014 14:35, "Martynas Jusevičius" <[email protected]> wrote:
Tony,
have you tried dropping the -ABBREV and simply using RDF/XML?
Martynas
graphityhq.com
On Mon, Nov 17, 2014 at 3:26 PM, Hammond, Tony <[email protected]> wrote:
Hi:
I've got another couple of questions about the RDF/XML-ABBREV format.
When using the RDF/XML-ABBREV serializer to output a pretty-printed
graph on our subject taxonomy, we get object nodes expanded inline on
their first mention. See example output below [1] where I've shown only
the URI references for convenience and suppressed the literal
properties.
This makes for a very uneven XML. (I know that the RDF is fine though.
But we want some regular XML.)
Also, and much more importantly, if we add in skos:narrower links this
then creates some kind of pathological behaviour and the serialization
then takes some **90 mins** or more instead of seconds or less. (The
graph size is about 90,000 triples with ~3,000 subjects.)
We wanted to know if there was any way to suppress node expansion in
the Jena RDF/XML-ABBREV output as we'd really like to use Java for
portability in our build workflows.
We tried various "blockRules" property combinations to control the
RDF/XML output but to no avail - see Scala code here.
==
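import java.io.FileOutputStream

// `file`, `model` and `format` are defined elsewhere in the build script.
val out = new FileOutputStream(file)
try {
  // Same effect as model.write(out, format), but lets us set writer properties.
  val writer = model.getWriter(format)
  if (format == "RDF/XML-ABBREV") {
    // Ask the ABBREV writer to suppress the propertyAttr abbreviation rule.
    writer.setProperty("blockRules", "propertyAttr")
  }
  writer.write(model, out, null)
} finally {
  out.close()
}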
==
However, if we output the same graph in TURTLE and then use an external
tool - the Redland rapper utility which also supports RDF/XML-ABBREV -
we get the very regular (and decidedly pretty) flat RDF/XML output as
shown in [2].
Is there any way we can control this expansion better in Jena or do we
need to use an external tool like rapper?
The basic RDF/XML output from Jena is not what we need. We just need a
better, more regular, RDF/XML-ABBREV such as the rapper output.
Thanks,
Tony
[1]
<npg:Subject
rdf:about="http://ns.nature.com/subjects/periodontitis">
<skos:broader>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/periodontics">
<skos:broader>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/dentistry">
<skos:broader>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/health-care">
<skos:related>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/health-services">
<skos:broader
rdf:resource="http://ns.nature.com/subjects/health-care"/>
</npg:Subject>
</skos:related>
<skos:related>
</npg:Subject>
</skos:related>
</npg:Subject>
</skos:broader>
</npg:Subject>
</skos:broader>
</npg:Subject>
</skos:broader>
<skos:broader>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/oral-diseases">
<skos:broader>
<skos:broader
rdf:resource="http://ns.nature.com/subjects/health-sciences"/>
</npg:Subject>
</skos:broader>
</npg:Subject>
</skos:broader>
</npg:Subject>
[2]
<npg:Subject
rdf:about="http://ns.nature.com/subjects/periodontitis">
<skos:broader
rdf:resource="http://ns.nature.com/subjects/oral-diseases"/>
<skos:broader
rdf:resource="http://ns.nature.com/subjects/periodontics"/>
</npg:Subject>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/oral-diseases">
...
</npg:Subject>
<npg:Subject
rdf:about="http://ns.nature.com/subjects/periodontics">
...
</npg:Subject>
*********************************************************************
**
**
*******
DISCLAIMER: This e-mail is confidential and should not be used by
anyone who is
not the original intended recipient. If you have received this
e-mail
in error
please inform the sender and delete it from your mailbox or any
other
storage
mechanism. Neither Macmillan Publishers Limited nor any of its
agents
accept
liability for any statements made which are clearly the sender's
own
and not
expressly made on behalf of Macmillan Publishers Limited or one of
its
agents.
Please note that neither Macmillan Publishers Limited nor any of
its
agents
accept any responsibility for viruses that may be contained in this
e-mail or
its attachments and it is your responsibility to scan the e-mail
and
attachments (if any). No contracts may be concluded on behalf of
Macmillan
Publishers Limited or its agents by means of e-mail communication.
Macmillan
Publishers Limited Registered in England and Wales with registered
number 785998
Registered Office Brunel Road, Houndmills, Basingstoke RG21 6XS
*********************************************************************
**
**
*******