[ 
https://issues.apache.org/jira/browse/ANY23-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661209#comment-16661209
 ] 

Hudson commented on ANY23-396:
------------------------------

FAILURE: Integrated in Jenkins build Any23-trunk #1626 (See 
[https://builds.apache.org/job/Any23-trunk/1626/])
ANY23-396 Overhaul WriterFactory API (hans: rev 
692c583f848c5b7ae5a7940c857bfb0a9542c0d5)
* (edit) core/src/main/java/org/apache/any23/writer/TurtleWriter.java
* (edit) core/src/main/java/org/apache/any23/writer/URIListWriter.java
* (edit) core/src/main/java/org/apache/any23/writer/URIListWriterFactory.java
* (edit) cli/src/main/java/org/apache/any23/cli/Rover.java
* (add) test-resources/src/test/resources/cli/basic-with-stylesheet.html
* (edit) core/src/main/java/org/apache/any23/writer/JSONLDWriterFactory.java
* (add) api/src/test/java/org/apache/any23/configuration/SettingsTest.java
* (edit) core/src/main/java/org/apache/any23/writer/package-info.java
* (edit) core/src/main/java/org/apache/any23/writer/JSONLDWriter.java
* (add) api/src/main/java/org/apache/any23/writer/DecoratingWriterFactory.java
* (add) 
cli/src/test/resources/META-INF/services/org.apache.any23.writer.WriterFactory
* (edit) cli/src/test/java/org/apache/any23/cli/RoverTest.java
* (add) core/src/main/java/org/apache/any23/writer/WriterSettings.java
* (add) cli/src/test/java/org/apache/any23/cli/flows/PeopleExtractor.java
* (add) api/src/main/java/org/apache/any23/writer/TripleFormat.java
* (edit) core/src/main/java/org/apache/any23/writer/RDFWriterTripleHandler.java
* (edit) api/src/main/java/org/apache/any23/writer/WriterFactory.java
* (edit) core/src/main/java/org/apache/any23/writer/JSONWriterFactory.java
* (add) cli/src/test/java/org/apache/any23/cli/flows/PeopleExtractorFactory.java
* (edit) core/src/main/java/org/apache/any23/writer/NTriplesWriterFactory.java
* (add) api/src/main/java/org/apache/any23/writer/TripleWriterFactory.java
* (edit) core/src/main/java/org/apache/any23/writer/NQuadsWriter.java
* (edit) core/src/test/java/org/apache/any23/writer/JSONWriterTest.java
* (add) api/src/test/java/org/apache/any23/writer/TripleFormatTest.java
* (edit) service/src/main/java/org/apache/any23/servlet/WebResponder.java
* (add) core/src/main/java/org/apache/any23/writer/TripleWriterHandler.java
* (edit) core/src/test/java/org/apache/any23/writer/WriterRegistryTest.java
* (edit) api/src/main/java/org/apache/any23/writer/WriterFactoryRegistry.java
* (edit) core/src/main/java/org/apache/any23/writer/NTriplesWriter.java
* (edit) core/src/main/java/org/apache/any23/writer/RDFXMLWriter.java
* (edit) core/src/main/java/org/apache/any23/writer/TriXWriter.java
* (add) api/src/main/java/org/apache/any23/configuration/Setting.java
* (add) api/src/main/java/org/apache/any23/writer/TripleWriter.java
* (edit) core/src/main/java/org/apache/any23/writer/RDFXMLWriterFactory.java
* (add) cli/src/test/java/org/apache/any23/cli/ExtractorsFlowTest.java
* (edit) core/src/main/java/org/apache/any23/writer/TriXWriterFactory.java
* (edit) api/pom.xml
* (edit) core/src/main/java/org/apache/any23/writer/TurtleWriterFactory.java
* (edit) core/src/main/java/org/apache/any23/writer/NQuadsWriterFactory.java
* (add) api/src/main/java/org/apache/any23/configuration/Settings.java
* (edit) core/src/main/java/org/apache/any23/writer/JSONWriter.java


> Overhaul WriterFactory API
> --------------------------
>
>                 Key: ANY23-396
>                 URL: https://issues.apache.org/jira/browse/ANY23-396
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.2
>            Reporter: Jacek Grzebyta
>            Assignee: Hans Brende
>            Priority: Major
>             Fix For: 2.3
>
>
> This issue began with Jacek's observation that, in Rover, it is impossible to 
> specify a *delegating writer factory*, i.e., one that maps/filters/reduces 
> the preliminary extraction output before passing it on to the final 
> outputstream writer. Lack of this ability caused us to have to specify 
> numerous configuration flags in Rover, e.g., "--notrivial", which filters the 
> output of the extractor by removing trivial css triples prior to writing the 
> triples to their final format. Many of these flags could simply be replaced 
> by the ids of *delegating writer factories*, if we had such a capability. One 
> added advantage of that would be that then, users could specify the *order* 
> in which these modifications take place. E.g., adding a *logging* decorator 
> could take place before or after the "notrivial" decorator has been applied 
> (or both before *and* after!). Which? If we can, we should really let the 
> user decide. 
> The most obvious solution to this problem was to extend the {{WriterFactory}} 
> interface with a new {{DelegatingWriterFactory}} interface that accepts an 
> arbitrary {{TripleHandler}} rather than an {{OutputStream}} as input. 
> In doing so, it was also necessary to deprecate a few methods in 
> {{WriterFactory}} and un-deprecate them in an extending 
> {{TripleWriterFactory}} class (which takes the place of {{WriterFactory}} by 
> creating a {{TripleHandler}} from an {{OutputStream}}). This deprecation was 
> actually not too painful, first, because some of the methods were redundant 
> in the first place (e.g., {{getMimeType()}}), and second, because it 
> presented us with a perfect opportunity to add some much-needed improvements 
> to the new interface.
> The biggest improvement is the addition of {{Settings}} as a parameter to the 
> {{TripleHandler}} constructor, which will allow users to configure writers as 
> they see fit, rather than forcing, e.g., {{prettyprint=true}} on them.
> ANY23-388 perfectly illustrates this current lack of configuration ability. 
> And we fixed that issue by simply giving users {{protected}} access to the 
> underlying {{RDFWriter}} instances so that they could configure them 
> manually. However, in hindsight, this was a bad idea, as it could lead to 
> backwards compatibility issues down the line if we decide to change the 
> underlying implementation of {{RDFWriterTripleHandler}} instances. Luckily, 
> the solution to ANY23-388 was only implemented recently and is still only 
> present in the snapshot version of Any23. In my PR, I've removed that hack 
> and replaced it with {{Settings}}, which is extensible ad infinitum and won't 
> pose the same threat to backwards compatibility. 
> Another improvement is the removal of RDF4J classes from the public 
> WriterFactory API. (I replaced {{RDFFormat}} with our own {{TripleFormat}} 
> class.) As I noted in my PR, it's probably better for us to use our own 
> classes in public-facing interfaces rather than RDF4J's so that we can 
> maintain stability in the event that RDF4J changes their API, or (heaven 
> forbid) ceases to exist, or we simply want to modify the implementation. A 
> good rule of thumb for us would probably be to limit usage of RDF4J in our 
> public-facing API to the ubiquitous interfaces found in the 
> {{org.eclipse.rdf4j:rdf4j-model}} artifact (e.g. {{IRI}} and {{Literal}}), 
> since removing those would be virtually impossible without enormous backwards 
> compatibility issues.
> Since this PR is quite large and there are a multitude of new classes and new 
> behaviors (while managing to remain fully backwards-compatible with previous 
> behavior), I'm looking for feedback! Please comment with any concerns, 
> questions, or suggestions you have for improvement. 
> PR can be viewed here: https://github.com/apache/any23/pull/122
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to