[ 
https://issues.apache.org/jira/browse/ANY23-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615561#comment-16615561
 ] 

ASF GitHub Bot commented on ANY23-396:
--------------------------------------

Github user HansBrende commented on the issue:

    https://github.com/apache/any23/pull/122
  
    My last commit reflects the notes of interest I mentioned in my last 
comment.
    1. Since `WriterFactory.getMimeType()` is redundant and I had to deprecate 
it anyway to make this PR work, I've opted *not* to un-deprecate it in 
the extending `FormatWriterFactory`. To retrieve the MIME type of a 
`FormatWriterFactory` instance, call `getFormat().getDefaultMIMEType()`. 
To keep new implementations of `FormatWriterFactory` backwards 
compatible with the older behavior, I've added the following default 
implementation of `getMimeType()` in the `FormatWriterFactory` interface:
    
    ```java
    @Override
    @Deprecated
    default String getMimeType() {
        return getFormat().getDefaultMIMEType();
    }
    ```
    
    2. Since not all implementations of `FormatWriterFactory` print RDF triples 
(case in point: `URIListWriterFactory`), the deprecation of 
`WriterFactory.getRdfFormat()` presents us with the perfect opportunity to make 
the return type of `getRdfFormat()` more generic in `FormatWriterFactory` 
(namely, using `FileFormat`, the superclass of `RDFFormat`, instead of 
`RDFFormat`). To accomplish this, I've simply opted to *not* un-deprecate the 
`getRdfFormat()` method in the `FormatWriterFactory` interface, and instead, 
add the following method:
    
    ```java
    FileFormat getFormat();
    ```
    
    To keep everything backwards compatible with the previous behavior, I've 
added the following default implementation of `getRdfFormat()` to the 
`FormatWriterFactory` interface:
    
    ```java
    @Override
    @Deprecated
    default RDFFormat getRdfFormat() {
        FileFormat f = getFormat();
        if (f instanceof RDFFormat) {
            return (RDFFormat)f;
        } else {
            throw new UnsupportedOperationException(
                    "This class does not print RDF triples.");
        }
    }
    ```
    Now `URIListWriterFactory` can implement `getFormat()` rather than 
throwing a `RuntimeException` as it did before. To that effect, I've opted 
to return the following `FileFormat` from 
`URIListWriterFactory.getFormat()`:
    
    ```java
    private static final FileFormat FORMAT = new FileFormat(
            "PLAINTEXT", "text/plain", StandardCharsets.UTF_8, "txt");
    @Override
    public FileFormat getFormat() {
        return FORMAT;
    }
    ```
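
    As a rough illustration of how a caller might cope with both kinds of
    factory, here is a minimal self-contained sketch. It is not code from the
    PR: `FileFormat`, `RDFFormat` and `FormatWriterFactory` below are trimmed
    stand-ins for the real types, and the `rdfFormatOf` helper is hypothetical.

    ```java
    import java.util.Optional;

    final class FormatProbeSketch {
        // Stand-ins for the real FileFormat / RDFFormat, reduced to a name (assumption).
        static class FileFormat {
            final String name;
            FileFormat(String name) { this.name = name; }
        }
        static class RDFFormat extends FileFormat {
            RDFFormat(String name) { super(name); }
        }

        // Stand-in for the proposed interface: only the two format accessors.
        interface FormatWriterFactory {
            FileFormat getFormat();

            @Deprecated
            default RDFFormat getRdfFormat() {
                FileFormat f = getFormat();
                if (f instanceof RDFFormat) {
                    return (RDFFormat) f;
                }
                throw new UnsupportedOperationException(
                        "This class does not print RDF triples.");
            }
        }

        // Hypothetical helper: test for an RDF format without catching exceptions.
        static Optional<RDFFormat> rdfFormatOf(FormatWriterFactory factory) {
            FileFormat f = factory.getFormat();
            return f instanceof RDFFormat
                    ? Optional.of((RDFFormat) f)
                    : Optional.empty();
        }

        public static void main(String[] args) {
            FormatWriterFactory turtle = () -> new RDFFormat("Turtle");
            FormatWriterFactory uriList = () -> new FileFormat("PLAINTEXT");

            System.out.println(rdfFormatOf(turtle).isPresent());  // true
            System.out.println(rdfFormatOf(uriList).isPresent()); // false
        }
    }
    ```

    Callers that still use the deprecated `getRdfFormat()` keep working for
    RDF factories and get an `UnsupportedOperationException` for the rest.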
    
    3. Since the `FormatWriterFactory` interface is now not only tasked with 
`RDFFormat`s, but also arbitrary `FileFormat`s, deprecating the 
`WriterFactory.getRdfWriter(OutputStream)` method presents us with the perfect 
opportunity to choose a more appropriate name for this method in the 
subinterface `FormatWriterFactory`. To this effect, I've opted to simply *not* 
un-deprecate the `FormatWriterFactory.getRdfWriter(OutputStream)` method, and 
instead choose a more appropriate name. The name I've provisionally opted for 
is:
    
    ```java
    FormatWriter getFormatWriter(OutputStream os);
    ```
    To keep everything backwards compatible, I've added the following default 
implementation of `FormatWriterFactory.getRdfWriter(OutputStream)`:
    
    ```java
    @Override
    @Deprecated
    default FormatWriter getRdfWriter(OutputStream os) {
        return getFormatWriter(os);
    }
    ```
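
    To show that legacy call sites keep working after the rename, here is a
    small sketch under stated assumptions: `FormatWriter` and `WriterFactory`
    are minimal stand-ins for the real interfaces (the `target()` accessor
    exists only for this demo), and `legacyOpen` is a hypothetical caller.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.OutputStream;

    final class RenameBridgeSketch {
        // Minimal stand-in for FormatWriter; target() is a demo-only accessor.
        interface FormatWriter {
            OutputStream target();
        }

        // Stand-in for the legacy interface whose method is deprecated.
        interface WriterFactory {
            @Deprecated
            FormatWriter getRdfWriter(OutputStream os);
        }

        // New implementations only need to provide getFormatWriter.
        interface FormatWriterFactory extends WriterFactory {
            FormatWriter getFormatWriter(OutputStream os);

            @Override
            @Deprecated
            default FormatWriter getRdfWriter(OutputStream os) {
                return getFormatWriter(os);
            }
        }

        // Hypothetical legacy call site that only knows the old interface.
        @SuppressWarnings("deprecation")
        static FormatWriter legacyOpen(WriterFactory factory, OutputStream os) {
            return factory.getRdfWriter(os);
        }

        public static void main(String[] args) {
            // The factory implements only the new method.
            FormatWriterFactory factory = os -> () -> os;
            OutputStream out = new ByteArrayOutputStream();
            System.out.println(legacyOpen(factory, out).target() == out); // true
        }
    }
    ```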
    
    4. Finally, I have one further question for discussion:
    We could use this deprecation opportunity to further genericize the 
`FormatWriterFactory.getFormatWriter(OutputStream)` method, replacing:
    ```java
    FormatWriter getFormatWriter(OutputStream os);
    ```
    with:
    ```java
    TripleHandler getWriter(OutputStream os);
    ```
    which would allow `FormatWriterFactory` implementations to return arbitrary 
`TripleHandler`s instead of forcing them to return the more specific (but 
arguably *not* more useful) `FormatWriter` implementations. Where behavior 
specific to `FormatWriter` is actually needed, e.g. 
`FormatWriter.isAnnotated()` (a method which is actually *never* used anywhere 
in Any23), a check could be added as follows: 
    ```java
    boolean isAnnotated(TripleHandler writer) {
        return writer instanceof FormatWriter
                && ((FormatWriter) writer).isAnnotated();
    }
    ```
    
    One additional benefit of doing this would be that 
`DelegatingWriterFactory` and `FormatWriterFactory` could both then extend some 
base interface as follows:
    ```java
    interface BaseWriterFactory<Output> {
        TripleHandler getWriter(Output output);
    }
    interface FormatWriterFactory extends BaseWriterFactory<OutputStream> {
        ...
    }
    interface DelegatingWriterFactory extends BaseWriterFactory<TripleHandler> {
        ...
    }
    ```
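
    A possible payoff of the shared base interface, sketched under
    assumptions: `TripleHandler` is reduced to a stand-in with a demo-only
    `describe()` method, the `chain` helper is hypothetical, and I am assuming
    a delegating factory wraps the handler it is given.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.OutputStream;
    import java.util.Arrays;
    import java.util.List;

    final class BaseWriterFactorySketch {
        // Stand-in for TripleHandler, reduced to a label for demonstration.
        interface TripleHandler {
            String describe();
        }

        interface BaseWriterFactory<Output> {
            TripleHandler getWriter(Output output);
        }
        interface FormatWriterFactory extends BaseWriterFactory<OutputStream> { }
        interface DelegatingWriterFactory extends BaseWriterFactory<TripleHandler> { }

        // Hypothetical helper: build the terminal writer, then wrap it with
        // each delegating factory in list order (last one ends up outermost).
        static TripleHandler chain(FormatWriterFactory sink, OutputStream os,
                                   List<DelegatingWriterFactory> filters) {
            TripleHandler handler = sink.getWriter(os);
            for (DelegatingWriterFactory filter : filters) {
                handler = filter.getWriter(handler);
            }
            return handler;
        }

        public static void main(String[] args) {
            // "sink" and "filter" are made-up labels, not real Any23 writers.
            FormatWriterFactory sink = os -> () -> "sink";
            DelegatingWriterFactory filter =
                    inner -> () -> "filter(" + inner.describe() + ")";

            TripleHandler h = chain(sink, new ByteArrayOutputStream(),
                                    Arrays.asList(filter));
            System.out.println(h.describe()); // prints "filter(sink)"
        }
    }
    ```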
    
    @lewismc Any comments?


> Add ability to run extractors in flow
> -------------------------------------
>
>                 Key: ANY23-396
>                 URL: https://issues.apache.org/jira/browse/ANY23-396
>             Project: Apache Any23
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.2
>            Reporter: Jacek Grzebyta
>            Assignee: Jacek Grzebyta
>            Priority: Minor
>
> Currently extractors do not work in flows, i.e. the next extractor has no 
> access to triples produced by the previous one.
> It would be useful if an extractor could modify triples created by 
> another extractor.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
