[
https://issues.apache.org/jira/browse/ANY23-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616997#comment-16616997
]
ASF GitHub Bot commented on ANY23-396:
--------------------------------------
Github user HansBrende commented on the issue:
https://github.com/apache/any23/pull/122
Thank you, @lewismc.
The only item of concern left from my perspective is *naming*. Should any
of the new `public` interfaces/methods I have created be named differently, or
are they adequately descriptive as they currently stand? This decision should
be made now, as there is no going back.
Here are the names of all the new `public` methods/interfaces I have
created in this PR:
1. `public interface `**`FormatWriterFactory`**
> Is this descriptive enough? It does specify a `FileFormat getFormat()`
method, returning the format which will be written to the output stream, so the
name does still make sense even though we now return a `TripleHandler` rather
than a `FormatWriter` from the `getTripleWriter(OutputStream)` method. On the
other hand, we could also call it `ContentWriterFactory` in line with the
existing `ContentExtractor` interface (although I'm not sure if that would make
it any more descriptive). Another possibility would be
`OutputStreamWriterFactory`.
2. `public interface `**`DelegatingWriterFactory`**
> Alternatives include `CompositeWriterFactory` or `FilterWriterFactory`
(similar to `java.io.FilterOutputStream`).
3. `TripleHandler `**`getTripleWriter(Output)`** (specified in the
`BaseWriterFactory<Output>` interface)
> Alternatives include `getTripleHandler` or simply `getWriter`. I chose
`getTripleWriter` over `getWriter` because it seemed more descriptive, and to
avoid confusion with the `java.io.Writer` class.
4. `TripleHandler `**`getWriter(id, output)`** and
`TripleHandler `**`getDefaultWriter(OutputStream)`** (specified in
`WriterFactoryRegistry`).
> This one is complicated by the fact that `WriterFactoryRegistry` already
uses the term "writer" to refer to *`WriterFactory`* instances (e.g.
`List<WriterFactory> getWriters()` and `WriterFactory
getWriterByIdentifier(String id)`). An easy alternative would be to take a hint
from the existing, now-deprecated method `FormatWriter
getWriterInstanceByIdentifier(id, output)` and use "**writerInstance**" to
refer to a triple handler, i.e., `TripleHandler getWriterInstance(id, output)`
and `getDefaultWriterInstance(OutputStream)`. Alternatively, we could use
`getTripleWriter(id, output)` and `getDefaultTripleWriter(OutputStream)`.
Any suggestions, or better names that I haven't thought of, @lewismc?
@jgrzebyta?
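To make the relationship between the names in items 1 to 3 easier to weigh, here is a minimal, self-contained sketch of how the interfaces might fit together. It is not the code from the PR: `TripleHandler` is reduced to a single hypothetical `receiveTriple` method, `getFormat()` returns a `String` instead of a `FileFormat`, and the assumption that `DelegatingWriterFactory` takes a `TripleHandler` as its output is inferred from the name only. `LINE_FORMAT`, `DROP_LABELS`, and `demo` are illustrative names that do not exist in Any23.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriterFactorySketch {

    // Simplified stand-in for Any23's TripleHandler (the real interface is larger).
    interface TripleHandler {
        void receiveTriple(String subject, String predicate, String object);
    }

    // Item 3: a factory parameterised over the kind of output it writes to.
    interface BaseWriterFactory<Output> {
        TripleHandler getTripleWriter(Output output);
    }

    // Item 1: a factory that serialises triples in one format to a stream.
    interface FormatWriterFactory extends BaseWriterFactory<OutputStream> {
        String getFormat(); // stand-in for `FileFormat getFormat()`
    }

    // Item 2 (assumed shape): a factory whose "output" is another handler.
    interface DelegatingWriterFactory extends BaseWriterFactory<TripleHandler> {
    }

    // Hypothetical format: one whitespace-separated triple per line.
    static final FormatWriterFactory LINE_FORMAT = new FormatWriterFactory() {
        @Override
        public String getFormat() {
            return "line-per-triple";
        }

        @Override
        public TripleHandler getTripleWriter(OutputStream out) {
            return (s, p, o) -> {
                try {
                    String line = s + " " + p + " " + o + " .\n";
                    out.write(line.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            };
        }
    };

    // Hypothetical filter: forwards every triple except rdfs:label ones.
    static final DelegatingWriterFactory DROP_LABELS = delegate ->
        (s, p, o) -> {
            if (!p.equals("rdfs:label")) {
                delegate.receiveTriple(s, p, o);
            }
        };

    // Chains the filter in front of the format writer and returns what was written.
    static String demo() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        TripleHandler writer =
            DROP_LABELS.getTripleWriter(LINE_FORMAT.getTripleWriter(out));
        writer.receiveTriple("ex:a", "rdf:type", "ex:Thing");
        writer.receiveTriple("ex:a", "rdfs:label", "\"A\"");
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Under these assumptions, both kinds of factory hand back a plain `TripleHandler`, so a filter can be stacked in front of a format writer. That is also why the returned object is a handler rather than a `FormatWriter`, which is the source of the naming question in item 1.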
> Add ability to run extractors in flow
> -------------------------------------
>
> Key: ANY23-396
> URL: https://issues.apache.org/jira/browse/ANY23-396
> Project: Apache Any23
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.2
> Reporter: Jacek Grzebyta
> Assignee: Jacek Grzebyta
> Priority: Minor
>
> Currently extractors do not work in flows, i.e., the next extractor has no
> access to triples made by the previous one.
> It would be useful if an extractor could modify triples created
> by another extractor.
>