[
https://issues.apache.org/jira/browse/NUTCH-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-2938.
------------------------------------
Resolution: Won't Do
Closing: the Any23 project has been retired and the any23 plugin was removed
from Nutch (NUTCH-2998). See also the comment in the linked PR.
> Use Any23's RepositoryWriter to write structured data to Rdf4j repository
> -------------------------------------------------------------------------
>
> Key: NUTCH-2938
> URL: https://issues.apache.org/jira/browse/NUTCH-2938
> Project: Nutch
> Issue Type: Improvement
> Components: any23, plugin
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
>
> I have been running a patch which leverages [Any23's
> RepositoryWriter|https://any23.apache.org/apidocs/org/apache/any23/writer/RepositoryWriter.html]
> (implemented as one of several TripleHandlers via
> [CompositeTripleHandler|https://any23.apache.org/apidocs/org/apache/any23/writer/CompositeTripleHandler.html])
> to write Any23 extractions to
> [GraphDB|https://www.ontotext.com/products/graphdb/]. This enables us to
> build a content graph from data across the enterprise.
> This feature is turned off by default, so it does not change existing Any23
> behaviour. I have concerns about the performance of this patch because it
> currently creates a new repository connection for each URL. This is
> inefficient, so I will improve on it.
> PR coming up.