[ https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307539#comment-14307539 ]
Mo Omer commented on NUTCH-1933:
--------------------------------
That's really cool to hear; I'll check out that link, Lewis. As my employer no
longer has the client for whom the project (a sort of contextual tagging
service which derived HTML content via Nutch) was built, I haven't touched or
thought about this in a while. A couple of months ago, though, I found myself
wondering if there are any better solutions available.
Have you all evaluated WebEngine
(http://docs.oracle.com/javase/8/javafx/api/javafx/scene/web/WebEngine.html)?
Or setting up some sort of DOM inside V8 and calling C functions from Java?
One small additional note: the nutch-selenium plugin should also allow the
time delay (basically the time allowed for the page to render, including AJAX
etc.) to be configured.
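As a rough sketch of what a configurable delay could look like: the snippet
below reads the delay from a property and falls back to a default when the
value is missing or invalid. Plain java.util.Properties stands in for Nutch's
configuration here, and both the property name selenium.render.delay.ms and
the 3000 ms default are made up for illustration; neither is an existing
setting of the plugin.

```java
import java.util.Properties;

final class RenderDelay {
    // Hypothetical property name, chosen for illustration only.
    static final String DELAY_KEY = "selenium.render.delay.ms";
    // Assumed fallback used when the property is missing or invalid.
    static final long DEFAULT_DELAY_MS = 3000L;

    // Returns the configured delay in milliseconds, or the default when the
    // property is absent, not a number, or negative.
    static long resolveDelayMillis(Properties props) {
        String raw = props.getProperty(DELAY_KEY);
        if (raw == null) {
            return DEFAULT_DELAY_MS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? value : DEFAULT_DELAY_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_DELAY_MS;
        }
    }

    // Blocks for the configured time so the page (including AJAX) can render.
    static void waitForRender(Properties props) throws InterruptedException {
        Thread.sleep(resolveDelayMillis(props));
    }

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.setProperty(DELAY_KEY, "50");
        System.out.println("Waiting " + resolveDelayMillis(props) + " ms for render");
        waitForRender(props);
    }
}
```

A fixed sleep is the simplest option; waiting on an explicit condition (for
example, until a given element is present) would likely be more robust, but
the timeout for that would need to be configurable in the same way.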
> nutch-selenium plugin
> ---------------------
>
> Key: NUTCH-1933
> URL: https://issues.apache.org/jira/browse/NUTCH-1933
> Project: Nutch
> Issue Type: Bug
> Components: protocol
> Reporter: Mo Omer
> Assignee: Lewis John McGibbney
> Fix For: 1.10
>
> Attachments: NUTCH-selenium-trunk.patch
>
>
> I updated the [nutch-selenium|https://github.com/momer/nutch-selenium]
> plugin to run against trunk.
> I feel that there is a good bit of work to be done here; however, early
> testing on my system suggests that it works.