[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696868#comment-17696868 ]

Hudson commented on NUTCH-2920:
---
SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #95 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/95/])

NUTCH-2920 -- first working attempt at migrating ElasticsearchIndexWriter to OpenSearch (snagel: [https://github.com/apache/nutch/commit/ca3824fd98290dd7806752decfab6eb9e3b3b569])
* (add) src/plugin/indexer-opensearch-1x/plugin.xml
* (add) src/plugin/indexer-opensearch-1x/ivy.xml
* (add) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java
* (add) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/package-info.java
* (add) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xConstants.java
* (edit) src/plugin/build.xml
* (edit) conf/index-writers.xml.template
* (add) src/plugin/indexer-opensearch-1x/build.xml
* (edit) LICENSE-binary
* (add) src/plugin/indexer-opensearch-1x/build-ivy.xml
* (add) src/plugin/indexer-opensearch-1x/README.md
* (edit) NOTICE-binary

NUTCH-2920 -- fix imports (snagel: [https://github.com/apache/nutch/commit/6e149f4954a0b7b21120b8e1467a07a82c60e66e])
* (edit) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java

NUTCH-2920 -- add keystore for 2-way tls; add back in no-tls option with a stern warning and possibly helpful links. (snagel: [https://github.com/apache/nutch/commit/f6b17177ad6049b5642d9510cb60fe0a1d3b5f1c])
* (edit) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xConstants.java
* (edit) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java

NUTCH-2920 -- improve handling for missing trust.store.path in the index-writers.xml (snagel: [https://github.com/apache/nutch/commit/5fc2839c447a1b3695b4bcb507d428d32ff27281])
* (edit) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java

NUTCH-2920 -- improve username/pw logic and update README.md (snagel: [https://github.com/apache/nutch/commit/71fabb2a87ff81b78997133ab7c790afa4ea6157])
* (edit) src/plugin/indexer-opensearch-1x/src/java/org/apache/nutch/indexwriter/opensearch1x/OpenSearch1xIndexWriter.java
* (edit) src/plugin/indexer-opensearch-1x/README.md

> Implement a indexer-opensearch plugin
> -
>
> Key: NUTCH-2920
> URL: https://issues.apache.org/jira/browse/NUTCH-2920
> Project: Nutch
> Issue Type: New Feature
> Components: plugin
> Affects Versions: 1.20
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.20
>
> We will be moving to AWS-managed OpenSearch in the near term and I would like to index our content there.
> As of writing the OpenSearch project has published two plugin versions under the Apache License v2 so far
> https://github.com/opensearch-project/opensearch-java/

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696856#comment-17696856 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1455893629

Merged. After reading about the [OpenSearch releases](https://opensearch.org/releases.html), I agree to include the version number in the plugin's name. It would also allow us to start working on a 2x plugin now. But maybe the plugin supporting the latest version should keep a name without a specific version number?
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696836#comment-17696836 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel merged PR #761:
URL: https://github.com/apache/nutch/pull/761
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696325#comment-17696325 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1454129142

Well, don't really know. Depends on how long we want to maintain it - upgrading and testing (manually, as of now). And also, how long users may want to keep it running. For the Solr and Elastic indexers we always supported only one version (a recent one, but rarely the latest).

> using testcontainers over on Tika to test integration with OpenSearch and Solr

Same for [StormCrawler](https://github.com/Digitalpebble/storm-crawler/). Might be worth having a look at the OpenSearch (2.5) module over there.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696320#comment-17696320 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1454110886

Sorry, what I intended by my question was: are you all ok with version numbers in the module name? The current code is deprecated for 3.x, so I think we'll need to have a 3.x plugin at some point. Or do we want to target, say, 2.x as 'indexer-opensearch' and hope it supports 1.x, etc.?
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696318#comment-17696318 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1454107912

I've run this with both the docker "getting started" OpenSearch example (e.g. `docker run -d -p 127.0.0.1:9200:9200 -p 127.0.0.1:9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.3.8` with username=admin and pw=admin) and with the example for running OpenSearch locally with the trust store in the [post above](https://opensearch.org/blog/connecting-java-high-level-rest-client-with-opensearch-over-https/).

The more tests the merrier! We're using testcontainers over on Tika to test integration with OpenSearch and Solr. Resource and time intensive, but really, really valuable to have those. I can try to draft some testcontainers tests in a separate PR.
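Editorial note: the Docker demo instance in the comment above is reached with username=admin and pw=admin. As a small illustration of what such stored credential strings become on the wire, the sketch below builds an HTTP Basic `Authorization` header value using only the JDK. `BasicAuthHeader` is a hypothetical helper invented for this note; it is not part of the plugin, which may well hand the credentials to its REST client in a different way.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for illustration; not taken from the Nutch plugin. */
class BasicAuthHeader {

  /** Builds the value of an HTTP Basic "Authorization" header (RFC 7617). */
  static String of(String user, String password) {
    // Basic auth is the Base64 encoding of "user:password".
    String token = user + ":" + password;
    return "Basic "
        + Base64.getEncoder().encodeToString(token.getBytes(StandardCharsets.UTF_8));
  }
}
```

Base64 is an encoding, not encryption, so a header like this is only reasonable over TLS.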
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696315#comment-17696315 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-145413

> ok with the *-1x design?

Yes. As said, I haven't tested it with a running OpenSearch instance. If you can confirm that docs/fields appear in the index as expected, that's fine. Otherwise, I can try to run an OpenSearch instance (would be my first time) and index a batch of docs using various index filter plugins (add `index-(basic|more|anchor|geoip)` to the property plugin.includes). This would be a more realistic test scenario.
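Editorial note: the `index-(basic|more|anchor|geoip)` value suggested above is a regular expression. The sketch below shows which plugin ids such an expression would select, under the assumption that `plugin.includes` is matched against the whole plugin id. `PluginIncludesDemo` and the sample id list are hypothetical and only illustrate the matching; they are not Nutch code.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical demo of full-match regex selection of plugin ids. */
class PluginIncludesDemo {

  /** Returns the ids that match the given regex in full (not just a prefix). */
  static List<String> included(String includesRegex, List<String> pluginIds) {
    Pattern includes = Pattern.compile(includesRegex);
    return pluginIds.stream()
        .filter(id -> includes.matcher(id).matches())
        .collect(Collectors.toList());
  }
}
```

For a test run like the one proposed, the expression would presumably also need an alternative for the index writer itself, e.g. `index-(basic|more|anchor|geoip)|indexer-opensearch-1x`, next to the protocol and parser plugins already configured.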
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696296#comment-17696296 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1454059036

To confirm, you're all ok with the *-1x design? Less than ideal, but I think it is useful.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696294#comment-17696294 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on code in PR #761:
URL: https://github.com/apache/nutch/pull/761#discussion_r1124942828

## NOTICE-binary: ##
@@ -1021,6 +1021,10 @@ mapdb (http://www.mapdb.org)
 webarchive-commons (https://github.com/iipc/webarchive-commons)
 - license: The Apache Software License, Version 2.0
+# org.opensearch.client:opensearch-rest-high-level-client

Review Comment:
+1, nothing to do, tho, on this PR, right?
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696293#comment-17696293 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on code in PR #761:
URL: https://github.com/apache/nutch/pull/761#discussion_r1124942563

## src/plugin/build.xml: ##
@@ -54,6 +54,7 @@
+

Review Comment:
Done.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696284#comment-17696284 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel commented on code in PR #761:
URL: https://github.com/apache/nutch/pull/761#discussion_r1124754751

## src/plugin/build.xml: ##
@@ -54,6 +54,7 @@
+

Review Comment:
(if `ant clean` is called in the folder `src/plugin`)

In addition, the plugin should be added to the build.xml (targets javadoc, eclipse and release), same as other plugins.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17696224#comment-17696224 ]

ASF GitHub Bot commented on NUTCH-2920:
---
sebastian-nagel commented on code in PR #761:
URL: https://github.com/apache/nutch/pull/761#discussion_r1124705690

## src/plugin/build.xml: ##
@@ -54,6 +54,7 @@
+

Review Comment:
Should also add the new plugin to the target "clean".

## src/plugin/build.xml: ##
@@ -54,6 +54,7 @@
+

Review Comment:
(if `ant clean` is called in the folder `src/plugin`)

In addition, the plugin should be added to the build.xml (targets javadoc, eclipse and release), same as other plugins.

## NOTICE-binary: ##
@@ -1021,6 +1021,10 @@ mapdb (http://www.mapdb.org)
 webarchive-commons (https://github.com/iipc/webarchive-commons)
 - license: The Apache Software License, Version 2.0
+# org.opensearch.client:opensearch-rest-high-level-client

Review Comment:
Excellent! - but we should seriously automate keeping the licenses up-to-date, see NUTCH-2981.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695267#comment-17695267 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1450693830

K, I think this is ready for review. I'm happy for any and all input!
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695152#comment-17695152 ]

Tim Allison commented on NUTCH-2920:
---
Current proposal is to go with the high level rest client for 1.x for now and cheer on the successful completion of https://github.com/opensearch-project/opensearch-java/issues/181.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695148#comment-17695148 ]

Tim Allison commented on NUTCH-2920:
---
Well, that was a funny notion... Turns out there is no BulkProcessor currently in the regular java-client (only exists in the high level java client) -- https://github.com/opensearch-project/opensearch-java/issues/181

So, we can make bulk requests with the basic java client, but we'd have to cache the bulk operations and have logic for when to run the operations. The BulkProcessor takes care of all of this and has triggers for when to send the bulk data (size or time) and has retry logic and some other useful things. This means that we'd have to reimplement that functionality, which I did on Tika ... and I don't want to do again. LOL...
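Editorial note: a minimal Java sketch of the "cache the bulk operations and decide when to run them" logic described in the comment above. `SimpleBulkBuffer` and its methods are hypothetical names invented for illustration. This is not the OpenSearch `BulkProcessor`, and it deliberately covers only the size trigger; the time-based trigger and the retry logic that the real class provides are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical illustration: buffers operations and hands them to a sender
 * as one batch once a size threshold is reached.
 */
class SimpleBulkBuffer<T> {
  private final int maxActions;
  private final Consumer<List<T>> sender;
  private final List<T> pending = new ArrayList<>();

  SimpleBulkBuffer(int maxActions, Consumer<List<T>> sender) {
    if (maxActions < 1) {
      throw new IllegalArgumentException("maxActions must be at least 1");
    }
    this.maxActions = maxActions;
    this.sender = sender;
  }

  /** Queues one operation and flushes once maxActions are pending. */
  synchronized void add(T operation) {
    pending.add(operation);
    if (pending.size() >= maxActions) {
      flush();
    }
  }

  /** Sends everything pending as one batch; also needed on commit/close. */
  synchronized void flush() {
    if (pending.isEmpty()) {
      return;
    }
    // Send a copy first: if the sender throws, the operations stay queued.
    sender.accept(new ArrayList<>(pending));
    pending.clear();
  }

  /** Number of operations queued but not yet sent. */
  synchronized int pendingCount() {
    return pending.size();
  }
}
```

Even this reduced version has to be flushed explicitly at the end of an indexing job, which hints at how much bookkeeping a full reimplementation would involve.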
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695096#comment-17695096 ]

Tim Allison commented on NUTCH-2920:
---
My initial PR was a simple copy+paste with a few modifications of the ElasticsearchIndexWriter. Part of that was to make review easier, and part of that was that I saw that the lower level java rest client was in beta and that OpenSearch was recommending still using the high-level rest client (https://opensearch.org/docs/1.2/clients/java/).

In thinking about this more, I realize that this "beta" message was for 1.2. It is gone in 1.3 (https://opensearch.org/docs/1.3/clients/java/). Further, the high level rest client is deprecated in 2.x and will be removed in 3.x.

I'm going to rework the PR to use the more modern client. This will make migrating to 2.x easier and hopefully require far fewer dependencies in 1.x?
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693459#comment-17693459 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1445065255

Needs more work on TLS vs. basic auth, etc. Converting to draft. Will update on Monday.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693335#comment-17693335 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1444379112

I'm less than entirely thrilled with using stored strings for credentials, but that's where we were with Elasticsearch. Again, if there's a better way, please let me know.
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1769#comment-1769 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison commented on PR #761:
URL: https://github.com/apache/nutch/pull/761#issuecomment-1444377846

The fiddly part (for me) was setting up the rest client to deal with a trust store. I followed https://opensearch.org/blog/connecting-java-high-level-rest-client-with-opensearch-over-https/

I had to update the `keytool` command in that blog post like so: `keytool -importcert -file root_ca.der -keystore my-store.jks -storepass mystorepass -alias opensearch`

If there's a better way of handling this, let me know.
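Editorial note: a hedged sketch of the JDK side of the trust store setup discussed above, i.e. turning a store file such as the one written by the `keytool -importcert` command into an `SSLContext` that trusts only the certificates in that store. `TrustStoreSslContext` and `fromTrustStore` are hypothetical names and this is not the plugin's code; how the resulting context is passed on to the OpenSearch REST client is left out. It assumes Java 9 or later for `KeyStore.getInstance(File, char[])`, which detects the store type (JKS or PKCS12) from the file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/** Hypothetical helper for illustration; not taken from the Nutch plugin. */
class TrustStoreSslContext {

  /** Builds a TLS context whose trust anchors come from the given store file. */
  static SSLContext fromTrustStore(Path storePath, char[] storePass)
      throws IOException, GeneralSecurityException {
    // Java 9+: loads the file and detects its keystore type automatically.
    KeyStore trustStore = KeyStore.getInstance(storePath.toFile(), storePass);

    TrustManagerFactory tmf =
        TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    tmf.init(trustStore);

    // No key managers: this covers server verification only, not 2-way TLS.
    SSLContext context = SSLContext.getInstance("TLS");
    context.init(null, tmf.getTrustManagers(), null);
    return context;
  }
}
```

With the store from the command above, a call might look like `TrustStoreSslContext.fromTrustStore(Paths.get("my-store.jks"), "mystorepass".toCharArray())`. Client certificates for two-way TLS would additionally need a `KeyManagerFactory` initialised from a key store.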
[jira] [Commented] (NUTCH-2920) Implement a indexer-opensearch plugin
[ https://issues.apache.org/jira/browse/NUTCH-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693330#comment-17693330 ]

ASF GitHub Bot commented on NUTCH-2920:
---
tballison opened a new pull request, #761:
URL: https://github.com/apache/nutch/pull/761

…iter to OpenSearch