[ 
https://issues.apache.org/jira/browse/NUTCH-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036534#comment-18036534
 ] 

ASF GitHub Bot commented on NUTCH-3130:
---------------------------------------

sebastian-nagel commented on code in PR #869:
URL: https://github.com/apache/nutch/pull/869#discussion_r2507086490


##########
src/java/org/apache/nutch/plugin/PluginRepository.java:
##########
@@ -98,13 +101,22 @@ public PluginRepository(Configuration conf) throws RuntimeException {
     try {
       installExtensions(this.fRegisteredPlugins);
     } catch (PluginRuntimeException e) {
-      LOG.error("Could not install extensions.", e.toString());
+      LOG.error("Could not install extensions. {}", e.toString());

Review Comment:
   +1
   
   Or: `LOG.error("Could not install extensions:", e);`



##########
src/java/org/apache/nutch/plugin/Plugin.java:
##########
@@ -88,9 +88,4 @@ public PluginDescriptor getDescriptor() {
   private void setDescriptor(PluginDescriptor descriptor) {
     fDescriptor = descriptor;
   }
-
-  @Override
-  protected void finalize() throws Throwable {
-    shutDown();

Review Comment:
   Same for me.



##########
src/java/org/apache/nutch/indexer/IndexWriters.java:
##########
@@ -211,7 +211,7 @@ private Collection<String> getIndexWriters(NutchDocument doc) {
   public void open(Configuration conf, String name) throws IOException {
     for (Map.Entry<String, IndexWriterWrapper> entry : this.indexWriters
         .entrySet()) {
-      entry.getValue().getIndexWriter().open(conf, name);
+      entry.getValue().getIndexWriter().open(new IndexWriterParams(new HashMap<>()));

Review Comment:
   Yes.



##########
src/java/org/apache/nutch/metadata/SpellCheckedMetadata.java:
##########
@@ -115,7 +115,7 @@ public static String getNormalizedName(final String name) {
     if ((value == null) && (normalized != null)) {
       int threshold = Math.min(3, searched.length() / TRESHOLD_DIVIDER);
       for (int i = 0; i < normalized.length && value == null; i++) {
-        if (StringUtils.getLevenshteinDistance(searched, normalized[i]) < threshold) {
+        if (StringUtils.compareIgnoreCase(searched, normalized[i]) < threshold) { //.getLevenshteinDistance(searched, normalized[i]) < threshold) {

Review Comment:
   `SpellCheckedMetadata` is used only by protocol-http and 
protocol-httpclient. We could deprecate it, use `CaseInsensitiveMetadata` 
instead (see NUTCH-3002) and later remove the class `SpellCheckedMetadata` 
entirely. Nowadays, spell-checking HTTP headers sounds odd, while 20 years ago 
it might have been a good idea.
   
   Changing the behavior so that it contradicts the class name does not seem like the right way.
   
   If we want to keep the class, we need to use 
[LevenshteinDistance](https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/similarity/LevenshteinDistance.html).
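
   To illustrate why a lexicographic comparison is not a drop-in replacement for an edit distance, here is a minimal, self-contained sketch. It is not part of the PR: the class name `EditDistanceSketch`, its helper methods, the hand-written distance function, and the divider value `3` are all assumptions made for illustration. `String.compareToIgnoreCase` stands in for the lexicographic comparison used in the changed line.

```java
final class EditDistanceSketch {

  // Assumed value for illustration only; the real constant lives in SpellCheckedMetadata.
  static final int ASSUMED_THRESHOLD_DIVIDER = 3;

  /** Classic dynamic-programming Levenshtein distance, keeping only two rows. */
  static int levenshtein(String a, String b) {
    int[] prev = new int[b.length() + 1];
    int[] curr = new int[b.length() + 1];
    for (int j = 0; j <= b.length(); j++) {
      prev[j] = j; // distance from the empty prefix of a: j insertions
    }
    for (int i = 1; i <= a.length(); i++) {
      curr[0] = i; // distance to the empty prefix of b: i deletions
      for (int j = 1; j <= b.length(); j++) {
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
      }
      int[] tmp = prev; // swap rows for the next iteration
      prev = curr;
      curr = tmp;
    }
    return prev[b.length()];
  }

  /** Same threshold formula as in the diff above, with the assumed divider. */
  static int threshold(String searched) {
    return Math.min(3, searched.length() / ASSUMED_THRESHOLD_DIVIDER);
  }

  /** Match decision based on edit distance (the original intent). */
  static boolean matchesByDistance(String searched, String candidate) {
    return levenshtein(searched, candidate) < threshold(searched);
  }

  /** Match decision based on lexicographic order (the behavior after the change). */
  static boolean matchesByCompare(String searched, String candidate) {
    return searched.compareToIgnoreCase(candidate) < threshold(searched);
  }

  public static void main(String[] args) {
    String[] inputs = {"contnttype", "zontenttype", "aaaaaaaaaa"};
    for (String searched : inputs) {
      System.out.println(searched
          + " distance-match=" + matchesByDistance(searched, "contenttype")
          + " compare-match=" + matchesByCompare(searched, "contenttype"));
    }
  }
}
```

   In this sketch, a one-letter typo such as `zontenttype` is rejected by the lexicographic check merely because it sorts after the candidate, while an unrelated string such as `aaaaaaaaaa` is accepted because it sorts before it. The edit-distance check gives the opposite, intended answers in both cases.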



##########
src/java/org/apache/nutch/indexer/IndexWriter.java:
##########
@@ -30,15 +30,6 @@ public interface IndexWriter extends Pluggable, Configurable {
    */
   final static String X_POINT_ID = IndexWriter.class.getName();
 
-  /**
-   * @param conf Nutch configuration
-   * @param name target name of the {@link IndexWriter} to be opened
-   * @throws IOException Some exception thrown by some writer.
-   * @deprecated use {@link #open(IndexWriterParams)}} instead.  
-   */
-  @Deprecated

Review Comment:
   Yes, it's ok. It has been deprecated since 2018 with the release of 1.15.
   
   We might add a release note about removed deprecations for 1.21.





> Address deprecated API usage across Nutch codebase and build
> ------------------------------------------------------------
>
>                 Key: NUTCH-3130
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3130
>             Project: Nutch
>          Issue Type: Improvement
>          Components: build, ci/cd, dependency
>    Affects Versions: 1.21
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.22
>
>
> A long time ago I performed a similar task 
> (https://issues.apache.org/jira/browse/NUTCH-1273) to address all deprecation 
> warnings flagged across the Nutch codebase.
> This time around I want to do the same but also plan to include a deprecation 
> check as part of GitHub CI so we keep on top of deprecation issues into the 
> future.
> Patch coming up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
