Hello,

I am working on a project in Spain that uses Solr 7.7.2, and we see strange
behavior every time we try to index files under a path that contains special
characters (for example, the accented "ó" in the path shown below).

In the Solr logs we see an error saying that the file does not exist, but it
does: the Linux "ls" command finds it, and a simple Java class can open and
read it without any error.

I have tried adding JVM arguments to Solr and to our application to set the
encoding to UTF-8 or ISO-8859-1 (for example -Dfile.encoding=UTF-8
-Dsun.jnu.encoding=UTF-8), but the error persists.
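
To see which encodings a JVM actually picks up, a small check like the one
below could be run as the same user and from the same environment that starts
Solr. This is only a sketch: the class name EncodingCheck and the helper hex()
are made up for illustration. It also rests on my assumption that the values
may differ between an interactive shell and the Solr service, since as far as
I understand sun.jnu.encoding is normally derived from the process locale
(LANG / LC_ALL).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class, not part of Solr.
public class EncodingCheck {

    // Returns the bytes of s in the given charset as lowercase hex.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Encoding-related settings as seen by this JVM.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());

        // "Formación", written with a Unicode escape so the result does not
        // depend on the encoding of this source file.
        String name = "Formaci\u00f3n";
        System.out.println("UTF-8      : " + hex(name, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 : " + hex(name, StandardCharsets.ISO_8859_1));
    }
}
```

The two hex lines show how the same directory name turns into different bytes
(c3 b3 versus f3 for the "ó") depending on the charset. If the JVM encodes the
name differently from how it is stored on disk, the open call would look for a
name that does not exist.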

Has anyone faced something similar?

2021-05-13 15:48:05.996 ERROR (qtp634540230-22) [c:default s:shard1 r:core_node1 x:default] o.a.s.s.HttpSolrCall null:java.io.FileNotFoundException:  /files/Formación/Documentacion de cursos/Sistemas de Información/dummy.pdf   (No existe el fichero o el directorio)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
*        at org.apache.solr.common.util.ContentStreamBase$FileStream.getStream(ContentStreamBase.java:190)*
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:162)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:502)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
        at java.lang.Thread.run(Thread.java:748)


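# TestSolrFile.java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TestSolrFile {
        public static void main(String[] args) throws IOException {
                // Open the path given on the command line; try-with-resources
                // closes the stream even if a read fails.
                try (InputStream input = new FileInputStream(args[0])) {
                        // Read the file to the end, discarding the bytes.
                        while (input.read() != -1) {
                                // nothing to do
                        }
                }
        }
}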

java -cp . TestSolrFile "/files/Formación/Documentacion de cursos/Sistemas de Información/dummy.pdf"
*-- there are no errors*

server:/home/user1> ls -ltr "/files/Formación/Documentacion de cursos/Sistemas de Información/dummy.pdf"
-rw-r-----. 1 user1 user1 738 may  2 00:20 /files/Formación/Documentacion de cursos/Sistemas de Información/dummy.pdf
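
One more thing I want to rule out is how the path is encoded in the request
itself. The ContentStreamBase$FileStream frame in the trace makes me assume the
path reaches Solr as a stream.file request parameter, so the sketch below
percent-encodes it as UTF-8 before building the URL. The class name
StreamFileUrl, the host and port, the /update/extract handler path and the
shortened example path are all assumptions for illustration; only the core name
"default" comes from the log above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Solr or of our application.
public class StreamFileUrl {

    // Percent-encodes a file path as UTF-8 for use as a query parameter value.
    static String encodePath(String path) {
        try {
            return URLEncoder.encode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened example path; pass the real one as the first argument.
        String path = args.length > 0 ? args[0] : "/files/Formaci\u00f3n/dummy.pdf";
        // Assumed host, port and handler path.
        String url = "http://localhost:8983/solr/default/update/extract?stream.file="
                + encodePath(path);
        System.out.println(url);
    }
}
```

With this encoding the "ó" is sent as %C3%B3 and each space as "+". If the
client sends raw or ISO-8859-1 encoded bytes instead, the server could decode a
different file name than the one on disk, which might explain the error, but I
have not confirmed that this is what happens here.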


Thanks in advance

-- 
*Remo Martins Furlanetto*
*E-mail:* *remofurlane...@gmail.com <remofurlane...@gmail.com>*
*Telefone:* +34 611 644 314
