[ 
https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772827#comment-16772827
 ] 

Subasini Rath commented on CONNECTORS-1563:
-------------------------------------------

Hi Shinichiro,
My requirement is to crawl a website, not a file system. That is why I am
using the Web repository.

Also, could you please let me know: when ManifoldCF sends a document to Solr,
which field does it write the actual document content to, without any metadata?


I didn't get your point about the Simple History.
My Solr log does not show any errors.
The ManifoldCF log is as follows:

======
ERROR 2019-02-18T21:19:25,484 (qtp1619356001-411) - Missing resource bundle 'org.apache.manifoldcf.agents.output.solr.common' for locale 'en': Can't find bundle for base name org.apache.manifoldcf.agents.output.solr.common, locale en; trying en_US
java.util.MissingResourceException: Can't find bundle for base name org.apache.manifoldcf.agents.output.solr.common, locale en
        at java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1573) ~[?:1.8.0_181]
        at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1396) ~[?:1.8.0_181]
        at java.util.ResourceBundle.getBundle(ResourceBundle.java:1091) ~[?:1.8.0_181]
        at org.apache.manifoldcf.core.i18n.Messages.getResourceBundle(Messages.java:142) [mcf-core.jar:?]
        at org.apache.manifoldcf.core.i18n.Messages.getMessage(Messages.java:178) [mcf-core.jar:?]
        at org.apache.manifoldcf.core.i18n.Messages.getString(Messages.java:216) [mcf-core.jar:?]
        at org.apache.manifoldcf.agents.output.solr.Messages.getString(Messages.java:91) [mcf-solr-connector.jar:?]
        at org.apache.manifoldcf.agents.output.solr.Messages.getString(Messages.java:39) [mcf-solr-connector.jar:?]
        at org.apache.manifoldcf.agents.output.solr.SolrConnector.outputConfigurationHeader(SolrConnector.java:637) [mcf-solr-connector.jar:?]
        at org.apache.manifoldcf.core.interfaces.ConnectorFactory.outputThisConfigurationHeader(ConnectorFactory.java:71) [mcf-core.jar:?]
        at org.apache.manifoldcf.agents.interfaces.OutputConnectorFactory.outputConfigurationHeader(OutputConnectorFactory.java:98) [mcf-agents.jar:?]
        at org.apache.jsp.editoutput_jsp._jspService(editoutput_jsp.java:423) [jsp/:?]
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) [jasper-6.0.35.jar:6.0.35]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
        at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388) [jasper-6.0.35.jar:6.0.35]
        at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313) [jasper-6.0.35.jar:6.0.35]
        at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260) [jasper-6.0.35.jar:6.0.35]
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577) [jetty-security-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.Server.handle(Server.java:497) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610) [jetty-util-9.2.3.v20140905.jar:9.2.3.v20140905]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539) [jetty-util-9.2.3.v20140905.jar:9.2.3.v20140905]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]





Thanks & Regards,
Subasini Rath
O: +91-33 6636-8889 
M: +91 983-1234-341
Email: subasini.r...@endeavourenergy.com.au



> SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream 
> must have > 0 bytes
> -----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1563
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1563
>             Project: ManifoldCF
>          Issue Type: Task
>          Components: Lucene/SOLR connector
>            Reporter: Sneha
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: Document simple history.docx, Manifold and Solr 
> settings_CustomField.docx, managed-schema, manifold settings.docx, 
> manifoldcf.log, path.png, schema.png, solr.log, solrconfig.xml
>
>
> I am encountering this problem:
> I have checked "Use the Extract Update Handler:" param then I am getting an 
> error on Solr i.e. null:org.apache.solr.common.SolrException: 
> org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 
> bytes
> If I ignore the Tika exception, my documents get indexed but don't have a
> content field in Solr.
> I am using Solr 7.3.1 and manifoldCF 2.8.1
> I am using solr cell and hence not configured external tika extractor in 
> manifoldCF pipeline
> Please help me with this problem
> Thanks in advance



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
