[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772827#comment-16772827 ]
Subasini Rath commented on CONNECTORS-1563:
-------------------------------------------

Hi Shinichiro,

My requirement is not to crawl a file system. My requirement is to crawl a website. That is the reason I am using the Web repository.

Also, could you please let me know: when ManifoldCF interacts with Solr, which field does it write the actual content of the document to, without any metadata?

I didn't get your point about Simple History. My Solr log does not show any error. The ManifoldCF log is as follows:

======
ERROR 2019-02-18T21:19:25,484 (qtp1619356001-411) - Missing resource bundle 'org.apache.manifoldcf.agents.output.solr.common' for locale 'en': Can't find bundle for base name org.apache.manifoldcf.agents.output.solr.common, locale en; trying en_US
java.util.MissingResourceException: Can't find bundle for base name org.apache.manifoldcf.agents.output.solr.common, locale en
	at java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1573) ~[?:1.8.0_181]
	at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1396) ~[?:1.8.0_181]
	at java.util.ResourceBundle.getBundle(ResourceBundle.java:1091) ~[?:1.8.0_181]
	at org.apache.manifoldcf.core.i18n.Messages.getResourceBundle(Messages.java:142) [mcf-core.jar:?]
	at org.apache.manifoldcf.core.i18n.Messages.getMessage(Messages.java:178) [mcf-core.jar:?]
	at org.apache.manifoldcf.core.i18n.Messages.getString(Messages.java:216) [mcf-core.jar:?]
	at org.apache.manifoldcf.agents.output.solr.Messages.getString(Messages.java:91) [mcf-solr-connector.jar:?]
	at org.apache.manifoldcf.agents.output.solr.Messages.getString(Messages.java:39) [mcf-solr-connector.jar:?]
	at org.apache.manifoldcf.agents.output.solr.SolrConnector.outputConfigurationHeader(SolrConnector.java:637) [mcf-solr-connector.jar:?]
	at org.apache.manifoldcf.core.interfaces.ConnectorFactory.outputThisConfigurationHeader(ConnectorFactory.java:71) [mcf-core.jar:?]
	at org.apache.manifoldcf.agents.interfaces.OutputConnectorFactory.outputConfigurationHeader(OutputConnectorFactory.java:98) [mcf-agents.jar:?]
	at org.apache.jsp.editoutput_jsp._jspService(editoutput_jsp.java:423) [jsp/:?]
	at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) [jasper-6.0.35.jar:6.0.35]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
	at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388) [jasper-6.0.35.jar:6.0.35]
	at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313) [jasper-6.0.35.jar:6.0.35]
	at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260) [jasper-6.0.35.jar:6.0.35]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577) [jetty-security-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [jetty-servlet-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.Server.handle(Server.java:497) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [jetty-server-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [jetty-io-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610) [jetty-util-9.2.3.v20140905.jar:9.2.3.v20140905]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539) [jetty-util-9.2.3.v20140905.jar:9.2.3.v20140905]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

Thanks & Regards,
Subasini Rath
O: +91-33 6636-8889
M: +91 983-1234-341
Email: subasini.r...@endeavourenergy.com.au

> SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> -----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1563
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1563
>             Project: ManifoldCF
>          Issue Type: Task
>          Components: Lucene/SOLR connector
>            Reporter: Sneha
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: Document simple history.docx, Manifold and Solr settings_CustomField.docx, managed-schema, manifold settings.docx, manifoldcf.log, path.png, schema.png, solr.log, solrconfig.xml
>
>
> I am encountering this problem:
> When I check the "Use the Extract Update Handler:" parameter, I get an error on Solr, i.e. null:org.apache.solr.common.SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> If I ignore the Tika exception, my documents get indexed but don't have a content field in Solr.
> I am using Solr 7.3.1 and ManifoldCF 2.8.1.
> I am using Solr Cell and hence have not configured the external Tika extractor in the ManifoldCF pipeline.
> Please help me with this problem.
> Thanks in advance

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)