Hi,
any tips on how to solve the problem?

K

On Tue, Dec 16, 2014 at 05:06:33PM +0100, Kamil Żyta wrote:
> All *.7z files causes a problem. Example in attachment.
>
> K
>
> On Tue, Dec 16, 2014 at 10:20:24AM -0500, Karl Wright wrote:
> > Hi Kamil,
> >
> > If it happens again, see if you can find an archive file that it happens
> > on. It's easy to do that: you just want to drop the file you suspect down
> > in the file system somewhere, and set up a file system job to crawl that
> > one file, making sure you send it through the Tika transformer of course.
> > You can use a null output connection.
> >
> > If you can reproduce the problem with just that one file, then if you send
> > me the file I can work with it here and determine whether the problem is
> > local to your system or is a more general issue.
> >
> > Thanks,
> > Karl
> >
> > On Tue, Dec 16, 2014 at 9:55 AM, Kamil Żyta <[email protected]> wrote:
> > > err, only a few jobs causes a problem (the rest probably does not have
> > > archives).
> > > I don't know which files you ask.
> > >
> > > K
> > >
> > > On Tue, Dec 16, 2014 at 03:15:16PM +0100, Kamil Żyta wrote:
> > > > Ok, I rebuilt mcf and the problem still was so I restart all jobs and no
> > > > problem.
> > > > Thx Karl for your time.
> > > >
> > > > K
> > > >
> > > > On Tue, Dec 16, 2014 at 07:28:34AM -0500, Karl Wright wrote:
> > > > > The commons-compress code just makes a simple reference to the Coder class,
> > > > > with no reflection or anything suspicious going on.
> > > > >
> > > > > If you can send me the binary file that causes this issue, I can verify
> > > > > whether it happens here or not. (If this seems to happen for you on ALL
> > > > > files, then I can already assure you that it does not happen here, and
> > > > > you've got something very special happening in your environment.) If this
> > > > > is not reproducible here then you probably have a corrupt commons-compress
> > > > > jar and need to download it again. If it *is* reproducible, then probably
> > > > > we will need to create an Oracle Java bug ticket.
> > > > >
> > > > > Karl
> > > > >
> > > > > On Tue, Dec 16, 2014 at 7:17 AM, Karl Wright <[email protected]> wrote:
> > > > > > Ok, then I don't understand it. It may be a bug in commons-compress, or
> > > > > > maybe even a bad jar. I'll have a look at their code and see if I can
> > > > > > figure out why that class won't load.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > > On Tue, Dec 16, 2014 at 7:06 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > I use multiprocess-zk-example, external pgsql db and splited war.
> > > > > > >
> > > > > > > K
> > > > > > >
> > > > > > > On Tue, Dec 16, 2014 at 07:02:47AM -0500, Karl Wright wrote:
> > > > > > > > Hi Kamil,
> > > > > > > >
> > > > > > > > Which example are you using? is this with the combined war, or is it one
> > > > > > > > of the multiprocess examples, or is it the single-process quick start?
> > > > > > > >
> > > > > > > > I really don't have any idea why a class that IS found in a particular jar
> > > > > > > > cannot in turn find another class in the same jar, so I'll need as many
> > > > > > > > details as possible.
> > > > > > > >
> > > > > > > > Karl
> > > > > > > >
> > > > > > > > On Tue, Dec 16, 2014 at 6:54 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > find . -iname 'commons-compress*'
> > > > > > > > > ./lib/commons-compress-1.8.1.jar
> > > > > > > > > ./dist/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/dist/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/crawler-ui-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/combined-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/authority-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/crawler-ui/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/api-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/combined-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/authority-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > > ./framework/build/webapp/api-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > > > >
> > > > > > > > > I follow
> > > > > > > > > https://manifoldcf.apache.org/release/trunk/en_US/how-to-build-and-deploy.html#Building+the+framework+and+the+connectors+using+Apache+Ant
> > > > > > > > > There aren't anything about clean-core-deps. I checkout fresh
> > > > > > > > > release-1.8-branch.
> > > > > > > > >
> > > > > > > > > K
> > > > > > > > >
> > > > > > > > > On Tue, Dec 16, 2014 at 06:38:18AM -0500, Karl Wright wrote:
> > > > > > > > > > Hi Kamil,
> > > > > > > > > >
> > > > > > > > > > I've confirmed that this should not be a classloader issue. The class in
> > > > > > > > > > question is in commons-compress.jar at the root level (under dist/lib).
> > > > > > > > > > The only way this would not be loadable is if you had TWO commons-compress
> > > > > > > > > > jars in your lib area. This is possible if you upgraded to mcf 1.8 and did
> > > > > > > > > > not do a make clean-core-deps before you did a make-core-deps, because now
> > > > > > > > > > all jars have versions attached to their names.
> > > > > > > > > >
> > > > > > > > > > Please confirm you do not have duplicate jars in this directory.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Karl
> > > > > > > > > >
> > > > > > > > > > On Tue, Dec 16, 2014 at 6:31 AM, Karl Wright <[email protected]> wrote:
> > > > > > > > > > > Hi Kamil,
> > > > > > > > > > >
> > > > > > > > > > > Your problem looks like a potential classloader issue. Let me do some
> > > > > > > > > > > research and get back to you.
> > > > > > > > > > >
> > > > > > > > > > > Karl
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Dec 16, 2014 at 5:34 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > thx Karl but now I have new issue:
> > > > > > > > > > > >
> > > > > > > > > > > > FATAL 2014-12-16 11:12:58,496 (Worker thread '47') - Error tossed: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
> > > > > > > > > > > > java.lang.NoClassDefFoundError: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
> > > > > > > > > > > >     at org.apache.commons.compress.archivers.sevenz.SevenZFile.readEncodedHeader(SevenZFile.java:279)
> > > > > > > > > > > >     at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:191)
> > > > > > > > > > > >     at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:95)
> > > > > > > > > > > >     at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:117)
> > > > > > > > > > > >     at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:130)
> > > > > > > > > > > >     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> > > > > > > > > > > >     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> > > > > > > > > > > >     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
> > > > > > > > > > > >     at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:230)
> > > > > > > > > > > >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3257)
> > > > > > > > > > > >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3108)
> > > > > > > > > > > >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2739)
> > > > > > > > > > > >     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:792)
> > > > > > > > > > > >     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1610)
> > > > > > > > > > > >     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1558)
> > > > > > > > > > > >     at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:911)
> > > > > > > > > > > >     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:383)
> > > > > > > > > > > >
> > > > > > > > > > > > And another question: I use Solr 4.10 with Tika 1.5. MCF 1.8 have tika
> > > > > > > > > > > > 1.6. How this affect document parsing?
> > > > > > > > > > > >
> > > > > > > > > > > > K
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Dec 15, 2014 at 08:45:31AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > If you changed this file, you would need to rerun initialize.sh in
> > > > > > > > > > > > > order to register the connector.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Karl
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Dec 15, 2014 at 8:42 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > > > the same as connectors.xml:
> > > > > > > > > > > > > > (...)
> > > > > > > > > > > > > > <repositoryconnector name="Windows shares" class="org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector"/>
> > > > > > > > > > > > > > (...)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > K
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 08:39:07AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > > > Hi Kamil,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What does connectors-proprietary.xml say about the jcifs connector?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 8:35 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > > > > > Right, thx. Another problem:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector(uninstalled)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > properties.xml:
> > > > > > > > > > > > > > > > <libdir path="../connector-lib-proprietary"/>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > cat ../connectors.xml
> > > > > > > > > > > > > > > > <repositoryconnector name="Windows shares" class="org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector"/>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ls ../connector-lib-proprietary
> > > > > > > > > > > > > > > > jcifs.jar
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think I checked/restarted everything.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > K
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 08:00:12AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > > > > > You have to run ./initialize.sh on the MCF 1.8 codebase for the
> > > > > > > > > > > > > > > > > upgrade to take place.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 7:43 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > With release-1.8-branch is the same problem.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > K
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 06:47:12AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > > > > > > > Hi Kamil,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > You cannot upgrade to trunk from 1.x.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Try upgrading to branches/release-1.8-branch.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Mon, Dec 15, 2014 at 3:39 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > > > > > > after upgrading to trunk I get 'Database exception: SQLException doing
> > > > > > > > > > > > > > > > > > > > query (42703): ERROR: column "needpriority" does not exist'.
> > > > > > > > > > > > > > > > > > > > How can I upgrade db schema? I tried ./initialize.sh without success.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > K
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Fri, Dec 12, 2014 at 10:40:39AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > > > > > > > > > Ok, committed a fix. CONNECTORS-1121.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Fri, Dec 12, 2014 at 10:32 AM, Karl Wright <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > > > > > Ah, thanks, this is due to changes I made yesterday.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hold on.
> > > > > > > > > > > > > > > > > > > > > > Karl
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Fri, Dec 12, 2014 at 10:12 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > On Fri, Dec 12, 2014 at 09:55:41AM -0500, Karl Wright wrote:
> > > > > > > > > > > > > > > > > > > > > > > > I've created CONNECTORS-1120 for this fix. I should have something to
> > > > > > > > > > > > > > > > > > > > > > > > try shortly.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > I can't build mcf from source:
> > > > > > > > > > > > > > > > > > > > > > > BUILD FAILED
> > > > > > > > > > > > > > > > > > > > > > > /opt/mcf-trunk/build.xml:1438: Can't get
> > > > > > > > > > > > > > > > > > > > > > > https://www.apache.org/dist/manifoldcf/apache-manifoldcf-elasticsearch-plugin-2.0-bin.zip
> > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > /opt/mcf-trunk/build/download/apache-manifoldcf-elasticsearch-plugin-bin.zip
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > K
