Hi Kamil,

Your email with the attachment never made it through to the mailing list. It was probably caught by the spam filter. I've opened a ticket, CONNECTORS-1125. Can you attach your example to the ticket?

Thanks!
Karl

On Thu, Dec 18, 2014 at 2:51 AM, Kamil Żyta <[email protected]> wrote:
> Hi,
> any tips on how to solve the problem?
>
> K
>
> On Tue, Dec 16, 2014 at 05:06:33PM +0100, Kamil Żyta wrote:
> > All *.7z files causes a problem. Example in attachment.
> >
> > K
> >
> > On Tue, Dec 16, 2014 at 10:20:24AM -0500, Karl Wright wrote:
> > > Hi Kamil,
> > >
> > > If it happens again, see if you can find an archive file that it happens on. It's easy to do that: you just want to drop the file you suspect down in the file system somewhere, and set up a file system job to crawl that one file, making sure you send it through the Tika transformer of course. You can use a null output connection.
> > >
> > > If you can reproduce the problem with just that one file, then if you send me the file I can work with it here and determine whether the problem is local to your system or is a more general issue.
> > >
> > > Thanks,
> > > Karl
> > >
> > > On Tue, Dec 16, 2014 at 9:55 AM, Kamil Żyta <[email protected]> wrote:
> > > > err, only a few jobs causes a problem (the rest probably does not have archives).
> > > > I don't know which files you ask.
> > > >
> > > > K
> > > >
> > > > On Tue, Dec 16, 2014 at 03:15:16PM +0100, Kamil Żyta wrote:
> > > > > Ok, I rebuilt mcf and the problem still was so I restart all jobs and no problem.
> > > > > Thx Karl for your time.
> > > > >
> > > > > K
> > > > >
> > > > > On Tue, Dec 16, 2014 at 07:28:34AM -0500, Karl Wright wrote:
> > > > > > The commons-compress code just makes a simple reference to the Coder class, with no reflection or anything suspicious going on.
> > > > > >
> > > > > > If you can send me the binary file that causes this issue, I can verify whether it happens here or not.
> > > > > > (If this seems to happen for you on ALL files, then I can already assure you that it does not happen here, and you've got something very special happening in your environment.) If this is not reproducible here then you probably have a corrupt commons-compress jar and need to download it again. If it *is* reproducible, then probably we will need to create an Oracle Java bug ticket.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > > On Tue, Dec 16, 2014 at 7:17 AM, Karl Wright <[email protected]> wrote:
> > > > > > > Ok, then I don't understand it. It may be a bug in commons-compress, or maybe even a bad jar. I'll have a look at their code and see if I can figure out why that class won't load.
> > > > > > >
> > > > > > > Karl
> > > > > > >
> > > > > > > On Tue, Dec 16, 2014 at 7:06 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> I use multiprocess-zk-example, external pgsql db and splited war.
> > > > > > >>
> > > > > > >> K
> > > > > > >>
> > > > > > >> On Tue, Dec 16, 2014 at 07:02:47AM -0500, Karl Wright wrote:
> > > > > > >> > Hi Kamil,
> > > > > > >> >
> > > > > > >> > Which example are you using? is this with the combined war, or is it one of the multiprocess examples, or is it the single-process quick start?
> > > > > > >> >
> > > > > > >> > I really don't have any idea why a class that IS found in a particular jar cannot in turn find another class in the same jar, so I'll need as many details as possible.
> > > > > > >> >
> > > > > > >> > Karl
> > > > > > >> >
> > > > > > >> > On Tue, Dec 16, 2014 at 6:54 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > find . -iname 'commons-compress*'
> > > > > > >> > > ./lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./dist/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/dist/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/crawler-ui-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/combined-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/authority-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/crawler-ui/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/api-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/combined-service/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/authority-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > > ./framework/build/webapp/api-service-proprietary/WEB-INF/lib/commons-compress-1.8.1.jar
> > > > > > >> > >
> > > > > > >> > > I follow
> > > > > > >> > > https://manifoldcf.apache.org/release/trunk/en_US/how-to-build-and-deploy.html#Building+the+framework+and+the+connectors+using+Apache+Ant
> > > > > > >> > > There aren't anything about clean-core-deps. I checkout fresh release-1.8-branch.
> > > > > > >> > >
> > > > > > >> > > K
> > > > > > >> > >
> > > > > > >> > > On Tue, Dec 16, 2014 at 06:38:18AM -0500, Karl Wright wrote:
> > > > > > >> > > > Hi Kamil,
> > > > > > >> > > >
> > > > > > >> > > > I've confirmed that this should not be a classloader issue. The class in question is in commons-compress.jar at the root level (under dist/lib). The only way this would not be loadable is if you had TWO commons-compress jars in your lib area. This is possible if you upgraded to mcf 1.8 and did not do a make clean-core-deps before you did a make-core-deps, because now all jars have versions attached to their names.
> > > > > > >> > > >
> > > > > > >> > > > Please confirm you do not have duplicate jars in this directory.
> > > > > > >> > > >
> > > > > > >> > > > Thanks,
> > > > > > >> > > > Karl
> > > > > > >> > > >
> > > > > > >> > > > On Tue, Dec 16, 2014 at 6:31 AM, Karl Wright <[email protected]> wrote:
> > > > > > >> > > > > Hi Kamil,
> > > > > > >> > > > >
> > > > > > >> > > > > Your problem looks like a potential classloader issue. Let me do some research and get back to you.
> > > > > > >> > > > >
> > > > > > >> > > > > Karl
> > > > > > >> > > > >
> > > > > > >> > > > > On Tue, Dec 16, 2014 at 5:34 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> thx Karl but now I have new issue:
> > > > > > >> > > > >>
> > > > > > >> > > > >> FATAL 2014-12-16 11:12:58,496 (Worker thread '47') - Error tossed: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
> > > > > > >> > > > >> java.lang.NoClassDefFoundError: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
> > > > > > >> > > > >>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.readEncodedHeader(SevenZFile.java:279)
> > > > > > >> > > > >>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:191)
> > > > > > >> > > > >>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:95)
> > > > > > >> > > > >>   at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:117)
> > > > > > >> > > > >>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:130)
> > > > > > >> > > > >>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> > > > > > >> > > > >>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
> > > > > > >> > > > >>   at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
> > > > > > >> > > > >>   at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:230)
> > > > > > >> > > > >>   at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3257)
> > > > > > >> > > > >>   at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3108)
> > > > > > >> > > > >>   at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2739)
> > > > > > >> > > > >>   at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:792)
> > > > > > >> > > > >>   at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1610)
> > > > > > >> > > > >>   at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1558)
> > > > > > >> > > > >>   at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:911)
> > > > > > >> > > > >>   at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:383)
> > > > > > >> > > > >>
> > > > > > >> > > > >> And another question: I use Solr 4.10 with Tika 1.5. MCF 1.8 have tika 1.6. How this affect document parsing?
> > > > > > >> > > > >>
> > > > > > >> > > > >> K
> > > > > > >> > > > >>
> > > > > > >> > > > >> On Mon, Dec 15, 2014 at 08:45:31AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > If you changed this file, you would need to rerun initialize.sh in order to register the connector.
> > > > > > >> > > > >> >
> > > > > > >> > > > >> > Karl
> > > > > > >> > > > >> >
> > > > > > >> > > > >> > On Mon, Dec 15, 2014 at 8:42 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> > > the same as connectors.xml:
> > > > > > >> > > > >> > > (...)
> > > > > > >> > > > >> > > <repositoryconnector name="Windows shares"
> > > > > > >> > > > >> > >   class="org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector"/>
> > > > > > >> > > > >> > > (...)
> > > > > > >> > > > >> > >
> > > > > > >> > > > >> > > K
> > > > > > >> > > > >> > >
> > > > > > >> > > > >> > > On Mon, Dec 15, 2014 at 08:39:07AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > > > Hi Kamil,
> > > > > > >> > > > >> > > >
> > > > > > >> > > > >> > > > What does connectors-proprietary.xml say about the jcifs connector?
> > > > > > >> > > > >> > > >
> > > > > > >> > > > >> > > > Karl
> > > > > > >> > > > >> > > >
> > > > > > >> > > > >> > > > On Mon, Dec 15, 2014 at 8:35 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> > > > > Right, thx. Another problem:
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector(uninstalled)
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > properties.xml:
> > > > > > >> > > > >> > > > > <libdir path="../connector-lib-proprietary"/>
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > > cat ../connectors.xml
> > > > > > >> > > > >> > > > > <repositoryconnector name="Windows shares"
> > > > > > >> > > > >> > > > >   class="org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector"/>
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > > ls ../connector-lib-proprietary
> > > > > > >> > > > >> > > > > jcifs.jar
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > I think I checked/restarted everything.
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > K
> > > > > > >> > > > >> > > > >
> > > > > > >> > > > >> > > > > On Mon, Dec 15, 2014 at 08:00:12AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > > > > > You have to run ./initialize.sh on the MCF 1.8 codebase for the upgrade to take place.
> > > > > > >> > > > >> > > > > >
> > > > > > >> > > > >> > > > > > Karl
> > > > > > >> > > > >> > > > > >
> > > > > > >> > > > >> > > > > > On Mon, Dec 15, 2014 at 7:43 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> > > > > > > With release-1.8-branch is the same problem.
> > > > > > >> > > > >> > > > > > >
> > > > > > >> > > > >> > > > > > > K
> > > > > > >> > > > >> > > > > > >
> > > > > > >> > > > >> > > > > > > On Mon, Dec 15, 2014 at 06:47:12AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > > > > > > > Hi Kamil,
> > > > > > >> > > > >> > > > > > > >
> > > > > > >> > > > >> > > > > > > > You cannot upgrade to trunk from 1.x.
> > > > > > >> > > > >> > > > > > > >
> > > > > > >> > > > >> > > > > > > > Try upgrading to branches/release-1.8-branch.
> > > > > > >> > > > >> > > > > > > >
> > > > > > >> > > > >> > > > > > > > Karl
> > > > > > >> > > > >> > > > > > > >
> > > > > > >> > > > >> > > > > > > > On Mon, Dec 15, 2014 at 3:39 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> > > > > > > > > Hi,
> > > > > > >> > > > >> > > > > > > > > after upgrading to trunk I get 'Database exception: SQLException doing query (42703): ERROR: column "needpriority" does not exist'.
> > > > > > >> > > > >> > > > > > > > > How can I upgrade db schema? I tried ./initialize.sh without success.
> > > > > > >> > > > >> > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > K
> > > > > > >> > > > >> > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > On Fri, Dec 12, 2014 at 10:40:39AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > > > > > > > > > Ok, committed a fix. CONNECTORS-1121.
> > > > > > >> > > > >> > > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > > Karl
> > > > > > >> > > > >> > > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > > On Fri, Dec 12, 2014 at 10:32 AM, Karl Wright <[email protected]> wrote:
> > > > > > >> > > > >> > > > > > > > > > > Ah, thanks, this is due to changes I made yesterday.
> > > > > > >> > > > >> > > > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > > > Hold on.
> > > > > > >> > > > >> > > > > > > > > > > Karl
> > > > > > >> > > > >> > > > > > > > > > >
> > > > > > >> > > > >> > > > > > > > > > > On Fri, Dec 12, 2014 at 10:12 AM, Kamil Żyta <[email protected]> wrote:
> > > > > > >> > > > >> > > > > > > > > > >> On Fri, Dec 12, 2014 at 09:55:41AM -0500, Karl Wright wrote:
> > > > > > >> > > > >> > > > > > > > > > >> > I've created CONNECTORS-1120 for this fix. I should have something to try shortly.
> > > > > > >> > > > >> > > > > > > > > > >> >
> > > > > > >> > > > >> > > > > > > > > > >>
> > > > > > >> > > > >> > > > > > > > > > >> I can't build mcf from source:
> > > > > > >> > > > >> > > > > > > > > > >> BUILD FAILED
> > > > > > >> > > > >> > > > > > > > > > >> /opt/mcf-trunk/build.xml:1438: Can't get
> > > > > > >> > > > >> > > > > > > > > > >> https://www.apache.org/dist/manifoldcf/apache-manifoldcf-elasticsearch-plugin-2.0-bin.zip
> > > > > > >> > > > >> > > > > > > > > > >> to
> > > > > > >> > > > >> > > > > > > > > > >> /opt/mcf-trunk/build/download/apache-manifoldcf-elasticsearch-plugin-bin.zip
> > > > > > >> > > > >> > > > > > > > > > >>
> > > > > > >> > > > >> > > > > > > > > > >> K
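
For anyone hitting the same "Could not initialize class" error, here is a minimal diagnostic sketch. It assumes it is run with the same classpath as the process that fails (for example the jars under dist/lib). The class name ClassLoadCheck and its two helper methods are hypothetical and are not part of ManifoldCF, Tika, or commons-compress. It does two things discussed in this thread: it lists every classpath location that provides a class, which would reveal duplicate jars, and it forces static initialization in a fresh JVM so that the underlying cause is reported. In Java, "NoClassDefFoundError: Could not initialize class X" is what later callers see after an earlier static initialization of X has already failed, so the first failure usually carries the useful detail.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of any library named in this thread.
public class ClassLoadCheck {

    /** Returns every classpath location that provides the given class (more than one suggests duplicate jars). */
    public static List<URL> locations(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClassLoadCheck.class.getClassLoader();
        return Collections.list(loader.getResources(resource));
    }

    /** Loads the class and runs its static initializers; returns the root cause of any failure, or null on success. */
    public static Throwable initFailure(String className) {
        try {
            Class.forName(className, true, ClassLoadCheck.class.getClassLoader());
            return null;
        } catch (Throwable t) {
            // Unwrap ExceptionInInitializerError and similar wrappers down to the first cause.
            Throwable root = t;
            while (root.getCause() != null) {
                root = root.getCause();
            }
            return root;
        }
    }

    public static void main(String[] args) throws IOException {
        // Default to the class named in the stack trace above; pass another name as the first argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.commons.compress.archivers.sevenz.Coders";

        for (URL url : locations(name)) {
            System.out.println("found in: " + url);
        }

        Throwable failure = initFailure(name);
        if (failure == null) {
            System.out.println("initialized without error");
        } else {
            System.out.println("initialization failed, root cause: " + failure);
        }
    }
}
```

One way to try it, assuming the compiled class sits in the current directory and the jars of interest are under dist/lib: java -cp 'dist/lib/*:.' ClassLoadCheck. If the class is found in a single jar but initialization still fails, the printed root cause should name whatever the static initializer could not load.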
