Hi, I'm sorry, but it doesn't work completely yet. I'm not sure if you're still working on it; in case you aren't, here are the actions where I noticed some problems:
From lcf-crawler-ui:
List Repository Connections -> [Web Connector] -> View => Empty screen
List Repository Connections -> [Web Connector] -> Edit => Exception
List Repository Connections -> [Web Connector] -> Create => Exception when using button "Continue"
Problems when creating a new job using a Web Connector.

I don't want to rush you... just in case you didn't notice.

Carina

> -------- Original Message --------
> Date: Mon, 26 Jul 2010 10:26:10 +0200
> From: [email protected]
> To: [email protected]
> Subject: RE: RE: Beginner's question
>
> Same problem, different place.
> Fix checked in, and I've audited the webconnector code for this issue
> more thoroughly, so you should not see it there again. I'll be auditing
> other connectors as well.
> Karl
>
> ________________________________________
> From: ext [email protected] [[email protected]]
> Sent: Monday, July 26, 2010 1:45 AM
> To: [email protected]
> Subject: Re: RE: Beginner's question
>
> Hi,
>
> thanks a lot for fixing it. :)
> When starting the job I receive a NPE in the lcf-logfiles.
> -----------------------------------
> [Startup thread] FATAL org.apache.lcf.crawlerthreads - Error tossed: null
> java.lang.NullPointerException
>         at java.io.StringReader.<init>(StringReader.java:33)
>         at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.stringToArray(WebcrawlerConnector.java:6681)
>         at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector$DocumentURLFilter.<init>(WebcrawlerConnector.java:7158)
>         at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.addSeedDocuments(WebcrawlerConnector.java:460)
>         at org.apache.lcf.crawler.connectors.BaseRepositoryConnector.addSeedDocuments(BaseRepositoryConnector.java:243)
>         at org.apache.lcf.crawler.system.StartupThread.run(StartupThread.java:184)
> -----------------------------------
> The seed I entered was something like "www.apache.org" or
> "http://www.apache.org".
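A note on the trace above: its top frames show a `java.io.StringReader` being constructed inside `stringToArray`, and that constructor throws a `NullPointerException` when it is handed a null string. The connector source is not quoted in this thread, so the following is only a minimal sketch of a null-safe variant, assuming the method splits a multi-line form value into entries. The class name `SeedListParser` and the method `stringToList` are hypothetical and are not part of LCF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not LCF code: splits a newline-separated value into entries.
class SeedListParser {
    /** Returns the trimmed, non-empty lines of value; a null value yields an empty list. */
    static List<String> stringToList(String value) {
        List<String> result = new ArrayList<String>();
        if (value == null) {
            // new StringReader(null) would throw NullPointerException, so return early.
            return result;
        }
        try {
            BufferedReader reader = new BufferedReader(new StringReader(value));
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.length() > 0) {
                    result.add(trimmed);
                }
            }
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; surface it if it does.
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stringToList(null));
        System.out.println(stringToList("http://www.apache.org\n\nwww.apache.org\n"));
    }
}
```

Whether an unset value should mean "no entries" or should be rejected earlier in the UI is a design choice that the trace alone does not settle.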
>
> And some minor problems: after creating a webcrawler job and clicking "View"
> in the "List all Jobs" tab, or "Save" after having selected the "Edit"
> dialog, I still receive an empty screen.
>
> Carina
>
> -------- Original Message --------
> Date: Fri, 23 Jul 2010 14:52:23 +0200
> From: [email protected]
> To: [email protected]
> Subject: RE: Beginner's question
>
> Done. r967081.
>
> Karl
>
> From: Wright Karl (Nokia-MS/Cambridge)
> Sent: Friday, July 23, 2010 8:39 AM
> To: [email protected]
> Subject: RE: Beginner's question
>
> It appears that work done for the API inadvertently broke the web
> connector UI. I'll check in a fix shortly.
>
> Karl
>
> From: Wright Karl (Nokia-MS/Cambridge)
> Sent: Friday, July 23, 2010 8:32 AM
> To: [email protected]
> Subject: RE: Beginner's question
>
> Your configuration looks reasonable. Do you see any stack traces in
> either the LCF log or the Tomcat log?
>
> I'll try the same thing here and see what happens.
>
> Karl
>
> From: ext [email protected] [mailto:[email protected]]
> Sent: Friday, July 23, 2010 8:27 AM
> To: [email protected]
> Subject: Re: Beginner's question
>
> Hi,
>
> I'm still having the problem I explained below:
> When I create a new job choosing a web connector I receive an empty
> screen when clicking on one of the other tabs (Scheduling etc.).
> When selecting a Filesys Connector everything works fine.
>
> I think I might have an error in my web connector configuration.
>
> Name: Web Con
> Description:
> ________________________________
> Connection type: Web Connector
> Max connections: 10
> Authority: None (global authority)
> ________________________________
> Throttling:
>   Bin regular expression | Description | Max avg fetches/min
>   No throttles
> ________________________________
> Email address: [email protected]
> Robots usage: Obey robots.txt for all fetches
> Bandwidth throttling:
>   Bin regular expression | Case insensitive? | Max connections | Max kbytes/sec | Max fetches/min
>   No bandwidth throttling
> Page access credentials:
>   URL regular expression | Credential type | Credential domain | User name
>   No page access credentials
> Session-based access credentials:
>   URL regular expression | Login pages
>   No session-based access credentials
> Trust certificates:
>   URL regular expression | Certificate
>   No trust certificates
> ________________________________
> Connection status: Connection working
>
> Any ideas?
> Carina
>
> -------- Original Message --------
> Date: Wed, 21 Jul 2010 16:04:10 +0200
> From: Marc Emery <[email protected]>
> To: [email protected]
> Subject: Re: Beginner's question
>
> Hi,
> It works, thanks a lot.
>
> Cheers
>
> 2010/7/21 <[email protected]>
>
> Code has just been checked in which fixes this subtle but nasty bug.
>
> Let me know what happens now. ;-)
> Karl
>
> -----Original Message-----
> From: Wright Karl (Nokia-MS/Cambridge)
> Sent: Wednesday, July 21, 2010 8:50 AM
> To: [email protected]
> Subject: RE: Beginner's question
>
> Well, that explains why your test isn't succeeding.
>
> I think I've found the cause of the problem, however.
> It is *indeed* the language default used by Derby. The following code is the problem:
>
> >>>>>>
>   protected LCFException reinterpretException(LCFException theException)
>   {
>     if (Logging.db.isDebugEnabled())
>       Logging.db.debug("Reinterpreting exception '"+theException.getMessage()+"'. The exception type is "+Integer.toString(theException.getErrorCode()));
>     if (theException.getErrorCode() != LCFException.DATABASE_CONNECTION_ERROR)
>       return theException;
>     Throwable e = theException.getCause();
>     if (!(e instanceof java.sql.SQLException))
>       return theException;
>     if (Logging.db.isDebugEnabled())
>       Logging.db.debug("Exception "+theException.getMessage()+" is possibly a transaction abort signal");
>     String message = e.getMessage();
>     if (message.indexOf("due to a deadlock") != -1)
>       return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
>     // Note well: We also have to treat 'duplicate key' as a transaction abort, since this is what you get when two threads attempt to
>     // insert the same row. (Everything only works, then, as long as there is a unique constraint corresponding to every bad insert that
>     // one could make.)
>     if (message.indexOf("duplicate key") != -1)
>       return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
>     if (Logging.db.isDebugEnabled())
>       Logging.db.debug("Exception "+theException.getMessage()+" is NOT a transaction abort signal");
>     return theException;
>   }
> <<<<<<
>
> It looks like Derby has a specific exception class instead for these
> kinds of exceptions, so I will be able to test them directly rather than
> look at text. Stay tuned.
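To illustrate the point about testing the exception directly instead of its text: the French trace further down this thread shows `java.sql.SQLIntegrityConstraintViolationException` as the cause, with no English "duplicate key" in the message. The following is a minimal sketch of a locale-independent check based on `SQLException.getSQLState()`. It is not the fix that was checked in. The class name `AbortClassifier` is hypothetical, and the SQLState values "23505" (duplicate key) and "40001" (deadlock) are assumptions about what Derby reports that should be verified against the Derby documentation.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Hypothetical helper, not LCF code: classifies a cause without reading its message text.
class AbortClassifier {
    // Assumed Derby SQLState values; verify against the Derby error reference.
    static final String DUPLICATE_KEY_STATE = "23505";
    static final String DEADLOCK_STATE = "40001";

    /** Returns true when the cause looks like a retryable transaction abort. */
    static boolean isTransactionAbort(Throwable cause) {
        if (!(cause instanceof SQLException)) {
            return false;
        }
        // The SQLState does not change with the JVM locale, unlike getMessage().
        String state = ((SQLException) cause).getSQLState();
        return DUPLICATE_KEY_STATE.equals(state) || DEADLOCK_STATE.equals(state);
    }

    public static void main(String[] args) {
        SQLException localized = new SQLIntegrityConstraintViolationException(
            "localized text without any English keywords", DUPLICATE_KEY_STATE);
        System.out.println(isTransactionAbort(localized));
        System.out.println(isTransactionAbort(new SQLException("syntax error", "42X01")));
    }
}
```

Comparing exact SQLState values keeps the check as narrow as the original "duplicate key" and "deadlock" tests; matching on the exception class alone would also catch other integrity violations.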
>
> Karl
>
> -----Original Message-----
> From: ext [email protected] [mailto:[email protected]]
> Sent: Wednesday, July 21, 2010 8:25 AM
> To: [email protected]
> Subject: Re: Beginner's question
>
> Hi,
>
> I'm getting the same exception as Marc, except that on my machine it's
> German text ;o)
> I tried it first with JDK 1.6_13, then updated to 1.6_21 based on a new
> SVN update. But I haven't been successful yet.
>
> Carina
>
> -------- Original Message --------
> > Date: Wed, 21 Jul 2010 12:13:22 +0200
> > From: [email protected]
> > To: [email protected]
> > Subject: Re: Beginner's question
>
> > I'm definitely not seeing this behavior here, with Sun JDK 1.6. It's
> > worth getting to the bottom of.
> >
> > Can you do the following:
> >
> > (1) svn co a completely fresh version of LCF
> > (2) Run ant, making sure ant is actually using JDK 1.6
> >
> > If you *still* get this problem, please let me know. It's not clear what
> > the difference is, but there's got to be a difference somewhere. I hope it
> > is not how Derby works on French machines.
> > ;-)
> >
> > Karl
> >
> > >>>>>>
> > Worker thread aborting and restarting due to database connection reset: Database exception: Exception doing query: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> > org.apache.lcf.core.interfaces.LCFException: Database exception: Exception doing query: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >         at org.apache.lcf.core.database.Database.executeViaThread(Database.java:421)
> >         at org.apache.lcf.core.database.Database.executeUncachedQuery(Database.java:449)
> >         at org.apache.lcf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
> >         at org.apache.lcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
> >         at org.apache.lcf.core.database.Database.executeQuery(Database.java:167)
> >         at org.apache.lcf.core.database.DBInterfaceDerby.performModification(DBInterfaceDerby.java:615)
> >         at org.apache.lcf.core.database.DBInterfaceDerby.performInsert(DBInterfaceDerby.java:177)
> >         at org.apache.lcf.core.database.BaseTable.performInsert(BaseTable.java:76)
> >         at org.apache.lcf.agents.incrementalingest.IncrementalIngester.noteDocumentIngest(IncrementalIngester.java:1267)
> >         at org.apache.lcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:410)
> >         at org.apache.lcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:304)
> >         at org.apache.lcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1586)
> >         at org.apache.lcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
> >         at org.apache.lcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:516)
> >         at org.apache.lcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> > Caused by: java.sql.SQLIntegrityConstraintViolationException: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >         at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> >         at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
> >         at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
> >         at org.apache.lcf.core.database.Database.execute(Database.java:566)
> >         at org.apache.lcf.core.database.Database$ExecuteQueryThread.run(Database.java:381)
> > Caused by: java.sql.SQLException: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >         at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> >         at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
> >         ... 11 more
> >
> > However, I can start Jetty and get the UI working.
> >
> > Thanks
> > marc
> > <<<<<<
