Hi Erlend,

This is expected behavior. ManifoldCF is designed to retry, for a period of time, on errors that may indicate a server problem, and then give up. The logic is that a 500 may well mean the server is down but will be rebooted or otherwise fixed. It retries every N minutes until M minutes have elapsed. For a 500 error, I believe it's every 5 minutes for either 6 or 12 hours.
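To make the "every N minutes until M minutes have elapsed" schedule concrete, here is a minimal sketch. The class `RetryWindow` and its methods are hypothetical names invented for illustration; this is not ManifoldCF's actual code or API, and the 5-minute and 6-hour values are only the estimate given above.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical illustration of "retry every N minutes until M minutes have
// elapsed". This is NOT ManifoldCF's actual code or API.
final class RetryWindow {
    private final Duration interval;    // N: wait between attempts
    private final Duration giveUpAfter; // M: total time before giving up

    RetryWindow(Duration interval, Duration giveUpAfter) {
        if (interval.isZero() || interval.isNegative()) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.giveUpAfter = giveUpAfter;
    }

    // Delay before the next attempt, or empty once the next attempt would
    // fall outside the give-up window.
    Optional<Duration> nextDelay(Duration elapsedSinceFirstFailure) {
        Duration nextAttemptAt = elapsedSinceFirstFailure.plus(interval);
        return nextAttemptAt.compareTo(giveUpAfter) > 0
                ? Optional.empty()
                : Optional.of(interval);
    }

    // Upper bound on the number of retries that fit in the window.
    long maxRetries() {
        return giveUpAfter.toMillis() / interval.toMillis();
    }

    public static void main(String[] args) {
        // Assumed values: 5 minutes and 6 hours, taken from the estimate above.
        RetryWindow w = new RetryWindow(Duration.ofMinutes(5), Duration.ofHours(6));
        System.out.println("max retries: " + w.maxRetries());
        System.out.println("retry at 1h elapsed? " + w.nextDelay(Duration.ofHours(1)).isPresent());
        System.out.println("retry at 6h elapsed? " + w.nextDelay(Duration.ofHours(6)).isPresent());
    }
}
```

Under these assumed values, a document that keeps returning 500 would be refetched at most 72 times before being abandoned; with the 12-hour alternative the same arithmetic gives 144.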
Karl

On Fri, Sep 28, 2012 at 5:49 AM, Erlend Garåsen <e.f.gara...@usit.uio.no> wrote:
>
> I'm trying to start a crawl before I have to run to the airport. I just
> discovered that MCF recrawls the same host over and over again when it
> returns result code 500:
> 09-28-2012 11:40:11.024 fetch http://foreninger.uio.no/go/oslo_open_2012_no.php 500
>
> It's just not this document, but several others returning the same HTTP
> result code.
>
> Meanwhile, the following is filling up my log:
> FATAL 2012-09-28 11:42:32,112 (Worker thread '29') - Error tossed: String index out of range: -1
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>
> I'm pretty sure they are related to each other.
>
> I will end this job before I leave because I'm afraid that MCF will try to
> fetch these documents over and over again during this weekend.
>
> Erlend
>
>
> On 28.09.12 09.58, Karl Wright wrote:
>>
>> Please vote +1 to release ManifoldCF 1.0, RC5. The release artifact
>> can be found at:
>>
>> http://people.apache.org/~kwright/apache-manifoldcf-1.0
>>
>> There is also an SVN tag at:
>>
>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-1.0-RC5
>>
>> Fixes since RC4:
>>
>> CONNECTORS-545
>>
>> Fixes since RC3:
>>
>> CONNECTORS-544
>>
>
>
> --
> Erlend Garåsen
> Center for Information Technology Services
> University of Oslo
> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050