I was able to reproduce the exception here using your URL.  It is
indeed a bug in how the 500 error is handled.  I've checked in a fix,
and will spin a new RC as soon as we resolve the Maven issue.  That
one turns out to be much thornier; in the meantime, running with
-DskipITs or -DskipTests instead would work.

Karl

On Fri, Sep 28, 2012 at 7:31 AM, Erlend Garåsen <[email protected]> wrote:
>
> OK, I will give you a stack trace at the beginning of next week.
>
> I will start the crawler once more, check the results when I'm back, and
> change my vote then if everything is OK.
>
> Erlend
>
>
> On 28.09.12 13.26, Karl Wright wrote:
>>
>> "Meanwhile, the following is filling up my log:
>> FATAL 2012-09-28 11:42:32,112 (Worker thread '29') - Error tossed:
>> String index out of range: -1
>> java.lang.StringIndexOutOfBoundsException: String index out of range: -1"
>>
>> This is indeed a problem I agree we should fix, but in order to do
>> that I need a stack trace.  It is not clear at all that it is related
>> to the 500 error you described before, but it could be.  I will create
>> a ticket for it though.
>> Karl
>>
>> On Fri, Sep 28, 2012 at 5:49 AM, Erlend Garåsen <[email protected]> wrote:
>>>
>>>
>>> I'm trying to start a crawl before I have to run to the airport. I just
>>> discovered that MCF recrawls the same host over and over again when it
>>> returns result code 500:
>>> 09-28-2012 11:40:11.024  fetch  http://foreninger.uio.no/go/oslo_open_2012_no.php  500
>>>
>>> It's not just this document; several others return the same HTTP
>>> result code.
>>>
>>> Meanwhile, the following is filling up my log:
>>> FATAL 2012-09-28 11:42:32,112 (Worker thread '29') - Error tossed: String
>>> index out of range: -1
>>> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>>>
>>> I'm pretty sure they are related to each other.
>>>
>>> I will end this job before I leave because I'm afraid that MCF will try
>>> to fetch these documents over and over again during this weekend.
>>>
>>> Erlend
>>>
>>>
>>> On 28.09.12 09.58, Karl Wright wrote:
>>>>
>>>>
>>>> Please vote +1 to release ManifoldCF 1.0, RC5.  The release artifact
>>>> can be found at:
>>>>
>>>> http://people.apache.org/~kwright/apache-manifoldcf-1.0
>>>>
>>>> There is also an SVN tag at:
>>>>
>>>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-1.0-RC5
>>>>
>>>> Fixes since RC4:
>>>>
>>>> CONNECTORS-545
>>>>
>>>> Fixes since RC3:
>>>>
>>>> CONNECTORS-544
>>>>
>>>
>>>
>>> --
>>> Erlend Garåsen
>>> Center for Information Technology Services
>>> University of Oslo
>>> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
>>> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
>
>
>
> --
> Erlend Garåsen
> Center for Information Technology Services
> University of Oslo
> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050