Hi Erlend,

MCF caches the robots.txt file in the database, which it considers valid for 1 hour.
I'll look at the logs and thread dump and let you know if this is a locking issue or something else. Please stand by.

Karl

On Thu, Sep 18, 2014 at 5:24 AM, Erlend Garåsen <e.f.gara...@usit.uio.no> wrote:

> I tried to restart the job dealing with www.duo.no on our test server,
> but it does not seem to touch the robots.txt file at all. That's the reason
> why it's able to continue. Both servers are set up to obey the rules of
> such files.
>
> Erlend
>
> On 18.09.14 11:12, Erlend Garåsen wrote:
>
>> I'm facing the same problems with robots.txt files using RC1, so maybe
>> this is another issue we have to fix. Can you please try to fetch the
>> host below? For some odd reason, it seems that MCF on our test server
>> can handle it.
>>
>> This is exactly the same thing that happened when I started MCF (referring to
>> my previous post) after I had deployed RC1:
>> 09-18-2014 11:02:14.400 robots parse https:www.duo.uio.no:443
>> ERRORS 0 3 Unknown robots.txt line: '===='
>>
>> No activity after this error.
>>
>> Here's the robots.txt file:
>> https://www.duo.uio.no/robots.txt
>>
>> This is the content of manifoldcf.log after the startup:
>> WARN 2014-09-18 11:02:14,401 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': '===='
>> WARN 2014-09-18 11:02:14,401 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': ' The contents of
>> this file are subject to the license and copyright'
>> WARN 2014-09-18 11:02:14,402 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': ' detailed in the
>> LICENSE and NOTICE files at the root of the source'
>> WARN 2014-09-18 11:02:14,402 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': ' tree and available
>> online at'
>> WARN 2014-09-18 11:02:14,402 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': '
>> http://www.dspace.org/license/'
>> WARN 2014-09-18 11:02:14,402 (Worker thread '19') - Web: Unknown
>> robots.txt line from 'https:www.duo.uio.no:443': '===='
>>
>> E
>>
>> On 18.09.14 03:12, Karl Wright wrote:
>>
>>> Please vote on whether to release Apache ManifoldCF 1.7.1, RC1.
>>>
>>> This release fixes a number of critical issues, as well as a number of
>>> user priorities, most notably:
>>>
>>> - A bad Zookeeper support issue, which made locking support fail when
>>>   Zookeeper connections got lost and then restored;
>>> - The Alfresco connector, which was nonfunctional in both MCF 1.6 and
>>>   1.7;
>>> - Solr Cloud support, which had ceased working due to changes to SolrJ;
>>> - Non-null connector components caused failure;
>>> - PostgreSQL queries not performing well.
>>>
>>> The complete list of included fixes can be found at:
>>>
>>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-1.7.1-RC1/CHANGES.txt
>>>
>>> The release candidate can be downloaded from:
>>>
>>> http://people.apache.org/~kwright/apache-manifoldcf-1.7.1
>>>
>>> There is a tag at:
>>>
>>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-1.7.1-RC1
>>>
>>> Thanks,
>>> Karl
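
For readers following the thread, the two behaviours discussed above (a cached robots.txt copy treated as valid for 1 hour, and unknown lines logged as warnings) can be illustrated with a rough sketch. This is not ManifoldCF's code, and whether MCF's parser behaves this way is exactly what the thread leaves open. The names `is_fresh`, `parse_robots`, `KNOWN_FIELDS` and `ROBOTS_TTL_SECONDS` are hypothetical, and the set of recognised fields is an assumption. The sketch only shows a parser that records unrecognised lines and keeps going.

```python
import time

# Assumption: one hour, matching the "valid for 1 hour" statement in the thread.
ROBOTS_TTL_SECONDS = 60 * 60

# Assumption: fields this sketch recognises; a real crawler may accept more or fewer.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def is_fresh(fetched_at, now=None, ttl=ROBOTS_TTL_SECONDS):
    """Return True while a cached copy is younger than ttl seconds."""
    if now is None:
        now = time.time()
    return (now - fetched_at) < ttl


def parse_robots(text):
    """Split robots.txt text into (records, unknown_lines).

    Unrecognised lines are collected so the caller can log them as
    warnings; they never abort parsing.
    """
    records = []
    unknown = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        field, sep, value = line.partition(":")
        name = field.strip().lower()
        if sep and name in KNOWN_FIELDS:
            records.append((name, value.strip()))
        else:
            unknown.append(line)
    return records, unknown


# The first four lines are taken from the warnings quoted in the thread;
# the last two rules are hypothetical and not from the real file.
sample = "\n".join([
    "====",
    " The contents of this file are subject to the license and copyright",
    " http://www.dspace.org/license/",
    "====",
    "User-agent: *",
    "Disallow: /private",
])
records, unknown = parse_robots(sample)
for line in unknown:
    print("WARN Unknown robots.txt line: %r" % line)
```

With this approach the four header lines end up in `unknown` and the two rules are still parsed, so a malformed header by itself would not stop a crawl.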