Document deletion works perfectly after I reinstalled the SSL
certificate and reentered the username and password to our Solr server.
So I think this issue has been solved.
Erlend
On 27.04.12 12.11, Erlend Garåsen wrote:
Many thanks for your suggestions and help, Karl. Using a filesystem
crawl was actually a good idea for debugging/testing. Installing a new
version of Solr on our test server is not that easy, mainly because the
server is managed by another division at the uni, even though I can get
root access. Anyway, according to the logs on our Solr 3.2 server, it
seems that MCF successfully managed to delete one test document I
removed:
[2012-04-27 11:18:33.092] {delete=[file:/tmp/mcf/docs/app_lasso.pdf]} 0 7
[2012-04-27 11:18:33.092] [] webapp=/solr path=/update params={} status=0 QTime=7
The result code is 200 according to Simple History in MCF.
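For anyone who wants to check a larger Solr log for the same kind of entry, here is a minimal Python sketch. It assumes the log layout of the single delete line quoted above, and it assumes that the two trailing numbers are the status and QTime (they match the status=0 QTime=7 on the following line). The function name and the regular expression are illustrative only, not part of Solr or MCF.

```python
import re

# Pattern modelled on the one delete line quoted above (Solr 3.2).
# Assumption: other Solr versions may log deletions differently.
DELETE_LINE = re.compile(
    r"\{delete=\[(?P<ids>[^\]]*)\]\}\s+(?P<status>\d+)\s+(?P<qtime>\d+)"
)


def parse_delete_line(line):
    """Return (ids, status, qtime) for a delete log line, or None."""
    match = DELETE_LINE.search(line)
    if match is None:
        return None
    # Assumption: several ids are comma separated inside the brackets,
    # so an id that itself contains a comma would be split wrongly.
    ids = [part.strip() for part in match.group("ids").split(",") if part.strip()]
    return ids, int(match.group("status")), int(match.group("qtime"))


sample = "[2012-04-27 11:18:33.092] {delete=[file:/tmp/mcf/docs/app_lasso.pdf]} 0 7"
print(parse_delete_line(sample))
```

A line that does not contain a delete entry yields None, so the function can be applied to every line of a log file and the None results dropped.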
I entered the passwords once again for the Solr servers into the Solr
output configuration, deleted and uploaded our SSL certificate once
again before I did the filesystem test. I should have performed the
tests prior to the password updates.
The crawl will start again later today at 6 pm on our production server,
so I will try to figure out whether we still have problems later. I'm
going to Scotland later this evening for some days without my laptop, so
I cannot check the status of my crawl before I'm back, but I'll let my
colleague watch the logs.
Erlend
On 26.04.12 21.14, Karl Wright wrote:
Hi Erlend,
I had some time today and was able to verify that everything worked
fine against what I have currently on my laptop, which is Solr 3.2.
The second job run looks like this:
04-26-2012 15:11:44.154 job end 1335467343879(test) 0 1
04-26-2012 15:11:34.159 document deletion (solr) file:/C:/testcrawl/there.txt 200 0 117
04-26-2012 15:11:24.690 read document C:\testcrawl OK 0 1
04-26-2012 15:11:24.494 job start 1335467343879(test) 0 1
So it appears that either something changed in Solr, or SSL support is
broken, or your network is not permitting a valid HTTP response for
some reason.
Karl
On Thu, Apr 26, 2012 at 11:10 AM, Karl Wright <daddy...@gmail.com> wrote:
Hi Erlend,
Can you try the following:
(1) Make a fresh Solr checkout of 3.6 or whatever Solr version you are
using, and build it
(2) Start it
(3) Run a simple filesystem crawl using a Solr connection that is
created with the default values
(4) Delete a file in your filesystem that was crawled
(5) Crawl again
Does the deletion happen OK?
AFAIK, nothing has changed in the Solr connector that should affect
the ability to delete. This test will confirm that it is still
working.
Thanks,
Karl
On Thu, Apr 26, 2012 at 10:19 AM, Erlend Garåsen
<e.f.gara...@usit.uio.no> wrote:
It seems that MCF cannot delete documents from Solr. A timeout occurs,
and the job stops after a while.
This is what I can see from the log:
WARN 2012-04-20 18:24:30,373 (Worker thread '16') - Service interruption reported for job 1327930125433 connection 'Web crawler': Ingestion API socket timeout exception waiting for response code: Read timed out; ingestion will be retried again later
If I take a further look in Simple History, it seems that this error is
related to document deletion.
I have tried to delete the document manually by using curl from the
same server MCF is installed on, in case we have some access
restrictions, but curl succeeded.
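For reference, a manual deletion of this kind can also be sketched in Python. This is only an illustration under stated assumptions: the update URL and the document id below are hypothetical placeholders, the body is the standard Solr XML delete-by-id message, and authentication (realm, username, password) and SSL certificate handling are left out.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_request(update_url, doc_id):
    """Build (but do not send) a POST request that deletes one document by id."""
    # Solr XML update message for delete-by-id; the id is XML-escaped.
    body = "<delete><id>{}</id></delete>".format(escape(doc_id)).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical URL and document id, for illustration only.
req = build_delete_request(
    "https://solr.example.org/solr/update",
    "http://www.example.org/page.html",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it, with an explicit timeout in seconds:
#   urllib.request.urlopen(req, timeout=30)
# The deletion only becomes visible in searches after a commit.
```

Building the request separately from sending it makes it easy to inspect the exact body and headers before comparing them with what MCF sends.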
We do not have any problems with adding documents; the timeout only
occurs while deleting them.
I have checked our Solr configuration. MCF does use the correct path
for document deletion, i.e. /update.
The realm, username and password for our Solr server are entered
correctly, and the SSL certificate is valid as well.
Erlend
--
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968,
VIP: 31050