Hi Sebastian,

On 14 September 2016 at 15:20, Sebastian Nagel <[email protected]>
wrote:

> Should have the same effect as indexing with -deleteGone.
> If you are using Nutch 1.12 also have a look at this bug which
> could be the reason for your problem:
>   https://issues.apache.org/jira/browse/NUTCH-2269
> Do you see similar errors in the logs?
>
>
2016-09-13 04:38:04,391 INFO  solr.SolrIndexWriter - Indexing 177 documents
2016-09-13 04:38:41,017 INFO  solr.SolrMappingReader - source: appKey dest: appKey
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: access dest: access
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: content dest: content
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: endtime dest: endtime
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: keywords dest: keywords
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: site dest: site
2016-09-13 04:38:41,030 INFO  solr.SolrMappingReader - source: title dest: title
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: tstamp dest: changed
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: tstamp dest: created
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: siteHash dest: siteHash
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: uid dest: uid
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: type dest: type
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: site dest: nutchSite_stringS
2016-09-13 04:38:41,031 INFO  solr.SolrMappingReader - source: host dest: nutchHost_stringS
2016-09-13 04:41:22,120 INFO  indexer.IndexingJob - Indexer: finished at 2016-09-13 04:41:22, elapsed: 00:03:34
2016-09-13 04:41:30,489 INFO  indexer.CleaningJob - CleaningJob: starting at 2016-09-13 04:41:30
2016-09-13 04:41:32,047 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-09-13 04:41:35,680 INFO  indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: appKey dest: appKey
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: access dest: access
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: content dest: content
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: endtime dest: endtime
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: keywords dest: keywords
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: site dest: site
2016-09-13 04:41:35,759 INFO  solr.SolrMappingReader - source: title dest: title
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: tstamp dest: changed
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: tstamp dest: created
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: siteHash dest: siteHash
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: uid dest: uid
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: type dest: type
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: site dest: nutchSite_stringS
2016-09-13 04:41:35,760 INFO  solr.SolrMappingReader - source: host dest: nutchHost_stringS
2016-09-13 04:41:36,541 INFO  indexer.CleaningJob - CleaningJob: deleted a total of 2 documents
2016-09-13 04:41:36,545 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2016-09-13 04:41:37,313 INFO  indexer.CleaningJob - CleaningJob: finished at 2016-09-13 04:41:37, elapsed: 00:00:06
2016-09-13 04:41:38,857 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

It claims to have deleted 2 documents, but there are plenty of 404 pages
still in the index.
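
To compare these numbers across runs, something like the following could pull them out of the log. It is only a rough sketch: it is not part of Nutch, the function name summarize_log is made up for illustration, and it assumes the two message formats shown in the log above ("Indexing N documents" and "CleaningJob: deleted a total of N documents").

```python
import re

# Patterns match the two message shapes visible in the log excerpt above.
INDEXED = re.compile(r"SolrIndexWriter - Indexing (\d+) documents")
DELETED = re.compile(r"CleaningJob: deleted a\s+total of (\d+) documents")


def summarize_log(text):
    """Return (documents sent for indexing, documents deleted by CleaningJob).

    Hypothetical helper: counts are summed over every matching log line.
    """
    # Mail clients may hard-wrap long log lines, so collapse whitespace first.
    flat = " ".join(text.split())
    indexed = sum(int(n) for n in INDEXED.findall(flat))
    deleted = sum(int(n) for n in DELETED.findall(flat))
    return indexed, deleted


sample = """\
2016-09-13 04:38:04,391 INFO  solr.SolrIndexWriter - Indexing 177 documents
2016-09-13 04:41:36,541 INFO  indexer.CleaningJob - CleaningJob: deleted a
total of 2 documents
"""
print(summarize_log(sample))
```

This only reports what the jobs logged; it does not say which documents were deleted or why the 404 pages were left behind.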

I think it's quite an old version of Nutch. There is a
lib/apache-nutch-1.8.jar file :-)

-- 


Kind regards,


Jigal van Hemert | Developer



Langesteijn 124
3342LG Hendrik-Ido-Ambacht

T. +31 (0)78 635 1200
F. +31 (0)848 34 9697
KvK. 23 09 28 65

[email protected]
www.alternet.nl


