Hello Hany,

Sure, check these commands:
solrclean    remove HTTP 301 and 404 documents from solr - DEPRECATED use the clean command instead
clean        remove HTTP 301 and 404 documents and duplicates from indexing backends configured via plugins

Regards,
Markus

On Tue, 9 Mar 2021 at 08:49, Hany NASR <[email protected]> wrote:

> Hello Markus,
>
> I added the property in nutch-site.xml with no luck.
>
> The documents still exist in Solr; any advice?
>
> Regards,
> Hany
>
> From: Markus Jelsma <[email protected]>
> Sent: Monday, March 8, 2021 3:40 PM
> To: [email protected]
> Subject: EXTERNAL: Re: 301 perm redirect pages are still in Solr
>
> Hello Hany,
>
> You need to tell the indexer to delete those records. This will help:
>
> <!-- delete gone and redirects -->
> <property>
>   <name>indexer.delete</name>
>   <value>true</value>
> </property>
>
> Regards,
> Markus
>
> On Mon, 8 Mar 2021 at 15:31, Hany NASR <[email protected]> wrote:
>
> > Hi All,
> >
> > I'm using Nutch 1.15 and found that permanent redirect pages (301)
> > are still indexed and not removed from Solr.
> >
> > When I exported the crawlDB, I found the page with Status: 5 (db_redir_perm).
> >
> > How can I keep the Solr index up to date and make Nutch clean these pages
> > automatically?
> >
> > Regards,
> > Hany
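
For illustration, here is a minimal sketch of how the clean command listed at the top of this message might be invoked against a crawldb. The launcher location (bin/nutch), the crawldb path (crawl/crawldb), and the run helper with its DRY_RUN switch are assumptions made for this example, not details taken from the thread. Which backend actually gets cleaned depends on your own indexer plugin configuration.

```shell
#!/bin/sh
# Sketch only: every path and helper below is an assumption for illustration.
NUTCH="${NUTCH:-bin/nutch}"            # assumed location of the Nutch launcher
CRAWLDB="${CRAWLDB:-crawl/crawldb}"    # hypothetical crawldb path
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print the command, do not run it

# Hypothetical helper: echo the command line, and execute it
# only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  [ "$DRY_RUN" = "1" ] || "$@"
}

# Ask Nutch to remove gone (404) and redirected (301) documents, plus
# duplicates, from the configured indexing backends.
run "$NUTCH" clean "$CRAWLDB"
```

With the defaults above the script only prints the command it would run. Setting DRY_RUN=0 from the Nutch runtime directory would execute it for real, so it is worth checking the printed line first.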

