Great idea! Do you mean I could index with protocol-file? I am confused about whether Nutch is a crawler or an indexer: Solr is an indexer, but Nutch seems to index as well.
On Sun, Jan 12, 2025 at 19:13, Sebastian Nagel <[email protected]> wrote:

> Hi,
>
> assuming you have your Git Web server running [1] - it just means crawling all
> URLs on this server. Cloning repositories cannot be done by Nutch because it's
> not done by sending an HTTP GET request.
>
> Alternatively, you might "crawl" the local filesystem containing the cloned
> repositories. Nutch has a protocol implementation "protocol-file" for this
> task.
>
> Best,
> Sebastian
>
> [1] https://git-scm.com/book/en/v2/Git-on-the-Server-GitWeb
>
> On 1/10/25 07:07, anon anon wrote:
> > Hello,
> >
> > I want to clone and index a repo with a Nutch config.
> >
> > Do you know how to set up such a config, please?
> >
> > Best regards!

