Hi,
assuming you have a GitWeb server running [1], this simply means crawling all
URLs on that server. Nutch cannot clone repositories itself, because cloning is
not done by sending an HTTP GET request.
Alternatively, you might "crawl" the local filesystem containing the cloned
repositories. Nutch has a protocol implementation "protocol-file" for this task.
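
As a rough sketch of the filesystem variant (not something Nutch provides): a
small Python helper could clone the repositories with the git command line and
write a seed list of file: URLs for Nutch to start from. The remote URL, the
/data/repos directory, the urls/seed.txt path and all function names below are
made-up placeholders. As far as I know you would also need to list
protocol-file in the plugin.includes property and let file: URLs pass your URL
filters.

```python
import subprocess
from pathlib import Path


def clone_repo(remote_url: str, target_dir: Path) -> Path:
    """Clone remote_url below target_dir using the git CLI; return the checkout path.

    Assumes a slash-separated remote URL such as https://host/path/project.git.
    """
    name = remote_url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[:-4]
    checkout = target_dir / name
    if not checkout.exists():
        target_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(["git", "clone", remote_url, str(checkout)], check=True)
    return checkout


def seed_url_for(directory: Path) -> str:
    """Turn a local directory into a file: URL with a trailing slash."""
    return directory.absolute().as_uri() + "/"


def write_seed_file(directories, seed_file: Path) -> None:
    """Write one file: URL per line, the format of a plain-text seed list."""
    seed_file.parent.mkdir(parents=True, exist_ok=True)
    seed_file.write_text("".join(seed_url_for(d) + "\n" for d in directories))


if __name__ == "__main__":
    # Hypothetical values; replace them with your own remote and paths.
    repos = Path("/data/repos")
    checkout = clone_repo("https://example.org/git/project.git", repos)
    write_seed_file([checkout], Path("urls/seed.txt"))
```

The cloning step stays outside Nutch on purpose: Nutch only sees the resulting
directory tree through the seed list.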
Best,
Sebastian
[1] https://git-scm.com/book/en/v2/Git-on-the-Server-GitWeb
On 1/10/25 07:07, anon anon wrote:
Hello,
I want to clone and index a repo with a Nutch config.
Do you know how to set up such a config, please?
Best regards!