Hi, I am using the Scheduler to crawl HTML files.
The job runs every minute and needs to crawl /content/foo.html. If I use Apache Commons HttpClient to GET /content/foo.html, I need to set up authentication (Basic Auth?). However, since all the HTML pages I want to crawl are served by Sling, is there an API that "resolves" (or "renders") paths like /content/foo.html, /content/bar.json, etc.? If I do have to make an actual HTTP request, where should I get the authentication information? The Scheduler has no access to a JCR Session. Should I explicitly get a ResourceResolver (using ResourceResolverFactory.getAdministrativeResourceResolver()) every time the job fires?
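
For reference, here is a rough sketch of the HTTP fallback I have in mind. To keep it self-contained it uses the JDK's java.net.HttpURLConnection instead of Commons HttpClient, and the class name, method names, and credentials are placeholders I made up for illustration. It only shows how the Basic Auth header would be attached to the GET request; it is not meant as the recommended way to do this in Sling.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; not part of Sling or Commons HttpClient.
public class BasicAuthFetch {

    // Builds the Authorization header value for HTTP Basic auth:
    // "Basic " followed by base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Prepares a GET request with the Basic auth header set.
    // Nothing is sent until the caller reads the response
    // (for example via getResponseCode() or getInputStream()).
    static HttpURLConnection openGet(String url, String user, String password)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
        return conn;
    }
}
```

The scheduled job would then call something like openGet("http://localhost:8080/content/foo.html", user, password), where the host, port, and credentials are assumptions about my setup. Where those credentials should come from is exactly the part I am unsure about.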
