Be able to modify URL rules while crawler is running
----------------------------------------------------

                 Key: DROIDS-77
                 URL: https://issues.apache.org/jira/browse/DROIDS-77
             Project: Droids
          Issue Type: New Feature
          Components: core
    Affects Versions: 0.01
            Reporter: Richard Frovarp
            Priority: Minor


It would be nice to be able to modify the URL rules while a crawler is running. 
This would allow me to dynamically exclude areas from being crawled based on 
results being returned. Basically I want to look for certain markers inside a 
page, then not crawl those pages without having to update a robots file. 
Different paths of our site are going to enter the index by a different method 
than the main crawl, so I can skip them once I find them. 

Having a modifiable filter would allow people to load their rules from places 
other than a file without having to write their own implementation or 
extension. I'll try to work up a patch sometime this week.
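
As a rough illustration of the idea only (this is not the Droids API; the class 
MutableUrlFilter and its methods exclude, removeExclude and accept are 
hypothetical names), a filter whose rules can change while crawl threads are 
reading them might look something like this:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.regex.Pattern;

// Hypothetical class, not part of Droids: a URL filter whose exclusion
// rules can be changed while crawl threads are calling accept().
class MutableUrlFilter {
    // CopyOnWriteArrayList lets readers iterate without locking while
    // another thread adds or removes rules.
    private final List<Pattern> excludes = new CopyOnWriteArrayList<>();

    // Adds an exclusion rule; it applies to all later accept() calls.
    void exclude(String regex) {
        excludes.add(Pattern.compile(regex));
    }

    // Removes every rule whose source regex equals the given string.
    // Returns true if at least one rule was removed.
    boolean removeExclude(String regex) {
        return excludes.removeIf(p -> p.pattern().equals(regex));
    }

    // Returns true if the URL matches none of the exclusion rules.
    boolean accept(String url) {
        for (Pattern p : excludes) {
            if (p.matcher(url).find()) {
                return false;
            }
        }
        return true;
    }
}
```

With something like this, the code that spots a marker in a fetched page could 
call exclude() with a pattern for that area of the site, and the rules could be 
filled from any source (a database, a config service) rather than only a file.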

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
