Release targets:
* BUGFIX
  * DROIDS-74: LinkExtractor doesn't set the anchorText on the link.
* FEATURES
  * DROIDS-52: Optimize memory usage of TaskQueue and History.
* TASK
  * DROIDS-13: Review stream handling in samples.
  * DROIDS-51: Review MultiThreadedTaskMaster thread creation/looping. (Attached patch)
  * DROIDS-52: Optimize memory usage of TaskQueue and History. (Attached patch)
  * DROIDS-86: Empty files
  * DROIDS-87: Add missing license headers
* CHANGES
  * DROIDS-11: The core parser OutgoingLinks should change. Implement a more generic TaskExtractor interface that should return new tasks. (http://markmail.org/message/vijohov4narssuv6)
  * DROIDS-35: Provide ability for large content entities to overflow to a temp file
  * DROIDS-56: Change the TaskQueue interface to java.util.Queue

Discussion:
* DROIDS-45: Fail to resolve outlink correctly. Mingfai Ma patches. Changes some base ideas.
* DROIDS-48: Support prioritizing in the TaskQueue. Mingfai Ma patches. Changes some base ideas.
* DROIDS-53: Implement a unique hash function for Task ID.
* DROIDS-54: Make LinkTask support arbitrary data by extending HashMap, and consider refactoring Task, Link, and LinkTask. Mingfai Ma patches. Changes some base ideas.
* DROIDS-82: Use camel.apache.org for external communications

Future releases:
Features:
* DROIDS-27: Add more functions to get basic status monitoring.
* DROIDS-77: Be able to modify URL rules while crawler is running
