Hi,

The uploaded file

    AI-Categorizer-0.04.tar.gz

has entered CPAN as

  file: $CPAN/authors/id/K/KW/KWILLIAMS/AI-Categorizer-0.04.tar.gz
  size: 243180 bytes
   md5: e4b60cb505544ee2a8d8d02a36b06a0e

Changes since 0.03:

 - Added learners for SVMs, Decision Trees, and a pass-through to Weka.

 - Added a virtual class for binary classifiers.

 - Wrote documentation for lots of the undocumented classes.

 - Added a PNG file giving an overview diagram of the classes.

 - Added a script 'categorizer' to provide a simple command-line
   interface to AI::Categorizer

 - save_state() and restore_state() now save to a directory, not a
   file.

 - Removed F1(), precision(), recall(), etc. from Util package since
   they're in Statistics::Contingency.  Added random_elements() to
   Util.

 - Collection::Files now warns when no category information is known
   about a document in the collection (knowing it's in zero categories
   is okay).

 - Added the Collection::InMemory class

 - Much more thorough testing with 'make test'.

 - Added add_hypothesis() method to Experiment.

 - Added dot() and value() methods to FeatureVector.

 - Added 'feature_selection' parameter to KnowledgeSet.

 - Added document($name) accessor method to KnowledgeSet.

 - In KnowledgeSet, load(), read(), and scan_*() can now accept a
   Collection object.

 - Added document_frequency(), finish(), and weigh_features() methods
   to KnowledgeSet.

 - Added save_features() and restore_features() to KnowledgeSet.

 - Added default categories() and categorize() methods to Learner base
   class.  get_scores() is now abstract.

 - Extended interface of ObjectSet class with retrieve(), includes(),
   and includes_name().

 - Moved 'term_weighting' parameter from Document to KnowledgeSet,
   since the normalized version needs to know the maximum
   term-frequency.  Also changed its values to 'n', 'l', 'b', and 't',
   with 'x' a synonym for 't'.

 - Implemented full range of TF/IDF term weighting methods (see Salton
   & Buckley, "Term Weighting Approaches in Automatic Text Retrieval",
   in journal "Information Processing & Management", 1988 #5)

 -Ken