Author: siren
Date: Tue Jul 11 08:50:53 2006
New Revision: 420902

URL: http://svn.apache.org/viewvc?rev=420902&view=rev
Log:
Added some of the missing changes

Modified:
    lucene/nutch/trunk/CHANGES.txt

Modified: lucene/nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/CHANGES.txt?rev=420902&r1=420901&r2=420902&view=diff
==============================================================================
--- lucene/nutch/trunk/CHANGES.txt (original)
+++ lucene/nutch/trunk/CHANGES.txt Tue Jul 11 08:50:53 2006
@@ -1,14 +1,200 @@
 Nutch Change Log
 
-Release 0.8
+Trunk (unreleased changes)
+
+ 0. Totally new architecture, based on hadoop
+    [http://lucene.apache.org/hadoop] (cutting)
 
  1. NUTCH-107 - Typo in plugin/urlfilter-*/plugin.xml. (Stephen Cross).
 
  2. NUTCH-108 - Log hosts that exceed generate.max.per.host.
-   (Rod Taylor via cutting)
+    (Rod Taylor via cutting)
+
+ 3. NUTCH-88 - Enhance ParserFactory plugin selection policy
+    (jerome)
+
+ 4. NUTCH-124 - Protocol-httpclient does not follow redirects when 
+    fetching robots.txt (cutting)
+
+ 5. NUTCH-130 - Be explicit about target JVM when building (1.4.x?)
+    ([EMAIL PROTECTED], cutting)
+
+ 6. NUTCH-114 - Getting number of urls and links from crawldb
+    (Stefan Groschupf via ab)
+
+ 7. NUTCH-112 - Link in cached.jsp page to cached content is an 
+    absolute link (Chris A. Mattmann via jerome)
+
+ 8. NUTCH-135 - Http header meta data are case insensitive in the
+    real world (Stefan Groschupf via jerome)
+
+ 9. NUTCH-145 - Build of war file fails on Chinese (zh) .xml files due
+    to UTF-8 BOM (KuroSaka TeruHiko via siren)
+
+10. NUTCH-121 - SegmentReader for mapred (Rod Taylor via ab)
+
+11. Added support for OpenSearch (cutting)
+
+12. NUTCH-142 - NutchConf should use the thread context classloader
+    (Mike Cannon-Brookes via pkosiorowski)
+
+13. NUTCH-160 - Use standard Java Regex library rather than
+    org.apache.oro.text.regex (Rod Taylor via cutting)
+
+14. NUTCH-151 - CommandRunner can hang after the main thread exec is
+    finished and has inefficient busy loop (Paul Baclace via cutting)
+
+15. NUTCH-174 - Problem encountered with ant during compilation
+
+16. NUTCH-190 - ParseUtil drops reason for failed parse
+    ([EMAIL PROTECTED] via ab)
+
+17. NUTCH-169 - Remove static NutchConf (Marko Bauhardt via ab)
+
+18. NUTCH-194 - NUTCH-169 introduced two tiny bugs (Marko Bauhardt via ab)
+
+19. NUTCH-178 - search.jsp must set session creation to "false"
+    (YourSoft via siren)
+
+20. NUTCH-200 - OpenSearch Servlet is broken
+    (Marko Bauhardt via siren)
+
+21. NUTCH-81 - Webapp only works when deployed in root
+    (AJ Banck, Michael Nebel via siren)
+
+22. NUTCH-139 - Standard metadata property names in the ParseData
+    metadata (Chris A. Mattmann, jerome)
+
+23. NUTCH-192 - Meta data support for CrawlDatum
+    (Stefan Groschupf via ab)
+    
+24. NUTCH-52 - Parser plugin for MS Excel files
+    (Rohit Kulkarni via jerome)
+
+25. NUTCH-53 - Parser plugin for Zip files
+    (Rohit Kulkarni via jerome)
+
+26. NUTCH-137 - footer is not displayed in search result page
+    (KuroSaka TeruHiko via siren)
+
+27. NUTCH-118 - FAQ link points to invalid URL
+    (Steve Betts via siren)
+
+28. NUTCH-184 - Serbian (sr, Cyrillic) and Serbo-Croatian (sh, Latin)
+    translation (Ivan Sekulovic via siren)
+
+29. NUTCH-211 - FetchedSegments leave readers open (Stefan Groschupf
+    via cutting)
+
+30. NUTCH-140 - Add alias capability in parse-plugins.xml file that
+    allows mimeType->extensionId mapping (Chris A. Mattmann via jerome)
+
+31. NUTCH-214 - Added links to web site to search mailing list
+    (Jake Vanderdray via jerome)
+
+32. NUTCH-204 - Multiple field values in HitDetails
+    (Stefan Groschupf via jerome)
 
- 3. Switch from using java.io.File to org.apache.hadoop.fs.Path.
+33. NUTCH-219 - file.content.limit & ftp.content.limit should be changed
+    to -1 to be consistent with http (jerome)
+    
+34. NUTCH-221 - Prepare nutch for upcoming lucene 2.0 (siren)
+
+35. NUTCH-91 - Empty encoding causes exception (Michael Nebel via
+    pkosiorowski)
+
+36. NUTCH-228 - Clustering plugin descriptor broken (Dawid Weiss via
+    jerome)
+
+37. NUTCH-229 - Improved handling of plugin folder configuration
+    (Stefan Groschupf via ab)
+
+38. NUTCH-206 - Search server throws InstantiationException (ab)
+    
+39. NUTCH-203 - ParseSegment throws InstantiationException (Marko Bauhardt
+    via ab)
+
+40. NUTCH-3 - Multi values of header discarded (Stefan Groschupf via ab)
+
+41. Update to lucene 1.9.1 (cutting)
+
+42. NUTCH-235 - Duplicate Inlink values (ab)
+
+43. NUTCH-234 - Clustering extension code cleanups and a real
+    JUnit test case for the current implementation (Dawid Weiss via ab)
+    
+44. NUTCH-210 - Context.xml file for Nutch web application
+    (Chris A. Mattmann via jerome)
+
+45. NUTCH-231 - Invalid CSS entries (AJ Banck via jerome)
+
+46. NUTCH-232 - Search.jsp has multiple search forms creating
+    invalid html / incorrect focus function (jerome)
+    
+47. NUTCH-196 - lib-xml and lib-log4j plugins (ab, jerome)
+
+48. NUTCH-244 - Inconsistent handling of property values
+    boundaries / unable to set db.max.outlinks.per.page to
+    infinite (jerome)
+    
+49. NUTCH-245 - DTD for plugin.xml configuration files
+    (Chris A. Mattmann via jerome)
+
+50. NUTCH-250 - Generate to log truncation caused by
+    generate.max.per.host (Rod Taylor via cutting)
+    
+51. NUTCH-125 - OpenOffice Parser plugin (ab)
+
+52. Switch from using java.io.File to org.apache.hadoop.fs.Path.
     (cutting)
+
+53. NUTCH-240 - Scoring API: extension point, scoring filters and
+    an OPIC plugin (ab)
+    
+54. NUTCH-134 - Summarizer doesn't select the best snippets (jerome)
+
+55. NUTCH-268 - Generator and lib-http use different definitions of
+    "unique host" (ab)
+    
+56. NUTCH-280 - Url query causes NullPointerException (Grant Glouser
+    via siren)
+
+57. NUTCH-285 - LinkDb fails: rename doesn't create parent directories
+    (Dennis Kubes via ab)
+
+58. NUTCH-201 - Add support for subcollections
+    (siren)
+
+59. NUTCH-298 - If a 404 for a robots.txt is returned a NPE is thrown
+    (Stefan Groschupf via jerome)
+
+60. NUTCH-275 - Fetcher not parsing XHTML-pages at all (jerome)
+
+61. NUTCH-301 - CommonGrams loads analysis.common.terms.file for each query
+    (Stefan Groschupf via jerome)
+
+62. NUTCH-110 - OpenSearchServlet outputs illegal xml characters
+    ([EMAIL PROTECTED] via siren)
+
+63. NUTCH-292 - OpenSearchServlet: OutOfMemoryError: Java heap space
+    (Stefan Neufeind via siren)
+
+64. NUTCH-307 - Wrongly configured log4j.properties (jerome)
+
+65. NUTCH-303 - Logging improvements (jerome)
+
+66. NUTCH-308 - Maximum search time limit (ab)
+
+67. NUTCH-306 - DistributedSearch.Client liveAddresses concurrency problem
+    (Grant Glouser via siren)
+
+68. Update to hadoop-0.4 (Milind Bhandarkar, cutting)
+
+69. NUTCH-317 - Clarify what the queryLanguage argument of Query.parse(...)
+    means (jerome)
+
+70. Added alternative experimental web gui in contrib containing extensions like
+    subcollection, keymatch, user preferences, caching, implemented mainly using tiles and jstl (siren)
 
 
 Release 0.7 - 2005-08-17

_______________________________________________
Nutch-cvs mailing list
Nutch-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-cvs