Author: ab
Date: Mon Sep 19 07:11:07 2005
New Revision: 290163
URL: http://svn.apache.org/viewcvs?rev=290163&view=rev
Log:
Update of the clustering plugin, contributed by Dawid Weiss.
Carrot2 components updated to the newest stable versions. Improvements in
tokenizers (speedups) and stop-word handling. The internal API changed
slightly, so anyone who uses this code as glue for other Carrot2 components
will need to update. Support added for Danish, Finnish, Norwegian (Bokmål)
and Swedish.
Added:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-collections-3.1-patched.jar
(with props)
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/log4j-1.2.11.jar
(with props)
Removed:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-collections-3.0.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/log4j-1.2.8.jar
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-filter-lingo.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-local-core.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-snowball-stemmers.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-common.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-tokenizer.jar
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.CONTRIBUTORS
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.LICENSE
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-pool.LICENSE
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/plugin.xml
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-filter-lingo.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-filter-lingo.jar?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
Binary files - no diff available.
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-local-core.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-local-core.jar?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
Binary files - no diff available.
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-snowball-stemmers.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-snowball-stemmers.jar?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
Binary files - no diff available.
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-common.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-common.jar?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
Binary files - no diff available.
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-tokenizer.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2-util-tokenizer.jar?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
Binary files - no diff available.
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.CONTRIBUTORS
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.CONTRIBUTORS?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
---
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.CONTRIBUTORS
(original)
+++
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.CONTRIBUTORS
Mon Sep 19 07:11:07 2005
@@ -5,9 +5,10 @@
#
# First name, surname name; Duties; Active from; Institution
-Dawid Weiss; Project administrator, various components, core; 2002; Poznan University of Technology, Poland
-Stanisław, Osiński; Lingo clustering component, ODP Input; 2003; Poznan University of Technology, Poland
+Dawid Weiss; Project administrator, various components, core; 2002; Poland
+Stanisław, Osiński; Lingo clustering component, ODP Input; 2003; Poland
+
Michał, Wróblewski [*]; AHC clustering components; 2003; Poznan University of Technology, Poland
Paweł, Kowalik [*]; Inductive search engine wrapper; 2003; Poznan University of Technology, Poland
-Steven, Schockaert; Fuzzy Ants clustering component; 2004; University of Gent, Belgium
-Lang [,] Ngo Chi; Fuzzy Rough set clustering component; 2004; Warsaw University, Poland
+Steven, Schockaert [*]; Fuzzy Ants clustering component; 2004; University of Gent, Belgium
+Lang, Ngo Chi [*]; Fuzzy Rough set clustering component; 2004; Warsaw University, Poland
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.LICENSE
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.LICENSE?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
---
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.LICENSE
(original)
+++
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/carrot2.LICENSE
Mon Sep 19 07:11:07 2005
@@ -1,6 +1,6 @@
Carrot2 Project
-Copyright (C) 2002-2004, Dawid Weiss
+Copyright (C) Dawid Weiss, Stanislaw Osinski
Portions (C) Contributors listed in carrot2.CONTRIBUTORS file.
Redistribution and use in source and binary forms, with or without
modification,
Added:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-collections-3.1-patched.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-collections-3.1-patched.jar?rev=290163&view=auto
==============================================================================
Binary file - no diff available.
Propchange:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-collections-3.1-patched.jar
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-pool.LICENSE
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-pool.LICENSE?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
---
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-pool.LICENSE
(original)
+++
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/commons-pool.LICENSE
Mon Sep 19 07:11:07 2005
@@ -1,6 +1,6 @@
/*
- * $Revision: 1.1 $
- * $Date: 2004/08/09 23:23:53 $
+ * $Revision: 1.2 $
+ * $Date: 2004/06/19 16:26:16 $
*
* ====================================================================
*
Added:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/log4j-1.2.11.jar
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/log4j-1.2.11.jar?rev=290163&view=auto
==============================================================================
Binary file - no diff available.
Propchange:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/lib/log4j-1.2.11.jar
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Modified:
lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/plugin.xml
URL:
http://svn.apache.org/viewcvs/lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/plugin.xml?rev=290163&r1=290162&r2=290163&view=diff
==============================================================================
--- lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/plugin.xml
(original)
+++ lucene/nutch/branches/Release-0.7/src/plugin/clustering-carrot2/plugin.xml
Mon Sep 19 07:11:07 2005
@@ -18,11 +18,11 @@
<library name="carrot2-util-tokenizer.jar"/>
<library name="colt-1.0.3.jar"/>
- <library name="commons-collections-3.0.jar"/>
+ <library name="commons-collections-3.1-patched.jar"/>
<library name="commons-pool-1.1.jar"/>
<library name="FSA.jar"/>
<library name="Jama-1.0.1-patched.jar"/>
- <library name="log4j-1.2.8.jar"/>
+ <library name="log4j-1.2.11.jar"/>
<library name="nekohtml-0.9.2.jar"/>
</runtime>
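
Because this commit renames two jars and updates the matching <library> entries in plugin.xml, a quick consistency check between the descriptor and the lib directory may be useful. The following is a minimal sketch, not part of the commit: the helper names (declared_libraries, missing_libraries) are hypothetical, and the XML fragment contains only the entries visible in the hunk above, wrapped in a <runtime> element so that it parses on its own. The real plugin.xml has more content that the diff does not show.

```python
import xml.etree.ElementTree as ET

# Post-change <library> entries visible in the hunk above. The opening
# <runtime> tag is added here so the fragment is well-formed by itself;
# entries outside the hunk are not known and are not included.
RUNTIME_FRAGMENT = """
<runtime>
  <library name="carrot2-util-tokenizer.jar"/>
  <library name="colt-1.0.3.jar"/>
  <library name="commons-collections-3.1-patched.jar"/>
  <library name="commons-pool-1.1.jar"/>
  <library name="FSA.jar"/>
  <library name="Jama-1.0.1-patched.jar"/>
  <library name="log4j-1.2.11.jar"/>
  <library name="nekohtml-0.9.2.jar"/>
</runtime>
"""


def declared_libraries(xml_text):
    """Return the name attribute of every <library> element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [lib.get("name") for lib in root.iter("library")]


def missing_libraries(xml_text, present_files):
    """Return declared libraries that are absent from present_files."""
    present = set(present_files)
    return [name for name in declared_libraries(xml_text) if name not in present]


if __name__ == "__main__":
    # Hypothetical lib directory that still holds the pre-commit jars.
    stale_lib_dir = [
        "carrot2-util-tokenizer.jar",
        "colt-1.0.3.jar",
        "commons-collections-3.0.jar",
        "commons-pool-1.1.jar",
        "FSA.jar",
        "Jama-1.0.1-patched.jar",
        "log4j-1.2.8.jar",
        "nekohtml-0.9.2.jar",
    ]
    for name in missing_libraries(RUNTIME_FRAGMENT, stale_lib_dir):
        print("declared in plugin.xml but not found in lib/:", name)
```

In practice present_files could come from os.listdir on the plugin's lib directory; against a checkout that still has the old jars, the check would report the two renamed libraries.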