Hi,
Could someone please explain how to create an n-gram (NGRAM) index?
While searching the net, I found the steps below:
1. compile nutch (in top level dir do "ant")
2. crawl your data (see tutorial)
3. edit your conf/nutch-site.xml so it contains the plugins
"web-query-propose-spellcheck" and "webui-extensionpoints"
4. edit conf/nutch-site.xml so it contains the proper dir for plugins, as the
plugins are not packaged inside the .war (something like
  <property>
    <name>plugin.folders</name>
    <value> <path to plugins dir> </value>
  </property>
)
5. compile web2 plugins (in contrib/web2 do ant compile-plugins)
6. edit search.jsp so it contains the line
"<tiles:insert definition="propose" ignore="true"/>" just before the second
c:choose.
7. create web2 app (in contrib/web2 do ant war)
8. build your spell check index (bin/nutch plugin
web-query-propose-spellcheck org.apache.nutch.spell.NGramSpeller -i
<indexdir> -f content -o spelling)
9. deploy webapp to tomcat
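
As background for step 8, here is my rough understanding of the general idea
behind an n-gram spelling index, written as a small Python sketch. This is
only an illustration and is not how NGramSpeller itself is implemented; the
function names (char_ngrams, build_ngram_index, suggest), the "^" and "$"
padding characters and the sample vocabulary are all made up for the example.
Each dictionary word is split into overlapping character n-grams, the n-grams
are indexed, and a misspelled query is matched against words that share the
most n-grams with it.

```python
from collections import defaultdict


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded so edges are captured."""
    # "^" and "$" are hypothetical start/end markers chosen for this sketch.
    padded = "^" + word.lower() + "$"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_ngram_index(words, n=3):
    """Map each n-gram to the set of dictionary words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in char_ngrams(word, n):
            index[gram].add(word)
    return index


def suggest(index, query, n=3, limit=3):
    """Rank dictionary words by how many distinct n-grams they share with the query."""
    scores = defaultdict(int)
    for gram in set(char_ngrams(query, n)):
        for word in index.get(gram, ()):
            scores[word] += 1
    # Highest overlap first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:limit]]


if __name__ == "__main__":
    # Hypothetical vocabulary standing in for terms taken from a crawl index.
    vocabulary = ["crawl", "crawler", "search", "index", "plugin"]
    idx = build_ngram_index(vocabulary)
    print(suggest(idx, "serch"))
```

A real spell checker would probably also weight the candidates (for example
by edit distance or term frequency), which this sketch leaves out.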
I completed the first 4 steps, but I got an error at step 5.
The error is as follows:
contrib/web2# ant compile-plugins
Buildfile: build.xml
init:
compile-plugins:
deploy:
init:
init-plugin:
compile:
jar:
[jar] Warning: skipping jar archive
/opt/RootCrawl/nutch-0.9/trunk/build/webui-extensionpoints/webui-extensionpoints.jar
because no files were included.
deps-test:
deploy:
init:
init-plugin:
[echo] Copying UI configuration
[echo] Copying UI templates
deps-jar:
prepare-web:
[delete] Deleting directory
/opt/RootCrawl/nutch-0.9/trunk/build/web-caching-oscache/tmp/_web
[copy] Copying 5 files to
/opt/RootCrawl/nutch-0.9/trunk/build/web-caching-oscache/tmp/_web
compile-jsp:
compile:
[echo] Compiling plugin: web-caching-oscache
[javac] Compiling 4 source files to
/opt/RootCrawl/nutch-0.9/trunk/build/web-caching-oscache/classes
[javac]
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/plugins/web-caching-oscache/src/java/org/apache/nutch/webapp/CacheManager.java:32:
package org.apache.nutch.webapp.common does not exist
[javac] import org.apache.nutch.webapp.common.Search;
[javac] ^
[javac]
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/plugins/web-caching-oscache/src/java/org/apache/nutch/webapp/CacheManager.java:33:
package org.apache.nutch.webapp.common does not exist
[javac] import org.apache.nutch.webapp.common.ServiceLocator;
[javac] ^
[javac]
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/plugins/web-caching-oscache/src/java/org/apache/nutch/webapp/CacheManager.java:127:
cannot find symbol
[javac] symbol : class ServiceLocator
[javac] location: class org.apache.nutch.webapp.CacheManager
[javac] public Search getSearch(String id, ServiceLocator locator)
throws NeedsRefreshException {
So I skipped step 5, but then step 7 also stopped with an error.
The error is as follows:
contrib/web2# ant war
Buildfile: build.xml
generate-context:
init:
generate-pages:
generate-locale:
[echo] Generating docs for locale=ca
Warning: Reference docDTDs has not been set at runtime, but was found during
build file parsing, attempting to resolve. Future versions of Ant may
support
referencing ids defined in non-executed targets.
[xslt] Transforming into
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/target/nutch-web2/ca
generate-locale:
[echo] Generating docs for locale=de
Warning: Reference docDTDs has not been set at runtime, but was found during
build file parsing, attempting to resolve. Future versions of Ant may
support
referencing ids defined in non-executed targets.
[xslt] Processing
/opt/RootCrawl/nutch-0.9/trunk/src/web/include/de/header.xml to
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/target/docs/de/include/header.html
[xslt] Loading stylesheet
/opt/RootCrawl/nutch-0.9/trunk/contrib/web2/res/nutch-header.xsl
[xslt] : Error! null
[xslt] : Error! java.lang.IllegalArgumentException
[xslt] Failed to process
/opt/RootCrawl/nutch-0.9/trunk/src/web/include/de/header.xml
If anybody has an idea about this, please let me know.
I am waiting for your reply.
Thanks in advance,
--
View this message in context:
http://www.nabble.com/how-to-create-NGRAM-INDEX-tf4111020.html#a11689163
Sent from the Nutch - User mailing list archive at Nabble.com.