Oct 27 02:08 .svn/
-rw-rw-r-- 1 otis otis 1528 Jun 5 14:27 ThaiAnalyzer.java
-rw-rw-r-- 1 otis otis 2437 Jun 5 14:27 ThaiWordFilter.java
Otis
-Original Message-
From: Teruhiko Kurosaka [EMAIL PROTECTED]
To: sanjeev [EMAIL PROTECTED];
nutch-dev@lucene.apache.org
Sent
Sanjay,
I don't think you should follow the Chinese example and extend the CJK range.
This was needed because Chinese and Japanese don't use spaces to separate
words. I believe Thai uses spaces, right? If so, you should extend the
LETTER range to include Thai characters rather than CJK.
Another place
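
As a rough illustration of which characters a LETTER extension for Thai would have to cover: the Unicode Thai block runs from U+0E00 to U+0E7F. The sketch below tests membership with the standard Character.UnicodeBlock API. The class name ThaiRangeSketch and its methods are hypothetical and not part of Nutch or Lucene, and the sketch does not show the JavaCC grammar change itself.

```java
// Hypothetical helper for illustration only; not a Nutch or Lucene class.
final class ThaiRangeSketch {

    // True if the code point lies in the Unicode Thai block (U+0E00..U+0E7F).
    static boolean isThai(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.THAI;
    }

    // True if at least one code point of the text lies in the Thai block.
    static boolean containsThai(String text) {
        return text.codePoints().anyMatch(ThaiRangeSketch::isThai);
    }

    public static void main(String[] args) {
        // A Thai word written with Unicode escapes, then a plain ASCII word.
        System.out.println(containsThai("\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35"));
        System.out.println(containsThai("hello"));
    }
}
```

A check like this can help confirm that test input really contains Thai characters before looking at how the tokenizer treats them.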
Please disregard this posting. It was my oversight. build.xml does
have a javacc rule.
So this is just a version difference of javacc?
-kuro
-Original Message-
From: Teruhiko Kurosaka
Sent: 2006-10-18 17:42
To: nutch-dev@lucene.apache.org
Cc: Teruhiko Kurosaka
Subject: What
I am trying to modify the JavaCC rules in NutchAnalysis.jj.
As a preparation, I ran javacc (ver 3.2) to compile
NutchAnalysis.jj of Nutch 0.8, but the generated
Java files are a little bit different from those
found in the src/java directory. Am I supposed to use
some javacc command line options?
I developed a plugin and tried to run it using nutch plugin
plugin-name plugin-fully-qualified-class-name arg1 arg2 of
Nutch 0.8.
But it says my plugin is not present or inactive.
I tried the nutch plugin command with a known plugin
language-identifier as:
./nutch plugin languageidentifier
Hello,
I see many plugins named lib- which are wrappers around
other non-plugin .jar files.
For example, the analysis-de plugin uses the lib-lucene-analyzers plugin,
which in turn references the jar file that contains GermanAnalyzer.
What is the reason for this indirection? The plugins called
by
May I suggest someone take a look at NUTCH-266 before releasing 0.8?
Nutch build as of half a month ago was not working for me and another
person.
-kuro
-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 2006-7-05 11:53
To: nutch-dev@lucene.apache.org
Thank you for your reply, Sami.
I do not intend to run hadoop at all, so this
hadoop-site.xml is empty.
...
You should at least set values for 'mapred.system.dir' and
'mapred.local.dir'
and point them to a dir that has enough space available (I think they
default to under /tmp at least on
How about introducing these changes in an effort to force the nutch
admins
to properly edit the bot identity strings?
1. Add the http.agent.* entries to nutch-site.xml with the value being
EDITME.
The description should clearly state that these values *must* be
edited
to reflect the true
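
One way the proposal above could be enforced is a check that refuses to proceed while any http.agent.* value is still the EDITME placeholder. The following is a minimal sketch under stated assumptions: the class AgentConfigCheckSketch and the method uneditedAgentKeys are hypothetical names, the configuration is modeled as a plain Map rather than Nutch's own configuration object, and treating blank values as unedited is an added assumption not stated in the message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical check for illustration only; not an existing Nutch class.
final class AgentConfigCheckSketch {

    // Placeholder value suggested in the message above.
    static final String PLACEHOLDER = "EDITME";

    // Returns the names of http.agent.* properties that are missing a value,
    // blank (an assumption of this sketch), or still set to the placeholder.
    static List<String> uneditedAgentKeys(Map<String, String> props) {
        List<String> unedited = new ArrayList<>();
        for (Map.Entry<String, String> entry : props.entrySet()) {
            if (!entry.getKey().startsWith("http.agent.")) {
                continue; // only the bot identity properties are checked
            }
            String value = entry.getValue();
            if (value == null || value.trim().isEmpty()
                    || PLACEHOLDER.equals(value.trim())) {
                unedited.add(entry.getKey());
            }
        }
        return unedited;
    }
}
```

A caller could log the returned names and stop before fetching if the list is not empty.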
Nutch developers,
I'm writing a language analyzer and have three questions.
Any pointer will be appreciated.
1. How do I turn on the logging facility?
2. Is there an easy way to run just an analyzer plugin, rather than
running nutch crawl?
3. How do I run a debugger (eclipse, in my case) over
Dear Webmaster of
http://lucene.apache.org/nutch/
In the menu bar, under the Documentation heading
there is an item called i18n. The web page
linked from i18n talks about how to translate
(localize) the search GUI. This is not i18n
(internationalization) which should mean designing
and
Jérôme, or anybody familiar with language plugin architecture,
I am writing a language analyzer plugin. This plugin has configurable
parameters, which I am hoping I can add to nutch-site.xml. But
the German and French plugin examples don't access the
Configuration object. Does the current
Hello Jérôme,
Because of other issues at work, I was away from Nutch.
Now I'm back, and I see you are making progress according
to your notes in jira.
Is there an API doc or design doc that I can read to
understand where you are? Is the language plugin architecture
already in the main trunk?
13 matches