Munir wrote:
Can you please tell me if it is possible to use NGramProfile to create
an Arabic profile? If so, how? I tried to run this command
but I got an error:
java org.apache.nutch.analysis.lang.NGramProfile -create <ar> <arabic> <windows-1256>
error : syntax error near unexpected token `<'
And how do I create the ar.ngp file?
Please help me use the analysis for Arabic. Can you tell me the
steps one by one?
I am using Nutch 0.9-dev on Linux with Tomcat 5.5.20.
Thanks in advance.
Yes, you can.
NGramProfile works on java.lang.String, which is UTF-16 internally, so you
can create an n-gram profile for Arabic (I assume the Arabic characters can
be represented in UTF-16). Do not put '<' and '>' around the arguments;
give the command as:
java org.apache.nutch.analysis.lang.NGramProfile -create ar arabic windows-1256
The Linux shell uses < and > to redirect standard input and output, which
is why you got the syntax error.
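
To see what the shell does with the angle brackets, here is a small sketch that can be tried without Nutch installed. The show_args function is a hypothetical helper written for this illustration (it is not part of Nutch); it just prints each argument it receives on its own line.

```shell
# Hypothetical helper (not part of Nutch): prints each argument it is given,
# one per line, so we can see what a program would actually receive.
show_args() {
  printf '%s\n' "$@"
}

# Plain arguments, as in the corrected command: all four words arrive intact.
show_args -create ar arabic windows-1256

# Quoting makes the shell pass the brackets through literally instead of
# treating them as redirections (shown only to illustrate the parsing).
show_args -create '<ar>'

# Unquoted, the same text is parsed as redirections and the shell rejects it.
# It runs in a child shell so the syntax error does not stop this script.
sh -c 'echo -create <ar> <arabic> <windows-1256>' 2>/dev/null \
  || echo "rejected by the shell before the command ran"
```

The last command never starts echo at all: the shell fails while parsing the line, which is the same thing that happened to the java command in the question.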