How are you loading the document into the content variable below? My
guess is still that you have different locales on Windows and Ubuntu.
(Btw, sorry about the java-user comment. I should wake up before
sending responses. For some reason I thought the email was sent to
java-dev)
-Grant
On Feb 18, 2008, at 7:44 AM, kratoras wrote:
Actually what i figured out just now is that the problem is on the
indexing
part. A document with a 15MB size is transformed in a 23MB index
which is
not normal since on windows for the same document the index is 3MB.
For the
indexing i use:
writer = new IndexWriter(index, new GreekAnalyzer(), !index.exists());
and to add documents:
doc.add(new
Field("contents",content,Field.Store.YES,Field.Index.TOKENIZED));
where "content" is a string with the content of the document. Should i
convert this string to UTF-8 using getBytes before i write it to the
index??
--
View this message in context:
http://www.nabble.com/Problem-using-Lucene-on-Ubuntu-tp15543843p15544612.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
http://www.lucenebootcamp.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]