Re: [CLucene-dev] Big index size

2011-02-05 Thread Ahmed Saidi
The problem is solved. It was my mistake: by accident I stored the file text, without tokenization, in the categorie field! Thanks for your help. Ahmed 2011/2/3, Ben van Klinken: > Stored fields are kept as plain text. It is possible to compress the > fields if it is a lot of data, but you co

Re: [CLucene-dev] Big index size

2011-02-03 Thread Ben van Klinken
Stored fields are kept as plain text. It is possible to compress the fields if they hold a lot of data, but you could also look into not storing certain fields (though of course you then won't be able to retrieve that data from the document after a search). Depending on your requirements, this may be worth considering.
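
To make the trade-off above concrete, here is a rough sketch that compares the three options mentioned: storing a field as plain text, storing it compressed, and not storing it at all. It is not CLucene code. The function name stored_bytes, the sample corpus, and the use of zlib as a stand-in for whatever compression the index applies are all assumptions made for illustration, and the numbers only approximate the stored-field part of an index.

```python
import zlib


def stored_bytes(values, mode="plain"):
    """Approximate bytes needed to keep field values retrievable.

    mode is "plain" (raw UTF-8 text), "compressed" (zlib, one value at a
    time) or "none" (field not stored, so nothing can be read back).
    This is a hypothetical model, not CLucene's on-disk format.
    """
    if mode not in ("plain", "compressed", "none"):
        raise ValueError("unknown mode: " + mode)
    if mode == "none":
        return 0
    total = 0
    for value in values:
        raw = value.encode("utf-8")
        total += len(zlib.compress(raw)) if mode == "compressed" else len(raw)
    return total


# Hypothetical corpus: 1000 documents with a short id-like field and a long body.
body = "نص عربي طويل للتجربة " * 200
docs = [{"category": "12", "body": body} for _ in range(1000)]

short_plain = stored_bytes([d["category"] for d in docs])
body_plain = stored_bytes([d["body"] for d in docs])
body_compressed = stored_bytes([d["body"] for d in docs], mode="compressed")
body_not_stored = stored_bytes([d["body"] for d in docs], mode="none")

print("short field, plain:", short_plain)
print("body, plain:", body_plain)
print("body, compressed:", body_compressed)
print("body, not stored:", body_not_stored)
```

In this model a short value such as an id costs only a few bytes per document, while a long text stored as plain text costs its full length again on top of whatever the indexed terms take. Compression helps with long, repetitive values but not with tiny ones, where the compressed form is larger than the raw text.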

Re: [CLucene-dev] Big index size

2011-02-03 Thread Ahmed Saidi
I'm using an Arabic analyzer; it analyzes only Arabic characters. Please see the attached file. There are no duplicate documents, and no IndexReader is open. Ahmed 2011/2/3 Ahmed Saidi > I'm using an Arabic analyzer; it analyzes only Arabic characters. Please see > the attached file. > There is no

Re: [CLucene-dev] Big index size

2011-02-03 Thread Ahmed Saidi
I'm using an Arabic analyzer; it analyzes only Arabic characters. Please see the attached file. There are no duplicate documents, and no IndexReader is open. Ahmed 2011/2/3 Veit Jahns > 2011/2/2 Ahmed Saidi: > > Even after optimizing the index, the size is 20 GB. The size of the > > data which I w

Re: [CLucene-dev] Big index size

2011-02-03 Thread Veit Jahns
2011/2/2 Ahmed Saidi: > Even after optimizing the index, the size is 20 GB. The size of the > data which I want to index is about 8 GB. Strange indeed. Just some further questions that came to mind: - What kind of analyzer do you use for tokenizing? - Is the correct number of documents in

Re: [CLucene-dev] Big index size

2011-02-02 Thread Ahmed Saidi
Even after optimizing the index, the size is 20 GB. The data I want to index is about 8 GB. If I add a set of fields that have the same values to the index, will CLucene do any kind of compression? Ahmed 2011/2/1, Veit Jahns: > Hi Ahmed! > > 2011/2/1 Ahmed Saidi: >> I'm using

Re: [CLucene-dev] Big index size

2011-02-01 Thread Veit Jahns
Hi Ahmed! 2011/2/1 Ahmed Saidi: > I'm using CLucene to index a large set of files. The index size was > about 2 GB. After adding three fields that contain a number, such as > categorie and author id (those fields are not tokenized but stored in the > index), and a large set of documents have the same