how to search documents that value is ;

2004-08-31 Thread juan lu
Hi,

I have a field named bidcode in my index. The values look like this:

Doc1:;
Doc2:;a0213;a0245
Doc3:;
Doc4:;
Doc5:;a2148;a0214


How can I search for all the documents in which the value of this field is ;?

I build the query like this:
Query query = QueryParser.parse("bidcode:\";\"", "content", analyzer);

It finds nothing. Why is that?

Many thanks for help!   


alternative query syntax?

2004-08-31 Thread petite_abeille
Hello,
I would like to provide an alternative query syntax for ranges by using 
a colon (':') or two dots ('..') instead of ' TO '.

For example:
mod_date:[20020101:20030101]
Or
mod_date:[20020101..20030101]
What would be the correct procedure to modify the QueryParser to 
achieve this? Should I simply change QueryParser.jj's RANGEIN_TO and 
RANGEEX_TO to the appropriate character sequence and regenerate the 
corresponding Java classes with JavaCC?

Any pointers appreciated as I'm not familiar with JavaCC :)
TIA.
Cheers,
PA.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: TR : -- Lucene, error when indexing demo !

2004-08-31 Thread Otis Gospodnetic
JPhD, you have to add the _JAR_ (lucene-1.4.1-final.jar probably) to
your CLASSPATH.  You may also have to add the demo Jar to the
CLASSPATH.
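
A quick way to confirm what the JVM actually sees is to print the effective class path and try to load the class by name. The sketch below uses only the JDK; the class name ClasspathCheck and the method isLoadable are made up for illustration. Keep in mind that a directory on the CLASSPATH only exposes loose .class files, so each JAR has to be listed by its own file name.

```java
public class ClasspathCheck {

    // Returns true if the named class can be found by this class's class loader.
    // The class is looked up but not initialized.
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Shows the class path the JVM was really started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));

        // The demo class from the question, plus a core Lucene class for comparison.
        String[] names = {
            "org.apache.lucene.demo.IndexFiles",
            "org.apache.lucene.index.IndexWriter"
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

If the core class loads but the demo class does not, only the demo JAR is missing from the class path.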

Otis

--- J.Ph DEGLETAGNE [EMAIL PROTECTED] wrote:

 Hello everybody,
  
 I want to test Lucene.
  
 My configuration : Windows XP...
  
 First, I have extracted the latest Lucene distribution in
 D:\Program Files\Apache Software Foundation to get 
 D:\Program Files\Apache Software Foundation\lucene-1.4-final
 D:\Program Files\Apache Software Foundation\lucene-1.4-final\docs
 D:\Program Files\Apache Software Foundation\lucene-1.4-final\src
 so demo is in
 D:\Program Files\Apache Software Foundation\lucene-1.4-final\src\demo
  
 and I set my classpath like this:
 classpath=D:\Program Files\Apache Software Foundation\lucene-1.4-final;D:\Program Files\Apache Software Foundation\lucene-demos-1.4-final
  
 Second, to index the files, I type:
 java org.apache.lucene.demo.IndexFiles D:/Program Files/Apache Software Foundation/lucene-1.4-final/src
 or
 java org.apache.lucene.demo.IndexFiles D:\Program Files\Apache Software Foundation\lucene-1.4-final\src
 and I get:
 Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/demo/IndexFiles
  
 Where is my error?
  
 Thanks a lot!
  
 JPhD
  
 





Re: indexing size

2004-08-31 Thread Otis Gospodnetic
Are you using a pre-1.4.1 version of Lucene?  There was a bug in one of
the older versions that left multiple old index files around instead
of deleting them.  Maybe that's what is using up the disk space.  Please
send us the 'ls -al' or 'dir' output for your index directory.

Otis

--- Niraj Alok [EMAIL PROTECTED] wrote:

 Hi Guys,
 
 If you have any ideas, please help me out. I have looked through most of
 the Lucene archives, and they suggest what I am already doing. So the
 only possible solution for me right now would be to reduce the number of
 fields, which could severely change the logic used for searching.
 
 
 Regards,
 Niraj
 - Original Message -
 From: Niraj Alok [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Tuesday, August 31, 2004 11:17 AM
 Subject: indexing size
 
 
  Hi,
 
  I am indexing plain XML files, the total size of which is around 100 MB.
  I am creating two indexes for different modules, and they are stored in
  different directories as I am not merging them. The problem is that the
  combined size of these indexes is about 300 MB (3 times the data size),
  in contrast to the 35% I have read it should be.
  Both indexes have different fields and store different data, so no
  duplication is occurring.
 
  I have one IndexWriter for each index. After both indexes have been
  created, I simply call optimize on the two writers and close them.
 
  Is there something I am doing wrong? I am using writer.addDocument(doc).
 
  Regards,
  Niraj
 
 





Lucene Minor Version ????

2004-08-31 Thread Karthik N S


Hi Guys,

Apologies...

I was just curious to know whether Lucene-1.4.1-final.jar is a minor version
change of Lucene-1.4-final.jar, or ...?

Thanks in advance.

  WITH WARM REGARDS
  HAVE A NICE DAY
  [ N.S.KARTHIK]







Re: how to search documents that value is ;

2004-08-31 Thread Otis Gospodnetic
Which Analyzer did you use for indexing, and which one are you using
for searching?  You have to make sure that neither of them discards
that ; character.

Use Luke to double-check that fields in your index still have that ;
character, and then make sure that the Analyzer you are passing to
QueryParser doesn't throw away ; characters.
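
To see why an analyzer matters here, the following plain-Java sketch mimics a tokenizer that keeps only letters and digits. It is a hypothetical stand-in, not Lucene's own analyzer code, and the class and method names are invented for illustration. With such a tokenizer the value ; produces no tokens at all, so there is nothing in the index for the query to match.

```java
import java.util.ArrayList;
import java.util.List;

public class SemicolonTokens {

    // Hypothetical stand-in for an analyzer that keeps runs of letters and
    // digits and silently drops everything else, including ';'.
    public static List<String> lettersAndDigitsOnly(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Stand-in for indexing the whole value as one untokenized term.
    public static List<String> wholeValue(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lettersAndDigitsOnly(";"));            // no tokens survive
        System.out.println(lettersAndDigitsOnly(";a0213;a0245")); // only the codes survive
        System.out.println(wholeValue(";"));                      // the value is kept as a single term
    }
}
```

In Lucene 1.4 terms, the second approach roughly corresponds to indexing the field with Field.Keyword and searching it with a TermQuery built from new Term("bidcode", ";"), which bypasses QueryParser and the analyzer altogether. Whether that fits depends on how the other bidcode values need to be searched.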

Otis





Re: Lucene Minor Version ????

2004-08-31 Thread Otis Gospodnetic
Yes.  Minor bug fix, minor version.  See the link to the CHANGES.txt file on
Lucene's home page.

Otis
P.S.
No need to apologize :)




Re: alternative query syntax?

2004-08-31 Thread Otis Gospodnetic
I'm not on very friendly terms with JavaCC either, but I think what you
are saying is correct - it looks like lines 519 and 526 (the ones that
define 'TO') are the ones to change, in Lucene CVS HEAD.
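
If regenerating the parser with JavaCC turns out to be awkward, a different approach (not the grammar change discussed above) is to rewrite the query string before it reaches QueryParser. The sketch below is plain Java; the class name RangeSyntaxRewriter is invented, and it assumes range endpoints contain no whitespace, colons, dots, brackets or braces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RangeSyntaxRewriter {

    // Matches [a..b], [a:b], {a..b} and {a:b}.
    // Assumption: endpoints contain no whitespace, ':', '.', brackets or braces.
    private static final Pattern RANGE = Pattern.compile(
            "([\\[{])([^\\s:.\\[\\]{}]+)(?:\\.\\.|:)([^\\s:.\\[\\]{}]+)([\\]}])");

    // Rewrites the alternative range separators into the standard " TO " form.
    public static String rewrite(String query) {
        Matcher m = RANGE.matcher(query);
        return m.replaceAll("$1$2 TO $3$4");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("mod_date:[20020101..20030101]"));
        System.out.println(rewrite("mod_date:[20020101:20030101]"));
    }
}
```

The rewritten string can then be passed to the unmodified QueryParser. The trade-off is that a regular expression knows nothing about quoting or escaping in the query, so changing the grammar remains the more robust route.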

Otis




Re: indexing size

2004-08-31 Thread Niraj Alok
Hi Otis,

Here are the dir results (I am using 1.3 final):

 Volume in drive C has no label.
 Volume Serial Number is 3767-CD49

 Directory of
C:\eclipse\jakarta-tomcat-5.0.19\webapps\HCPF\WEB-INF\classes\indexall

23/08/2004  10:50   <DIR>   .
23/08/2004  10:50   <DIR>   ..
21/08/2004  17:32         4 deletable
21/08/2004  17:32        26 segments
21/08/2004  17:31   183,796 _4dkv.f1
21/08/2004  17:31   183,796 _4dkv.f10
21/08/2004  17:31   183,796 _4dkv.f100
21/08/2004  17:31   183,796 _4dkv.f101
21/08/2004  17:31   183,796 _4dkv.f102
21/08/2004  17:31   183,796 _4dkv.f103
21/08/2004  17:31   183,796 _4dkv.f104
21/08/2004  17:31   183,796 _4dkv.f105
21/08/2004  17:31   183,796 _4dkv.f106
21/08/2004  17:31   183,796 _4dkv.f107
21/08/2004  17:31   183,796 _4dkv.f108
21/08/2004  17:31   183,796 _4dkv.f109
21/08/2004  17:31   183,796 _4dkv.f11
21/08/2004  17:31   183,796 _4dkv.f110
21/08/2004  17:31   183,796 _4dkv.f111
21/08/2004  17:31   183,796 _4dkv.f112
21/08/2004  17:31   183,796 _4dkv.f113
21/08/2004  17:31   183,796 _4dkv.f114
21/08/2004  17:31   183,796 _4dkv.f115
21/08/2004  17:31   183,796 _4dkv.f116
21/08/2004  17:31   183,796 _4dkv.f117
21/08/2004  17:31   183,796 _4dkv.f118
21/08/2004  17:31   183,796 _4dkv.f119
21/08/2004  17:31   183,796 _4dkv.f12
21/08/2004  17:31   183,796 _4dkv.f120
21/08/2004  17:31   183,796 _4dkv.f121
21/08/2004  17:31   183,796 _4dkv.f122
21/08/2004  17:31   183,796 _4dkv.f123
21/08/2004  17:31   183,796 _4dkv.f124
21/08/2004  17:31   183,796 _4dkv.f125
21/08/2004  17:31   183,796 _4dkv.f126
21/08/2004  17:31   183,796 _4dkv.f127
21/08/2004  17:31   183,796 _4dkv.f128
21/08/2004  17:31   183,796 _4dkv.f129
21/08/2004  17:31   183,796 _4dkv.f13
21/08/2004  17:31   183,796 _4dkv.f130
21/08/2004  17:31   183,796 _4dkv.f131
21/08/2004  17:31   183,796 _4dkv.f132
21/08/2004  17:31   183,796 _4dkv.f133
21/08/2004  17:31   183,796 _4dkv.f134
21/08/2004  17:31   183,796 _4dkv.f135
21/08/2004  17:31   183,796 _4dkv.f136
21/08/2004  17:31   183,796 _4dkv.f137
21/08/2004  17:31   183,796 _4dkv.f138
21/08/2004  17:31   183,796 _4dkv.f139
21/08/2004  17:31   183,796 _4dkv.f14
21/08/2004  17:31   183,796 _4dkv.f140
21/08/2004  17:31   183,796 _4dkv.f141
21/08/2004  17:31   183,796 _4dkv.f142
21/08/2004  17:31   183,796 _4dkv.f143
21/08/2004  17:31   183,796 _4dkv.f144
21/08/2004  17:31   183,796 _4dkv.f145
21/08/2004  17:31   183,796 _4dkv.f146
21/08/2004  17:31   183,796 _4dkv.f147
21/08/2004  17:31   183,796 _4dkv.f148
21/08/2004  17:31   183,796 _4dkv.f149
21/08/2004  17:31   183,796 _4dkv.f15
21/08/2004  17:31   183,796 _4dkv.f150
21/08/2004  17:31   183,796 _4dkv.f151
21/08/2004  17:31   183,796 _4dkv.f152
21/08/2004  17:31   183,796 _4dkv.f153
21/08/2004  17:31   183,796 _4dkv.f154
21/08/2004  17:31   183,796 _4dkv.f155
21/08/2004  17:31   183,796 _4dkv.f156
21/08/2004  17:31   183,796 _4dkv.f157
21/08/2004  17:31   183,796 _4dkv.f158
21/08/2004  17:31   183,796 _4dkv.f159
21/08/2004  17:31   183,796 _4dkv.f16
21/08/2004  17:31   183,796 _4dkv.f160
21/08/2004  17:31   183,796 _4dkv.f161
21/08/2004  17:31   183,796 _4dkv.f162
21/08/2004  17:31   183,796 _4dkv.f163
21/08/2004  17:31   183,796 _4dkv.f164
21/08/2004  17:31   183,796 _4dkv.f165
21/08/2004  17:31   183,796 _4dkv.f166
21/08/2004  17:31   183,796 _4dkv.f167
21/08/2004  17:31   183,796 _4dkv.f168
21/08/2004  17:31   183,796 _4dkv.f169
21/08/2004  17:31   183,796 _4dkv.f17
21/08/2004  17:31   183,796 _4dkv.f170
21/08/2004  17:31   183,796 _4dkv.f171
21/08/2004  17:31   183,796 _4dkv.f172
21/08/2004  17:31   183,796 _4dkv.f173
21/08/2004  17:31   183,796 _4dkv.f174
21/08/2004  17:31   183,796 _4dkv.f175
21/08/2004  17:31   183,796 _4dkv.f176
21/08/2004  17:31   183,796 _4dkv.f177
21/08/2004  17:31   183,796 _4dkv.f178
21/08/2004  17:31   183,796 _4dkv.f179
21/08/2004  17:31   183,796 _4dkv.f18
21/08/2004  17:31   183,796 _4dkv.f180
21/08/2004  17:31   183,796 _4dkv.f181
21/08/2004  17:31   183,796 _4dkv.f182
21/08/2004  17:31   183,796 _4dkv.f183
21/08/2004  17:31   183,796 _4dkv.f184
21/08/2004  17:31   183,796 _4dkv.f185
21/08/2004  17:31   183,796 _4dkv.f186
21/08/2004  17:31   183,796 _4dkv.f187
21/08/2004  17:31

Re: indexing size

2004-08-31 Thread Otis Gospodnetic
This looks optimized and healthy.  You also have a large number of
fields, and it looks like a lot (all?) of them are stored and indexed.
That's what that large .fdt file indicates.  That file is 206 MB in
size.  Have you looked into your index to make sure you don't have
duplicate Documents in there?  Lucene will allow duplicate Documents,
because there is no Document uniqueness/PK-like functionality.
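
Since Lucene does not enforce uniqueness, any duplicate check has to happen in application code. A minimal plain-Java sketch, assuming each Document carries some identifying value such as a source file name (the class DuplicateKeys and the sample keys are hypothetical):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DuplicateKeys {

    // Returns every key that occurs more than once in the given list.
    public static Set<String> findDuplicates(List<String> keys) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String key : keys) {
            // Set.add returns false when the key was already present.
            if (!seen.add(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Hypothetical identifying values read back from the documents of an index.
        List<String> keys = new ArrayList<>();
        keys.add("a.xml");
        keys.add("b.xml");
        keys.add("a.xml");
        System.out.println("Duplicated keys: " + findDuplicates(keys));
    }
}
```

The same set-based check can be applied at indexing time, skipping or replacing a document whose key has already been seen.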

Otis


Re: indexing size

2004-08-31 Thread petite_abeille
On Aug 31, 2004, at 17:17, Otis Gospodnetic wrote:

 You also have a large number of
 fields, and it looks like a lot (all?) of them are stored and indexed.
 That's what that large .fdt file indicated.  That file is  206 MB in
 size.

Try using Field.UnStored() to avoid storing all that data in your
indices, as it's usually not necessary.

PA.


-- TomCat/Lucene, filesystem

2004-08-31 Thread J.Ph DEGLETAGNE
Hello everybody,
 
...I beg your pardon...
 
Under Windows XP / Tomcat,
 
how can I configure the Lucene webapp to access filesystem directories that are
outside Tomcat?
For example, Tomcat is in
D:\Program Files\Apache Software Foundation\Tomcat 5.0\..
and needs to access
E:\Data
 
Thanks a lot
 
JPhD


RE: -- TomCat/Lucene, filesystem

2004-08-31 Thread Rupinder Singh Mazara
I have a web application using Lucene via Tomcat.
You may need to set the correct permissions in your catalina.policy file.

I use a blanket policy of
grant {
   permission java.io.FilePermission "/", "read";
};

to allow access for Lucene.
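
One detail worth checking in a blanket grant like the one above: in java.io.FilePermission, a path of "/" names only that directory itself, "/*" covers the files directly inside it, and "/-" covers everything beneath it recursively. The plain-Java sketch below compares a few grants against a requested file; the class name PolicyScope and the Unix-style index path are hypothetical.

```java
import java.io.FilePermission;

public class PolicyScope {

    // Returns true if a "read" grant on grantedPath would cover reading requestedFile.
    public static boolean covers(String grantedPath, String requestedFile) {
        FilePermission granted = new FilePermission(grantedPath, "read");
        FilePermission requested = new FilePermission(requestedFile, "read");
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        // Hypothetical location of a Lucene index file outside the webapp.
        String indexFile = "/data/index/segments";

        System.out.println("\"/\"             covers it: " + covers("/", indexFile));
        System.out.println("\"/-\"            covers it: " + covers("/-", indexFile));
        System.out.println("\"/data/index/-\" covers it: " + covers("/data/index/-", indexFile));
    }
}
```

A grant limited to the index directory is narrower than a blanket grant on the whole filesystem, which is usually the safer choice. Note that permissions in catalina.policy only take effect when Tomcat runs with a security manager enabled.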





Lucene 1.4.1 not listed on jakarta downloads page

2004-08-31 Thread Armbrust, Daniel C.
FYI
I was able to find Lucene 1.4.1 here:  
http://cvs.apache.org/dist/jakarta/lucene/v1.4.1/

But if I go here:
http://jakarta.apache.org/site/binindex.cgi

1.4 is the only lucene download option available.

Dan




Re: Lucene 1.4.1 not listed on jakarta downloads page

2004-08-31 Thread Erik Hatcher
Thanks for reporting this.  We actually know.  Lucene 1.4.1 was not 
released properly, and it is going to require someone to do so.

I've done the last two releases, but have been swamped lately.  If no 
one beats me to it, I'll hopefully get around to this in the near 
future.

Erik


.NET version - separate mailing list?

2004-08-31 Thread Jan Agermose
Hi
 
I'm having some trouble using Lucene, but it is the .NET port. Should I ask questions
about the different analyzers and tokenizers on this mailing list or on some other one?
 
Also, does anyone know the difference between the two .NET implementations listed on
the Lucene website?
 
http://jakarta.apache.org/lucene/docs/resources.html
 
Best Regards
Jan Agermose


Re: indexing size

2004-08-31 Thread Niraj Alok
I was also thinking along the same lines.
Actually, the original code was written by someone else who has since left, so
I have to own this.

In almost all places it is Field.Text, and in a few places it's
Field.UnIndexed.
I looked at the javadocs and found that there is also Field.UnStored.

The problem is that I am not sure which ones to change to what. It would be
really enlightening if you could point out the differences
between those three and what I would need to change in my search code.

If I make some of them Field.UnStored, I can see from the javadocs that they
will be indexed and tokenized but not stored. If a field is not stored, how can I
use it while searching? Basically, what is meant by indexed and stored,
indexed and not stored, and not indexed and stored?
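
As a compact summary of those combinations, the sketch below models the Lucene 1.4 Field factory methods as a plain-Java enum. The enum itself is made up for illustration and is not part of Lucene. The flags reflect the documented behaviour of Field.Text(String, String), Field.UnStored, Field.UnIndexed and Field.Keyword: "indexed" means the field can be searched, and "stored" means the original value can be read back from a search hit.

```java
public class FieldKinds {

    // Illustrative model of the Lucene 1.4 Field factory methods (not Lucene code).
    public enum Kind {
        TEXT(true, true, true),        // Field.Text(String, String)
        UNSTORED(false, true, true),   // Field.UnStored
        UNINDEXED(true, false, false), // Field.UnIndexed
        KEYWORD(true, true, false);    // Field.Keyword

        public final boolean stored;
        public final boolean indexed;
        public final boolean tokenized;

        Kind(boolean stored, boolean indexed, boolean tokenized) {
            this.stored = stored;
            this.indexed = indexed;
            this.tokenized = tokenized;
        }

        // A field can be queried only if it is indexed.
        public boolean searchable() {
            return indexed;
        }

        // A field's value can be read back from a hit only if it is stored.
        public boolean retrievable() {
            return stored;
        }
    }

    public static void main(String[] args) {
        for (Kind kind : Kind.values()) {
            System.out.println(kind + ": searchable=" + kind.searchable()
                    + ", retrievable=" + kind.retrievable()
                    + ", tokenized=" + kind.tokenized);
        }
    }
}
```

A field that is only ever searched can be switched from Text to UnStored without touching the query code. Only code that reads the field back from a hit, for example via doc.get(...), would need to change.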


Regards,
Niraj




Re: indexing size

2004-08-31 Thread Stephane James Vaucher
On Wed, 1 Sep 2004, Niraj Alok wrote:

 I was also thinking along the same lines.
 Actually, the original code was written by someone else who has since left, so
 I have to own this.

 In almost all places it is Field.Text, and in a few places it's
 Field.UnIndexed.
 I looked at the javadocs and found that there is also Field.UnStored.

 The problem is that I am not sure which ones to change to what. It would be
 really enlightening if you could point out the differences
 between those three and what I would need to change in my search code.

 If I make some of them Field.UnStored, I can see from the javadocs that
 they will be indexed and tokenized but not stored. If a field is not stored,
 how can I use it while searching? Basically, what is meant by indexed and
 stored, indexed and not stored, and not indexed and stored?

If all you need is to search a field, you do not need to store it. If it is
not stored, it can still be tokenised and analysed by Lucene. It will then
be kept only as a set of tokens, not as a whole. You can thus use this
for fields that you never need to retrieve from the index.

For example:
the quick brown fox jumped over the lazy dog.

will be stored in Lucene only as tokens, not as a whole. So, using a
whitespace analyser with a stopword list {the},

you will have these tokens in Lucene:
quick
brown
fox
jumped
over
lazy
dog

You will NOT be able to retrieve the original text, but you will be able
to search it.
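
The idea can be sketched in plain Java with a toy inverted index. This is an illustration only, not how Lucene is implemented: the class name is invented, and lower-casing plus stripping of leading and trailing punctuation are assumptions added so that "dog." becomes "dog". The index keeps only token-to-document postings, so a search works even though the original sentence is never stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UnstoredFieldSketch {

    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the"));

    // token -> ids of the documents containing it; the original text is never kept
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Splits on whitespace, lower-cases, strips punctuation at the edges
    // (an assumption of this sketch) and drops stop words.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT).replaceAll("^\\W+|\\W+$", "");
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Indexes the text without storing it and returns the new document id.
    public int addUnstored(String text) {
        int docId = nextDocId++;
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of the documents containing the given term.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        UnstoredFieldSketch index = new UnstoredFieldSketch();
        index.addUnstored("the quick brown fox jumped over the lazy dog.");
        System.out.println(tokenize("the quick brown fox jumped over the lazy dog."));
        System.out.println("docs matching 'fox': " + index.search("fox"));
    }
}
```

Searching for "fox" finds the document, while the sentence itself cannot be reconstructed from the index because word order, stop words and punctuation were discarded.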

HTH,
sv

