Hi Hoss,
No, there wasn't anything wrong with your suggestions, except that they had landed
in my junk mail for some reason, stupid Outlook.
I haven't had a chance to test all of your suggestions yet, but I had already
implemented my own Similarity class that has the coord fixed to 1. And it
However, the BooleanQuery's disableCoord seems to take effect.
But I still have the problem when I'm constructing queries with wildcards.
/
Marcus
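
For readers unfamiliar with the coord factor discussed above, here is a minimal plain-Java sketch. It does not use Lucene itself. It assumes the formula documented for Lucene's DefaultSimilarity, coord = overlap / maxOverlap, and compares it with a coord fixed to 1. The class name CoordSketch and the raw score value are made up for illustration.

```java
public class CoordSketch {
    // Assumed formula, as documented for DefaultSimilarity.coord:
    // the fraction of query clauses that the document matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // A coord "fixed to 1": the number of matched clauses no longer scales the score.
    static float fixedCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    public static void main(String[] args) {
        float rawScore = 2.0f; // hypothetical sum of the matching clause scores
        int matched = 1;       // clauses the document matched
        int total = 4;         // clauses in the boolean query

        System.out.println("default coord: " + rawScore * defaultCoord(matched, total));
        System.out.println("fixed coord:   " + rawScore * fixedCoord(matched, total));
    }
}
```

With the default formula, a document matching one clause out of four keeps only a quarter of its raw score, which is why a query that expands to many clauses can behave very differently depending on whether coord is applied.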
-Original message-
From: Marcus Falck [mailto:[EMAIL PROTECTED]
Sent: 12 September 2006 09:34
To:
Autonomy's KeyView is an alternative to Stellent. It does not cover all of
the file formats that Stellent does, though many of them are probably not
interesting for most applications. When I last looked at it it did not
handle mail archives, though there was a plan to add it. I found it more
As far as I know there isn't a way to do this. What we do is add a
metadata document to each index that includes the creation date, the user
name of the creating user, and various other tidbits. This gets updated on
incremental updates to the index as well. Easily done and makes it easy to
query.
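
The metadata-document idea can be sketched in plain Java. This is only a model of the approach, not Lucene code: the index is a list of field maps, and the field names md_creator, md_created and md_updated, the helper names, and the sample values are all hypothetical. The point it shows is that one document in the index carries only metadata fields and is found by asking for a field that the regular documents do not have.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MetadataDocSketch {
    // Builds a "document" as a map of field name -> value from key/value pairs.
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> d = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            d.put(keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    // Returns the first document that contains the given field, or null if none does.
    static Map<String, String> findByField(List<Map<String, String>> index, String field) {
        for (Map<String, String> d : index) {
            if (d.containsKey(field)) {
                return d;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = new ArrayList<>();
        // A regular content document.
        index.add(doc("title", "User guide", "body", "some text"));
        // The single metadata document; it shares no fields with regular documents.
        index.add(doc("md_creator", "tom", "md_created", "2006-09-12"));

        // On an incremental update, rewrite the metadata document's fields.
        Map<String, String> meta = findByField(index, "md_creator");
        meta.put("md_updated", "2006-09-13");

        System.out.println(meta);
    }
}
```

In a real index the lookup would be a query on one of the metadata-only fields, and the update would be a delete followed by a re-add of that one document.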
Hi everybody,
is it possible to store fields without term position data (the .prx file)? We
store some custom data in the field and use it as a kind of filter for queries,
so we don't need any term position data, and it bloats the index size by nearly
a factor of 3.
Thanks
Timo
: However the BooleanQuery's disableCoord seems to make effect.
: But I still have the problem when I'm constructing queries with wildcards.
really? ... that's strange, WildcardQuery uses the disableCoord feature of
BooleanQuery. Do you have an example of what you mean?
: already had
Hi Eric/Usergroup,
I am working on a help content index-search project based on Lucene.
One of my requirements is to search for a particular text in the content
of files from specific directories. When I index the content
Eg. guides/accountmanagement/index.htm and
Tom:
great! Now how do you add metadata? I am new to the Lucene API + Java, but
willing to learn.
Got an example?
TIA
On 9/12/06, Tom Emerson [EMAIL PROTECTED] wrote:
As far as I know there isn't a way to do this. What we do is add a
metadata document to each index that includes the creation
I don't know if the use of a DATALINK data type would be relevant in your
case.
Here are some references.
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/start/c0005450.htm
Interestingly, you have extra spaces when you construct your queries, e.g.
queries[2]= accountmanagement has an extra space at the beginning but
when you index the document, there are no spaces. I believe that since
you're indexing the fields UN_TOKENIZED, that the spaces are preserved in
the
Just add another document (I do something similar). The key is to remember
that documents in the same index do NOT have to have the same fields. So,
say for your regular documents, you have fields (f1, f2, f3, f4). For your
meta-data document, you index fields (md1, md2, md3...). The value for
The spaces just came in, I guess, when I copied the code to Outlook :-); actually
there aren't any. Let me take a look at Luke, especially testing to see what
should be returned when I run the parsed query. Sounds very interesting.
Thanks a lot
Pramodh
From:
Thanks for the links Michael... this one does look interesting:
http://dev.alt.textdrive.com/browser/lu/LUStringBasicLatin.txt
The challenge would be to make it fast... perhaps a custom hash table,
or look into the cost of a perfect hash function.
Just to clear up some unicode/terminology
I think option B cannot work because, due to the MUST operator, it requires
both databasemanagement and accountmanagement to be in the subtype
field.
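
A small plain-Java model may make the MUST point concrete. It is a simplified sketch, not Lucene code: a field is treated as a set of exact terms, "MUST on every clause" means all terms have to be present, and "SHOULD" means at least one. The class and method names are hypothetical; the two term values are the ones from this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MustShouldSketch {
    // Every clause is MUST: the field has to contain all of the given terms.
    static boolean matchesAll(Set<String> fieldTerms, String... required) {
        return fieldTerms.containsAll(Arrays.asList(required));
    }

    // Every clause is SHOULD: one matching term is enough.
    static boolean matchesAny(Set<String> fieldTerms, String... optional) {
        for (String term : optional) {
            if (fieldTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A document whose subtype field holds a single value.
        Set<String> subtype = new HashSet<>(Arrays.asList("accountmanagement"));

        // false: both terms are required, only one is present
        System.out.println(matchesAll(subtype, "databasemanagement", "accountmanagement"));
        // true: either term is enough
        System.out.println(matchesAny(subtype, "databasemanagement", "accountmanagement"));
    }
}
```

Under this model the MUST form only matches a document whose subtype field holds both values at once.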
Option A however should work, once the padding blank spaces are removed
from the field name - notice that while the standard analyzer would trim
As long as the field is added to the *same* document, I don't see a problem
with option B, although I'll admit that I haven't used
MultiFieldQueryParser. But there was a discussion a while ago about adding
tokens with the same field name to a document via document.add being exactly
the same as