Hello,
I need to index arrays of longs, usually long[20] (20 elements).
It's been a while since I worked with lucene; the last time was probably before
version 3.
I read
https://lucene.apache.org/core/6_2_0/core/org/apache/lucene/document/Field.html
There are SortedDocValuesField and SortedSetDocValuesF
Hello,
Do I need to add a key if I will not be
a. updating the document, and
b. fetching the document by key?
What could be the possible downside of not using a key that uniquely
identifies the document?
I am building a log processor, and all I will do is sort and iterate.
Best regards,
C.
ents are otherwise
> tiny, to add one if you don't really need it.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Sep 12, 2016 at 5:42 AM, Cam Bazz wrote:
> > Hello,
> >
> > Do I need to add a key, if I will not be
> >
> >
Hello,
I have a field called timeSlot in my documents, basically representing an
hour.
When a query is made, I would like to graph how many doc hits
correspond to each timeSlot, sort them, and display a chart.
I am simply using term queries to query StringFields, and here is my
re
Hello,
FacetResult getTopChildren returns the top N facets, however I need to
return facets where count is above a certain threshold, for example return
all facets that had counts > 10.
Is there a way to accomplish this? I have been looking over the API docs
and could not find it. I could maybe g
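I don't know of a threshold parameter in that API either; a plain-Java sketch of one workaround (illustrative only, standing in for the facet labels and counts returned by getTopChildren with a generously large N) is to filter the results by count afterwards:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetThreshold {
    // Workaround sketch: request a large top-N from the facet API, then
    // keep only the labels whose count is above the threshold.
    static Map<String, Integer> aboveThreshold(Map<String, Integer> facetCounts, int threshold) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : facetCounts.entrySet()) {
            if (e.getValue() > threshold) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("java", 25);
        counts.put("python", 12);
        counts.put("perl", 7);
        System.out.println(aboveThreshold(counts, 10)); // {java=25, python=12}
    }
}
```

The catch is choosing N large enough that no above-threshold facet falls outside the requested top-N.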
Hello,
I am indexing userAgent fields found in Apache logs, indexing and querying
everything with
KeywordAnalyzer. But I found something strange:
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer q_analyzer = new KeywordAnalyzer();
QueryParser pars
) {
this.lowercaseExpandedTerms = lowercaseExpandedTerms;
}
The query parser lowercases terms only for wildcard, prefix,
fuzzy, and range queries, and it can be turned off with
parser.setLowerCaseExpandedTerms(false);
which solved my problem.
Best regards,
C.
On Thu, Sep 22, 2016 at 5:01 PM, Cam Bazz
hello,
how could I select a random document out of the document
collection inside a lucene index?
best regards,
-C.B.
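One common approach (a sketch under the assumption that deleted docs are rare): pick a random docID in [0, maxDoc) and retry if it happens to be deleted. The isDeleted predicate here is a stand-in for IndexReader.isDeleted:

```java
import java.util.Random;
import java.util.function.IntPredicate;

public class RandomDocPicker {
    // Picks a random live docID in [0, maxDoc), retrying when the chosen
    // id is deleted. Works well unless a large fraction of docs is deleted.
    static int pickRandom(int maxDoc, IntPredicate isDeleted, Random rnd) {
        int id;
        do {
            id = rnd.nextInt(maxDoc);
        } while (isDeleted.test(id));
        return id;
    }

    public static void main(String[] args) {
        // Simulated index: 10 docs, doc 3 deleted.
        int picked = pickRandom(10, id -> id == 3, new Random());
        System.out.println(picked); // some live id in 0..9, never 3
    }
}
```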
Hello,
Recently I developed an interest in making a lucene-based structure for
tagging. As we all know, lucene's updates are not real-time, and one has to
delete a document prior to updating it.
I have been googling for different approaches to a lucene based tagging
structure, and I stumbled upon
ht
Hello,
This came up before, but: if we were to make a swear word filter, string
edit distances are no good. For example, words like `shot` get confused with
`shit`, and there is also a problem with words like hitchcock. Apparently I need
something like soundex or double metaphone. The thing is, these are
Hello Jason,
I have been trying to do this for a long time on my own. Keep up the good
work.
What I tried was a document cache using Apache Collections; before an
index write/delete I would sync the cache with the index.
I am waiting for lucene 2.4 to proceed (delete by query).
Best.
On Wed, Sep
ver shingles?
Best,
On Thu, Sep 4, 2008 at 4:12 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
>
> 4 sep 2008 kl. 14.38 skrev Cam Bazz:
>
>
> Hello,
>> This came up before but - if we were to make a swear word filter, string
>> edit distances are no good. for exampl
at 5:02 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
>
> 4 sep 2008 kl. 15.54 skrev Cam Bazz:
>
> yes, I already have a system for users reporting words. they fall on an
>> operator screen and if operator approves, or if 3 other people marked it
>> as
>> curse, t
hello,
I was reading the performance optimization guides and found:
writer.setRAMBufferSizeMB()
combined with writer.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);
this can be used to flush automatically, so if the RAM buffer size goes over a
certain limit it will flush.
now the question:
hello,
anyone using RAM disks for storage? There is RamSan and there is also Fusion-io,
but they are kind of expensive. Any other alternatives, I wonder?
Best.
> On Thu, 2008-09-04 at 17:58 +0200, Cam Bazz wrote:
> > anyone using ramdisks for storage? there is ramsam and there is also
> fusion
> > io. but they are kinda expensive. any other alternatives I wonder?
>
> We've done some comparisons of RAM (Lucene RAM
Hello,
Lets say we have different document types, and one type of document
only contains field A.
How can I make a query so that I get all the documents that only have field A?
There is a get-all-documents query, but that would return all the
documents whether they contain field A or not.
Is there
[EMAIL PROTECTED]> wrote:
> I usually do:
> cd
> patch -p 0 -i
>
> See also the HowToContribute page on the wiki.
>
>
> On Sep 15, 2008, at 7:38 AM, Cam Bazz wrote:
>
>> Hello,
>>
>> To patch for lucene-1314 what must I do?
>>
>>
Hello,
I have been looking at InstantiatedIndex in the trunk. Does this come
with a searcher? Are adds reflected directly in the index?
Or is it just an experimental thing with only a reader and writer?
Best.
-
To unsubscrib
Hello,
I see that IndexWriter.flush() is deprecated in 2.4. What do we use instead?
Also I used to make a:
try {
nodeWriter = new IndexWriter(nodeDir, true, analyzer, false);
} catch(FileNotFoundException e) {
nodeWriter = new IndexWriter(nodeDir, true, analyzer,
Hello,
What is the difference between flush in <2.4 and commit?
Also, I have been looking over the docs, and they mention commit(long), but
there is no commit(long) method, only commit().
Best.
Hello,
What is the new preferred way of running a search?
I understand Hits will be deprecated. So how do we do it the new way?
With a hit collector?
Best.
Hello,
I would like to take advantage of isDeleted. If I delete a document
from the index, do not commit, and the index searcher is not reinstantiated,
how can I check if a document is marked for deletion? I tried it
both with commit() and without committing; isDeleted(mydeleteddocid)
always returns f
g) is a
> private method that should never have been in the javadocs. Thanks for
> raising this!
>
> Mike
>
> Cam Bazz wrote:
>
>> Hello,
>>
>> What is the difference between flush in <2.4 and commit?
>>
>> Also I have been looking over docs, a
Hello,
Here is what I am trying to do:
dir = FSDirectory.getDirectory("/test");
writer = new IndexWriter(dir, analyzer, true, new
IndexWriter.MaxFieldLength(2));
writer.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);
Document da = new Document();
da.ad
certain criteria.
Best.
On Mon, Sep 15, 2008 at 10:05 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Cam Bazz wrote:
>
>> Hello,
>>
>> I see that IndexWriter.flush() is depreciated in 2.4. What do we use?
>
> Looks like you already found it, but the j
.
On Mon, Sep 15, 2008 at 10:20 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> You'll have to open a new IndexReader after the delete is committed.
>
> An IndexReader (or IndexSearcher) only searches the point-in-time snapshot
> of the index as of when it was ope
well, I did not understand this part.
So there is no way to use the new constructor and specify
autoCommit = false?
Best
On Mon, Sep 15, 2008 at 10:30 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Cam Bazz wrote:
>
>> However the documentation states that autoCom
t;> still in the OS's write cache when it crashed.
>>
>> But the guarantee only holds if the underlying storage system is "honest"
>> about fsync(), ie, it truly flushes all written bytes for that file to disk
>> before returning.
>>
>> Mike
>>
Out of curiosity, and somewhat unrelated to this thread: when can we
expect to see 2.4?
It seems much has changed, so people will want to port their code.
Best.
On Mon, Sep 15, 2008 at 10:56 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Cam Bazz wrote:
>
>
buffered deletes down to docID. Those deletes
> that are against existing segments in the index will be flushed at that
> point to those segments; the deletes that apply only to buffered docs will
> be held in RAM and used by the RAMIndexSearcher that searches IndexWriter's
> buff
n Mon, Sep 15, 2008 at 11:09 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> It will return true if the provided docID was deleted, by term or query or
> docID (due to exception, privately) prior to when you asked IndexWriter to
> give you a "realtime" IndexReader.
>
Hello Karl;
This is very good news. It works.
However, I added a document like
doc.add(new Field("f", "a", Field.Store.YES,
Field.Index.NOT_ANALYZED_NO_NORMS));
and then searched. The score is ~0.3 for the found document. Should
it not be 1.0?
Also, it will find it when searched for "f","b" o
Hello,
What kind of query is best to warm up a searcher? How many searches should I do?
Are we supposed to search for things we know exist, or is it better
to make queries we know will not match?
Best.
-C.B.
Hello,
Could it hurt if I make a
searcher.search(query, Integer.MAX_VALUE) call?
I just need to run a query to get the number of hits in this case,
but I don't know what the maximum number of hits will be.
Also, is topDocs.totalHits the same as topDocs.scoreDocs.length?
Best.
-C.A.
---
Yes, I looked into implementing a custom collector that would return the
number of hits, but I could not.
collect() can only access local variables that are final, and a final
variable cannot be incremented.
Any ideas?
Best.
On Tue, Sep 16, 2008 at 6:05 AM, Daniel Noll <[EMAIL PROTECTED]> wrote:
> Cam B
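The usual workaround for that final-variable restriction is to capture a final reference to a mutable holder (an AtomicInteger, or a one-element int[]) and increment through it. A plain-Java sketch, with a Runnable standing in for the collector callback:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CountingDemo {
    // Simulates what a counting collector does: the local variable
    // captured by the anonymous class must be final, so we wrap the
    // counter in a final AtomicInteger (a final int[1] works too).
    static int countHits(int[] docIds) {
        final AtomicInteger count = new AtomicInteger();
        Runnable collector = new Runnable() { // stand-in for HitCollector.collect()
            public void run() {
                count.incrementAndGet(); // allowed: we mutate the object, not the reference
            }
        };
        for (int id : docIds) {
            collector.run(); // invoked once per matching doc
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countHits(new int[]{1, 5, 9})); // 3
    }
}
```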
In cases where we don't know the possible number of hits, and wanting
to test the new 2.4 way of doing things,
could I use custom hit collectors for everything? Is there any performance
penalty for this?
From what I understand, both TopDocCollector and TopDocs will try to
allocate an array of Integer.MAX_V
Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message
>> From: Cam Bazz <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Monday, September 15, 2008 11:25:39 PM
>> Subject: Re: TopDocs
I noticed this was because I was using a KeywordAnalyzer.
Is it possible to index a document with different analyzers for different fields?
Best.
On Tue, Sep 16, 2008 at 8:33 AM, Cam Bazz <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Lets say I have two documents, both containing field
Hello,
Let's say I have two documents, both containing field F.
Document 0 has the string "a b" as F.
Document 1 has the string "b a" as F.
I am trying to make a PhraseQuery like:
PhraseQuery pq = new PhraseQuery();
pq.add(new Term("F", "a"));
pq.add(new Term("F", "b"));
And how about queries that need a starting position, like hits between
100 and 200?
Could we pass something to the collector that will skip the first 100
hits and then collect the next 100 records?
Best.
On Wed, Sep 17, 2008 at 5:16 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> Doesn't TopDocCollecto
CTED]> wrote:
> Are the terms stopwords?
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message
>> From: Cam Bazz <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Tuesday
fusionio.com has the SSD killer. Not that expensive either; just
twice or three times the price of an SSD.
Best.
On Tue, Sep 16, 2008 at 2:16 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
> Related, I've been considering filesystem based filters on SSD. That ought
> to be rather fast, consume no memory and be as si
One moment:
the top doc collector is based on some sort of queue, I assume. What
kind of queue is that? Does it sort based on score, or on whichever doc
comes first?
best.
On Wed, Sep 17, 2008 at 9:43 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
>
> : Well, it turns out the theoretical maximum f
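The idea behind such a collector can be sketched in plain Java (an illustration of the bounded min-heap technique, not Lucene's actual internal class): keep a size-limited min-heap of scores, so the weakest hit is always the one evicted.

```java
import java.util.PriorityQueue;

public class TopNByScore {
    // Sketch of how a top-docs collector keeps the best N hits: a
    // min-heap ordered by score, so the current weakest hit sits on top
    // and is evicted whenever a better one arrives.
    static double[] topN(double[] scores, int n) {
        PriorityQueue<Double> heap = new PriorityQueue<>(); // min-heap
        for (double s : scores) {
            heap.add(s);
            if (heap.size() > n) {
                heap.poll(); // drop the weakest hit
            }
        }
        double[] top = new double[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll(); // emit in descending score order
        }
        return top;
    }

    public static void main(String[] args) {
        double[] top = topN(new double[]{0.2, 0.9, 0.5, 0.7, 0.1}, 3);
        System.out.println(java.util.Arrays.toString(top)); // [0.9, 0.7, 0.5]
    }
}
```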
Has anyone tried to implement a triple store with lucene?
Best,
-C.B.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
For instance, the one described in:
http://www.w3.org/2001/sw/Europe/events/20031113-storage/positions/rusher.html
On Mon, Sep 29, 2008 at 4:04 PM, Jason Rutherglen
<[EMAIL PROTECTED]> wrote:
> What is that?
>
> On Mon, Sep 29, 2008 at 8:51 AM, Cam Bazz <[EMAIL PROTECTED]> wrote
How can we get on to that list?
Best,
On Mon, Oct 20, 2008 at 1:58 AM, Hasan Diwan <[EMAIL PROTECTED]> wrote:
> 2008/10/19 Mark Miller <[EMAIL PROTECTED]>:
>> You might instead limit your email to those that have agreed to be contacted
>> at http://wiki.apache.org/lucene-java/Support
>
> FWIW, th
Hello,
I am having a problem with PrefixQuery:
I have a field named item_title which is indexed as:
doc.add(new Field("item_title", item_title.trim().toLowerCase(),
Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
and I am forming my query like:
PrefixQuery pq = new PrefixQuery((
Hello;
I would like to use lucene as a graph store. The graph representation is a list of
edges. Consider the code below:
final int commitCount = 16 * 1024;
final int numObj = 1024 * 1024;
Analyzer analyzer = new KeywordAnalyzer();
FSDirectory directory = FSDirectory.g
does bring some terms, etc. into
> memory, and you may have a look at the FieldCache.
>
> -Grant
>
> On Jan 15, 2008, at 7:17 AM, Cam Bazz wrote:
>
> > Hello;
> >
> > I like to use lucene as a graph store. The graph representation is a
> > list of
> >
Hello;
Has IndexWriter.DISABLE_AUTO_FLUSH been deprecated?
I am using lucene core 2.2.0, and although it is in the documentation I
cannot access IndexWriter.DISABLE_AUTO_FLUSH.
Best,
C.B.
Hello,
I have been running some experiments on lucene. To speed up indexing, I
have disabled autocommit,
and I flush the indexwriter every 512 objects. So far I have tried
256, 512, 1024, and 2048, and I have seen a really incredible speed difference
in indexing.
However, if I the time required to
query.
Best.
On Jan 15, 2008 6:22 PM, Otis Gospodnetic <[EMAIL PROTECTED]>
wrote:
> Hi,
>
>
> - Original Message ----
> From: Cam Bazz <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Tuesday, January 15, 2008 8:50:07 AM
> Subject: Re: luc
number of edges (degrees?) between any 2
> nodes?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message
> From: Cam Bazz <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Tuesday, January 15, 2008 11:34:20
Hello,
When storing fields to serve as ids, is it better to use
NumberTools.longToString(id) or just store the id as a plain field?
I have noticed that using NumberTools to store the number as a string
makes range queries easier; however, you end up storing a long string.
Considering millions of id
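The reason the string form helps range queries is that a fixed-width encoding makes lexicographic term order agree with numeric order. A plain-Java sketch of the padding idea (zero-padding here is an illustration; NumberTools itself uses a more compact radix encoding):

```java
public class PaddedIds {
    // Zero-pads a non-negative long to a fixed width so that
    // lexicographic (term) order matches numeric order, which is what
    // makes string range queries on ids behave correctly.
    static String pad(long id) {
        return String.format("%019d", id); // 19 digits covers Long.MAX_VALUE
    }

    public static void main(String[] args) {
        System.out.println(pad(42)); // 0000000000000000042
        // Without padding, "9" > "10" lexicographically; with padding the order is numeric:
        System.out.println(pad(9).compareTo(pad(10)) < 0); // true
    }
}
```

The trade-off the mail describes is real: with millions of ids, every term carries the full fixed width.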
Hello,
I understand that after writing some documents to an index with an IndexWriter,
the IndexSearcher object has to be reinstantiated for it to find the newly
added documents. And this reinstantiation of IndexSearcher is costly,
from what I understand.
I am working on a caching scheme that will allo
Hello,
How do I delete a specific document via an IndexWriter? I understand there
is deleteDocuments(term), which deletes all the documents matching the term.
But what if I want to delete a document identified by more than one specific
term? I can search for the document with a boolean query, and then g
27;d like to
> make this option available someday in IndexWriter, but doing so now
> (when there is no way to get a "reliable" docID) seems too dangerous...
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello,
> >
> > How do I delete a specific document from an i
Hello,
When we delete documents from the index, will it autoflush when the count of
deleted documents reaches a certain value? I am controlling my own flush
operation, and I have disabled autoflush with:
writer.setMaxBufferedDocs(IndexWriter.DISABLE_AUTO_FLUSH);
But I have taken a peek at the IndexWriter
gt; You can also use Solr, which provides "delete by query".
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello Mike;
> >
> > How about deleting by a compount term?
> >
> > for example if I have a document with two fields srcId and dstId
> > and I wa
the source to lucene made makes me think of extensions. Nice
code.
Best,
On Jan 21, 2008 4:47 PM, Michael McCandless <[EMAIL PROTECTED]>
wrote:
>
> Cam Bazz wrote:
>
> > Hello,
> >
> > When we delete documents from index - will it autoflush when count of
>
using a reader, it will acquire the write.lock,
> which will fail if you have another writer open on that index).
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello Michael;
> >
> > how can I construct a chain where both reader and writer at the
> > same state?
> &g
t;
> Do you have a specific use case in mind here? I think we'd like to
> make this option available someday in IndexWriter, but doing so now
> (when there is no way to get a "reliable" docID) seems too dangerous...
>
> Mike
>
> Cam Bazz wrote:
>
> > Hel
Hello,
Could someone show me a concrete example of how to use HitCollector?
I have documents with a field called category. When I run a query, I need to
sort results by category as well as count how many hits there are for a
given category.
I understand:
searcher.search(Query, new HitCollector()
Does anyone have any idea about the error I got while indexing?
Best Regards,
-C.B.
Exception in thread "main" java.io.IOException: background merge hit
exception: _kq:C962870 _kr:C2591 into _ks [optimize]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1749)
at org.apach
<[EMAIL PROTECTED]>
wrote:
>
> That means that one of the merges, which run in the background by
> default with 2.3, hit an unhandled exception.
>
> Did you see another exception logged / printed to stderr before this
> one?
>
> Mike
>
> Cam Bazz wrote:
>
> >
Hello,
How do we do the TermEnum trick? I could not figure it out. Basically, I
have a field called category, and I would like to learn what distinct values
the category field takes (sort of like DISTINCT in SQL).
Best Regards,
-C.B.
5 PM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> Can you show us what you've tried?
>
> Erick
>
> On Jan 25, 2008 10:49 AM, Cam Bazz <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > How about getting which documents have the that term as a bitset?
}
> list.add(term.text());
> } while (theTerms.next());
>
>
> On Jan 25, 2008 10:24 AM, Cam Bazz <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > How do we get the TermEnum trick? I could not figure it out. basically,
>
Hello,
Is IndexSearcher thread-safe? I made a simple HTTP server using Grizzly, as
described in
http://jlorenzen.blogspot.com/2007/06/using-grizzly-to-create-simple-http.html
which submits queries to a single instance of IndexSearcher, and I get some
errors (when I query from more than one thread) suc
Hello,
How can I use a HitCollector together with a Sort object in a query? I looked
at the API, and Sort is only usable with Hits. Is it even possible? Since my
HitCollector returns a BitSet, how do we do the ordering?
Best,
-C.B.
Hello;
If no document is ever deleted or updated in an index, will the document
ids change? Under which circumstances do document ids change, apart
from deletes?
Best Regards,
-C.B.
Hello,
When using a parallel reader with, let's say, two indexes: when we fetch a
document by id,
do we get back the combined fields of the document from the two indexes?
The documentation was not clear on that, apart from the document(int n,
FieldSelector fs) method.
Best,
-C.B.
Hello,
I have read the parallel reader docs. They say it must have the same number of
documents as the other index.
When we are using a writer-searcher combination, how can we bring this
parallel reader into the game?
Simply put, I have some documents, and I would just like to mark them, in an
efficient wa
Hello,
Regarding https://issues.apache.org/jira/browse/LUCENE-1026 , this seems
very interesting. I have read the discussion on the page, but I could not
figure out which set of files is the latest.
Is it the IndexAccessor-1.26.2008.zip file?
I will read through the code, make my own tests, and s
ent, and an app that adds docs will be a bit more responsive, e.g. it
> won't hang as Readers are being reopened.
>
> I also have to bring the AccessProvider classes back. No easy way to use
> your own custom Readers without it...I shouldn't have stripped it out.
>
> - Mark
&
t; a finally block. Batch load multiple docs, but if your just randomly
> adding
> a doc, get the Writer, add it, and then release the Writer in a finally
> block. If you are batch loading a million docs and you want to be able
> to see them
> as they are added: get the writer and add
Hello;
I am trying to make a product matcher based on lucene's ngram-based suggest.
I made some changes so that instead of giving the speller a dictionary I feed
it a List.
For example lets say I have "HP NC4400 EY605EA CORE 2 DUO T5600
1.83GHz/512MB/80GB/12.1''
NOTEBOOK"
and I index it with
e you
> add more terms than what exists, it won't find anything.
>
> On Feb 13, 2008 6:54 PM, Cam Bazz <[EMAIL PROTECTED]> wrote:
>
> > Hello;
> >
> > I am trying to make a product matcher based on lucene's ngram based
> > suggest.
> > I did s
de, then for the first it
> will suggest "abcde" but for the second it won't suggest it because the
> ngrams produced are "abc" and "bce" .. and "bce" does not appear in
> "abcde".
>
> Am I right? If not, can you elaborate more on t
Hello,
I have a tokenized field where I store some info.
Let's say I have "abc 1234" and "abc 678".
When the user searches for "abc1234", how can I find "abc 1234"?
Best.
-C.B.
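One workaround (an assumption on my part, not the only approach): index an extra field holding the value with whitespace stripped, and apply the same normalization to the query string before searching, so "abc 1234" and "abc1234" map to the same term. A plain-Java sketch of the normalization:

```java
public class JoinedTokenMatch {
    // Normalizes a value the same way at index time and query time:
    // strip whitespace and lowercase, so "abc 1234" and "abc1234"
    // produce the same keyword term.
    static String normalize(String s) {
        return s.replaceAll("\\s+", "").toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(normalize("abc 1234").equals(normalize("abc1234"))); // true
        System.out.println(normalize("abc 678")); // abc678
    }
}
```

The extra field costs index space, but keeps the original tokenized field available for normal phrase/term queries.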
Hello Erick,
Has anyone found a way to delete a document with a query? I understand
documents can be deleted via terms, but I need to delete a document matching two
terms; the only way I can identify my document is by looking at two terms,
not one.
best.
On Fri, Mar 14, 2008 at 4:58 PM, Erick Eri
riter.
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello Erick,
> >
> > Has anyone found a way for deleting a document with a query? I
> > understand it
> > can be deleted via terms, but I need to delete a document with two
> > terms,
> > that is the only w
> files in the index to stable storage (assuming your IO system doesn't
> "lie" on fsync).
>
> Mike
>
> On Mar 17, 2008, at 4:33 AM, Cam Bazz wrote:
>
> > Nice. Thanks.
> >
> > will the 2.4 have commit improvements that we previously talked about?
bytes are not
> actually written to stable storage. If you have such a device that
> lies then Lucene 2.4 won't be able to guarantee index consistency on
> crash/power outage.
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello,
> >
> > What do you mean by I
what you mean by "same thread". Maybe you meant "same
> index"?
>
> Yes, if the IndexReader reopens.
>
> IndexWriter.commit() makes the changes visible to readers, and makes
> the changes durable to os/computer crash or power outage.
>
> Mike
>
> Cam
wrote:
>
> It's a hard drive issue. When you call fsync, the OS asks the hard
> drive to sync.
>
> Mike
>
> Cam Bazz wrote:
>
> > Hello,
> >
> > I understand the issue. But I have not understood - is this
> > hardware related
> > issue - i.e a
IndexReader. IndexReader
> still searches only a point in time.
>
> Mike
>
> Cam Bazz wrote:
>
> > yes, I meant the same index.
> >
> > I thought with the new changes - the index reader would see the
> > changes
> > without re-opening.
> > It wo
Hello,
I recently changed my query logic. Before, I was getting a Hits object;
now I am using a BitSet with a HitCollector.
The reason for using a BitSet is document caching, and being able to count how
many hits belong to which categories.
Although my new logic works, I have noticed that now t
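The category-counting part of that scheme can be sketched in plain Java (the per-doc category array here stands in for whatever field-to-docID lookup the index provides, e.g. something loaded once from FieldCache):

```java
import java.util.BitSet;

public class CategoryCounts {
    // Given the BitSet of matching docIDs and a per-doc category array,
    // walk the set bits and tally hits per category.
    static int[] countPerCategory(BitSet hits, int[] docCategory, int numCategories) {
        int[] counts = new int[numCategories];
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            counts[docCategory[doc]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] docCategory = {0, 1, 1, 2, 0};   // category of docs 0..4
        BitSet hits = new BitSet();
        hits.set(0); hits.set(2); hits.set(3); // docs 0, 2, 3 matched
        int[] counts = countPerCategory(hits, docCategory, 3);
        System.out.println(java.util.Arrays.toString(counts)); // [1, 1, 1]
    }
}
```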
Hello,
I am querying an index using custom boost factors for each field. A query
usually looks like:
fieldA:"term1"^0.2 fieldB:"term2"^4
When I get scores from the HitCollector, they are not necessarily between 0 and
1.
How can I normalize these scores?
Best.
-C.A.
Hello All,
Any suggestions for extracting text from PDF? I have tried PDFBox, which
works nicely; however, if the PDF is structured, it won't give good results.
For example consider the pdf:
P1 Lorem Ipsum Bla bla P3 Lorem2 Ipsum2
P1 bla bla
P2 bla bla bla
P
Hello Bill,
The problem I am having is that some of them have multiple columns and
multiple word boxes. Does the xpdf patch extract different columns and word boxes?
Best,
-C.B.
On Wed, May 14, 2008 at 6:35 PM, Bill Janssen <[EMAIL PROTECTED]> wrote:
> > > the unix program pdf2text can convert keep
Hello,
A little off topic, but how did you obtain the PageRank for each page? Did you
calculate it, or did you obtain it some other way while fetching a
specific site?
Best.
On Thu, May 29, 2008 at 3:28 PM, 过佳 <[EMAIL PROTECTED]> wrote:
> thanks Glen , we have tried it , but the bottleneck
Hello,
When you look at the fields of a document with Luke, there is a norm column.
I have not been able to figure out what that is.
The reason I am asking is that I am trying to build a uniqueness model. My
Index is structured as follows:
classID, textID, K, V
classID is a given class. textID
yes, figured it out. thanks.
how about checking for uniqueness?
Best.
On Wed, Jun 11, 2008 at 5:39 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
>
> 11 jun 2008 kl. 16.04 skrev Cam Bazz:
>
>>
>> When you look at the fields of a document with Luke, there is a norm
&g
Hello,
Imagine I have the following documents having keys
A
A>B
A>B>C
A>B>D
A>B>C>D
Now imagine a query with the keyword analyzer and a wildcard: A>B>*
which will bring me A>B>C, A>B>D, and A>B>C>D,
but I just want to get A>B>C and A>B>D.
So can I make a query like A>B>* where the match does not have the > cha
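The desired matching rule can be stated in plain Java (a sketch of the semantics, assuming the keys are indexed as untokenized strings): keep only keys that start with the prefix and whose remainder contains no further ">" separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OneLevelMatch {
    // Filters keys to direct children of a prefix: the key must start
    // with the prefix, and the remainder must contain no further '>'.
    static List<String> directChildren(List<String> keys, String prefix) {
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            if (key.startsWith(prefix) && !key.substring(prefix.length()).contains(">")) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("A", "A>B", "A>B>C", "A>B>D", "A>B>C>D");
        System.out.println(directChildren(keys, "A>B>")); // [A>B>C, A>B>D]
    }
}
```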
ote:
> I assume you want all of your queries to function in this way?
>
> If so, you could just translate the * character into a ? at search time,
> which should give you the functionality you are asking for.
>
> Unless I'm missing something.
>
> Matt
>
>
Hello,
I need to be able to select a random word out of all the words in my index.
How can I do this through termDocs()?
Also, I need to get a list of unique words as well. Is there a way to ask
lucene for this?
Best Regards,
-C.B.
Hello,
Is it possible to make a boolean query where a word matches fieldA or
fieldB?
In other words, I would like to search for a word in two fields; if the word
appears in fieldA or fieldB, then it is a hit.
Best,
-C.B.
hello,
Wasn't there a lucene delete-by-query feature coming up? I remember
something like that, but I could not find any references.
best regards,
-c.b.