This has nothing to do with Lucene, but as I have written something very
similar I'm taking the bait. You're best off using XPath or a similar XML/HTML
query language to parse the product specs, prices or whatever you're after.
Each webshop you're indexing will have its own set of query expressions fo
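
A minimal sketch of the per-shop XPath idea, using only the JDK's javax.xml.xpath. The class name ShopExtractor, the field names and the expressions are hypothetical, and the input is assumed to be well-formed XML/XHTML (real-world HTML would usually need to be tidied first):

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: one map of XPath expressions per webshop.
class ShopExtractor {

    // Evaluates each named expression against the page and returns
    // field name -> string value of the first matching node.
    static Map<String, String> extract(String xml, Map<String, String> expressions) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : expressions.entrySet()) {
            try {
                // A fresh InputSource per expression, since the reader is consumed.
                InputSource source = new InputSource(new StringReader(xml));
                result.put(e.getKey(), xpath.evaluate(e.getValue(), source));
            } catch (XPathExpressionException ex) {
                throw new IllegalArgumentException("Bad expression for " + e.getKey(), ex);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed page layout for one imaginary shop.
        String page = "<html><body><h1>Widget</h1>"
                + "<div class=\"price\">19.90</div></body></html>";
        Map<String, String> shopA = new LinkedHashMap<>();
        shopA.put("name", "//h1");
        shopA.put("price", "//div[@class='price']");
        System.out.println(extract(page, shopA));
    }
}
```

Keeping the expressions in a map per shop means a layout change at one shop only touches that shop's entries.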
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/Similarity.html
On Sat, Jun 28, 2008 at 2:16 AM, Maha Khairy <[EMAIL PROTECTED]>
wrote:
>
>
> I wanted to know how the ranking works in Lucene and if it is only according
> to the frequency or there is any oth
);
> directory.makeLock(IndexWriter.WRITE_LOCK_NAME).isLocked()
>
> or
> Directory directory = FSDirectory.getDirectory(indexDir);
> IndexReader.isLocked(directory)
>
> many thanks,
> David
>
>
> Fredrik Andersson-2 wrote:
> >
> > What you suggested is
What you suggested is generally the most easygoing way to deal with
it, i.e. having a separate index per writer and one serial merging
process. I have
dabbled with disabling (file system) locks and synchronizing the writing
processes by different means, but it's failure-prone unless you're very
famil
This is functionality you would build upon Lucene. I.e., when a document is
dropped to the indexing module, you check the category and append the
document to an appropriate index. You can then search multiple indices with
one searcher or with multiple searchers. Also, Lucene 1.4 is kinda old...
wel
Hi gang!
If you do a multiterm query to Lucene, say "foo bar zoo", it gives you a
heap of documents (Hits) as a result and all is well. If you want some
tracking abilities to this query, for instance you want to know that
document X was included in the Hits because "foo" and "bar" matched but
"zo
If you want to create an index, you have to supply true as the last
constructor argument to IndexWriter. The lock files use some kind of hash
for their IDs and might very well persist even if you delete the directory.
So, delete the new directory (if it ever was created), delete any lock files,
ch
Hossman, thank you! Exactly what I was looking for. And I know the
application of "locks", it's just a little peculiar situation right now
which requires this... "fix" : )
Fredrik
On 8/31/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: Even if it's very briefly whilst opening the index - a w
Project Management Committee
On 8/31/06, Leimbach, Johannes <[EMAIL PROTECTED]> wrote:
Please excuse my stupid question - but what is a PMC?
-Original Message-
From: Fredrik Andersson [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 31, 2006 09:56
To: g
Congrats, well deserved!
On 8/31/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
Wow, from the contributions I've seen, I thought he was already a member.
Congratulations, Yonik!
Peter
On 8/30/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> The Lucene PMC has voted to add Yonik Seeley to its ran
ased and the IndexReader ha
no further need to lock the index unless you attempt a delete.
: Date: Wed, 30 Aug 2006 17:31:32 +0200
: From: Fredrik Andersson <[EMAIL PROTECTED]>
: Reply-To: general@lucene.apache.org
: To: general@lucene.apache.org
: Subject: Forcing an IndexReader to read-only
Hi guys!
I don't know if I've missed some crucial feature here, but how d'you
actually force an IndexSearcher (and hence, the underlying IndexReader) to
go read-only? The default behaviour now seems to be that the first one to
acquire a lock automatically gets a read/write-lock, instead of leavin
Hey guys.
4 GB of RAM for an index of 2 million documents should really not be a
problem. You should consider separating the index from the actual content
(i.e., only save the index data in your index, not the HTML), if you have the
possibility to do that. I am not very comfortable with the very c
Yeah, I meant field matches of course. Well ok, I'll check out the Keyword
Analyzer and give it a go, thanks!
On 5/30/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: I'd like to know if there's a way to force a query to return only exact
: hits, not partials/subsets. For instance:
: A query "fo
Hey guys!
I'd like to know if there's a way to force a query to return only exact
hits, not partials/subsets. For instance:
A query "foo bar" will match on a field with "foo bar zoo". This I would
like to avoid as I'm in need of removing duplicates on certain fields. Two
options considered so far,
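
To make the difference concrete, here is a plain-Java sketch (no Lucene involved) of subset matching versus exact matching. The class and method names are hypothetical, and whitespace tokenizing with lower-casing is an assumption, not what any particular analyzer does:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of "partial" versus "exact" field matches.
class ExactFieldMatch {

    // Assumed tokenization: lower-case, split on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Subset semantics: every query term occurs somewhere in the field.
    static boolean containsAllTerms(String query, String field) {
        return new HashSet<>(tokens(field)).containsAll(tokens(query));
    }

    // Exact semantics: same terms, same order, nothing extra in the field.
    static boolean matchesExactly(String query, String field) {
        return tokens(query).equals(tokens(field));
    }

    public static void main(String[] args) {
        System.out.println(containsAllTerms("foo bar", "foo bar zoo"));
        System.out.println(matchesExactly("foo bar", "foo bar zoo"));
    }
}
```

For duplicate removal only the second check is useful, which is the same effect as treating the whole field value as a single token.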
Ok, I figured you had some setup like that.
Personally, I would prefer one large index. The overhead associated with
opening/closing/managing thousands of searchers/modifiers is much bigger
than to incorporate the personal restriction in the query. Also, you risk
running out of filepointers, depe
If users should only be able to search their own documents, it would
probably make sense to keep their respective index locally. Besides greater
query speed, it would also simplify things when updating/appending the
index. So, that would mean one index, one IndexModifier and one
IndexSearc
Hi guys.
Short question: Can a Lucene index (v1.9) be moved from a 32-bit Linux
platform to a 64-bit Linux platform without breaking it?
Thanks,
Fredrik
Hi James,
I can't speak for anyone else, but my experience is that the general
approach is to first select a subset based on the angle between the query
vector and the document vector, in their non-reduced forms (this is a normal
search-for-keyword, what Lucene does by default, in vector notation)
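
The angle between a query vector and a document vector can be sketched in plain Java as below. The class name VectorAngle is hypothetical, and raw term counts are used as weights purely for illustration; this is not Lucene's actual scoring formula:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cosine of the angle between two term-count vectors.
class VectorAngle {

    // Builds a sparse vector: term -> number of occurrences.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : text.trim().toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of the vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Dot product over the shared terms, divided by both lengths.
    // Returns 0.0 when either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0 || nb == 0) ? 0.0 : dot / (na * nb);
    }

    public static void main(String[] args) {
        Map<String, Integer> query = termFrequencies("foo bar");
        Map<String, Integer> doc = termFrequencies("foo bar zoo foo");
        System.out.println(cosine(query, doc));
    }
}
```

A cosine near 1 means a small angle (similar term distribution), and 0 means the query and the document share no terms.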
[EMAIL PROTECTED]
> Sent: Tuesday, January 24, 2006 4:04 PM
> To: general@lucene.apache.org
> Subject: RE: updating Lucene Index
>
> even if you use IndexModifier class,
> you should delete then addDoc the document to be updated.
>
> Thanks,
>
> Koji
>
> > -Orig
Hi,
Use the IndexModifier class?
On 1/24/06, Kodumuri, Madhavi <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> My Lucene Indexer indexes from scratch with no problem. But I would like
> to update the index database next time I run Indexer rather than
> deleting the database and creating index from scratch
Hey Gang. Problem regarding the termDocs(Term) function, help most
appreciated.
// Create stored, indexed, non-tokenized field
Field field = new Field("someId", someInteger+"", true, true, false);
doc.add(field);
This field looks fine in Luke and can be read properly, however, when I try
to get a
Problem solved, it was a problem located elsewhere in the code not related
to Lucene. Sorry!
Fredrik
On 9/29/05, Fredrik Andersson <[EMAIL PROTECTED]> wrote:
>
> Hey Gang!
>
> I'm having some problems when modifying an existing index, adding a binary
> field
Hey Gang!
I'm having some problems when modifying an existing index, adding a binary
field to each document. Or more specifically, I have a problem reading back
that field. I'm using the IndexModifier from the trunk, and I am positive
that the binary field gets written down, since the field name s
> You can encode (e.g. base64) the binary data to get a String
> and store the String.
>
> Koji
>
> > -Original Message-
> > From: Fredrik Andersson [mailto:[EMAIL PROTECTED]
> > Sent: Monday, September 26, 2005 6:31 PM
> > To: general@lucene.apache.org
>
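
The base64 suggestion quoted above can be sketched as follows. The class name BinaryFieldCodec is hypothetical, and java.util.Base64 assumes a modern JDK (Java 8 or later), which was not available when this thread was written; the resulting String would then be stored in an ordinary stored, unindexed field:

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: turn binary data into a String that fits a text field.
class BinaryFieldCodec {

    // Encode raw bytes as base64 text before storing.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode the stored text back into the original bytes.
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, (byte) 255};
        String stored = encode(original);
        System.out.println(stored);
        System.out.println(Arrays.equals(original, decode(stored)));
    }
}
```

The price is size: base64 text is roughly a third larger than the raw bytes.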
Hello Gang!
Is there any trick, or undocumented way, to store binary (unindexed,
untokenized) data in a Lucene Field? All the Field constructors just deal
with Strings. I'm currently using another database to store binary data, but
it would be very neat, and more efficient, to store it directly in
Hi folks.
I read a transcript from last month's digest of this list, in a post by
Rajesh Munavalli, that Lucene uses a VSM retrieval method. In my previous
work with VSM, it has included matching a query vector against the documents
in the term-document space. I have dissected and customized a l