Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to
best map the forum document model (topics and their messages) to the Lucene
document model.
Usually, some forum member creates a new topic with its first message text,
then other members add reply messages to that topic. Me
On Saturday 13 January 2007 10:49, Melange wrote:
> Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to
> best map the forum document model (topics and their messages) to the Lucene
> document model.
>
> Usually, some forum member creates a new topic with its first message te
Nicolas Lalevée-2 wrote:
>
> On Saturday 13 January 2007 10:49, Melange wrote:
>> Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to
>> best map the forum document model (topics and their messages) to the Lucene
>> document model.
>>
>> Usually, some forum member crea
Hi there,
I'm seeing some strange behaviour with the highlighter, and I'm wondering
whether it is a bug or whether I should take a different approach.
I want to highlight the search terms that were used to execute a query. If
the search terms end in a closing parenthesis or closing square bracket
(')' or ']'), the
Hi!
In a project where we want to use Lucene, we are running into
performance problems when building filter sets.
Let me give you a quick overview of what we need to do:
We are indexing information about users (the index holds between
2 and 10 million documents). Each of th
you say <<>>
Before going off into modifying things, could you expand a bit on how you
query to build up the filter? Perhaps providing a code snippet?
Just to be sure we're talking about the same thing, when you say filter, are
you talking about Lucene filters? I'm assuming you are, in which cas
On 13 Jan 2007, at 19:14, Kay Roepke wrote:
All of the users (documents we index) are "connected" to certain other users,
in a network fashion. We must be able to restrict the query (or filter it after
searching the complete index) to certain "levels of connectedness", i.e. you
can search with
Which version are you using? I believe that this is a bug that was fixed
last August...but that the fix is only in the 2.1 Highlighter version.
Try grabbing the latest highlighter code from the trunk.
- Mark
heikki doeleman wrote:
Hi there,
I'm having some strange behaviour using the highlig
Hi Erick!
On 13. Jan 2007, at 19:54 , Erick Erickson wrote:
Before going off into modifying things, could you expand a bit on how you
query to build up the filter? Perhaps providing a code snippet?
We are passing in our unique ids from our database which we have to translate
to Lucene doc
Hi Karl!
On 13. Jan 2007, at 20:12 , karl wettin wrote:
On 13 Jan 2007, at 19:14, Kay Roepke wrote:
All of the users (documents we index) are "connected" to certain other users,
in a network fashion. We must be able to restrict the query (or filter it after
searching the complete index) to ce
I can handle situations where this can take long once, since I'm really asking
something that Lucene isn't designed for, but the culprit is that I can't really
cache the resulting bitset. I can cache it on one of the Lucene servers, but
can't share it among the rest of the servers (we will e
On 14. Jan 2007, at 2:40 , Mark Miller wrote:
First, have you looked at SwarmCache? Cluster aware caching for
java...
No, I haven't come across that one. I'll take a look, thanks!
As a matter of fact, we do have a network-wide caching mechanism, so
that's what we use.
Second...does it ma
Sorry Kay, I jumped in midstream...I should have read your first post
more thoroughly. By the way, many of the experts rarely comment much on
the weekend so you will probably get some good answers come Monday (lots
of replies often attract their attention).
I do have one more whack though:
I
On 14. Jan 2007, at 3:20 , Mark Miller wrote:
Sorry Kay, I jumped in midstream...I should have read your first
post more thoroughly.
No problem, it was a bit lengthy, anyway...sorry about that. I just
tried to give enough information so that people don't get confused
too much.
By the w
A couple of things...
1> You're probably already aware that the IndexReader doesn't reflect
updates until it is re-opened, so any filters you cached would be valid
until you re-opened the reader. CachingWrapperFilter will store the Lucene
filters for you. But this probably isn't germane to your p
: So what we want to do is to cache the filters, once created. Since
: the document ids would not be the same across the Lucene
: servers we'll be using, we can only cache the filters per server,
: which is a big performance loss. We also cannot reasonably control
: on which Lucene server the requ
: 4> It's playing with fire, but you say "in essence, we want persistent
: Lucene document numbers". I believe they *are* persistent until and unless
: you optimize *after* deleting documents. So you control when they change
: (you'll get more information by searching the mail archive, but wha
>
> : - To keep the document ids from changing we could prevent segment
> : merging - I'm not concerned with optimizing indices, this can be done
> : offline,
> : and I'm prepared to build the caches after that. What would be the
> : ballpark figure for query time degradation, approximately?
> :