On 4/24/2009 3:16 AM, Doron Cohen wrote:
> On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard wrote:
>
>> On 4/23/2009 2:08 PM, Marcus Herou wrote:
>>> But perhaps one could use a FieldCache somehow ?
>> Some code snippets that may help. I add the PageRank value as a field of
>> the documents I inde
On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard wrote:
> On 4/23/2009 2:08 PM, Marcus Herou wrote:
> > But perhaps one could use a FieldCache somehow ?
>
> Some code snippets that may help. I add the PageRank value as a field of
> the documents I index with Lucene like this:
>
> Document docum
Thank you Steve, now it's implementation time...
I'll be back :)
/M
On Fri, Apr 24, 2009 at 3:13 AM, Steven Bethard wrote:
> On 4/23/2009 2:42 PM, Marcus Herou wrote:
> > So what you basically are saying is that:
> >
> > 1. You have an index which contains data that is more or less static (no
>
On 4/23/2009 2:42 PM, Marcus Herou wrote:
> So what you basically are saying is that:
>
> 1. You have an index which contains data that is more or less static (no
> updates) or you have another update interval than the PR interval.
> 2. A PR index which is rebuilt (from scratch ?) every X days/wee
Never mind about how to open the ParallelReader stuff (I am an idiot): RTFM:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/ParallelReader.html
But the rest is of course interesting :)
/M
On Thu, Apr 23, 2009 at 11:42 PM, Marcus Herou wrote:
> Thanks! (I started my reply and then
Thanks! (I started my reply and then saw that you added code snippets)
I think we are narrowing down the problem to the updating issue of the
PageRank score.
So what you basically are saying is that:
1. You have an index which contains data that is more or less static (no
updates) or you have an
On 4/23/2009 2:08 PM, Marcus Herou wrote:
> But perhaps one could use a FieldCache somehow ?
Some code snippets that may help. I add the PageRank value as a field of
the documents I index with Lucene like this:
Document document = new Document();
double pageRank = this.pageRanks.getCount(
On 4/23/2009 1:58 PM, Doron Cohen wrote:
>> I think we are doing similar things, at least I am trying to implement
>> document boosting with pagerank. Having issues with how to apply the scoring of
>> specific docs without actually reindexing them. I feel something should be
>> done
>> at query time w
But perhaps one could use a FieldCache somehow ?
/M
On Thu, Apr 23, 2009 at 11:07 PM, Marcus Herou wrote:
> Yes I have considered it for 30 minutes :)
>
> How does one apply that in the real world?
>
> If the only thing I get access to is the actual docId would it not be
> really expensive to get
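The concern about paying for a Document fetch per hit can be avoided if the PageRank values sit in a plain array indexed by docId, which is, as far as I understand it, the kind of per-document array a FieldCache hands back. A minimal sketch of that idea in plain Java follows. It does not call Lucene at all: the class name `PageRankBooster`, the `exponent` parameter, and the multiplicative combination are assumptions made for illustration, not anything confirmed in this thread.

```java
// Hypothetical helper: boosts a text score by a PageRank value looked up by docId.
// Nothing here is Lucene API; the array stands in for a per-document value cache.
final class PageRankBooster {
    private final float[] pageRankByDocId; // index = docId, value = PageRank
    private final double exponent;         // assumed boost exponent for PageRank

    PageRankBooster(float[] pageRankByDocId, double exponent) {
        // Defensive copy so later changes by the caller do not affect scoring.
        this.pageRankByDocId = pageRankByDocId.clone();
        this.exponent = exponent;
    }

    /** Returns textScore * pageRank(docId)^exponent using a constant-time array lookup. */
    double boostedScore(int docId, double textScore) {
        return textScore * Math.pow(pageRankByDocId[docId], exponent);
    }

    public static void main(String[] args) {
        PageRankBooster booster = new PageRankBooster(new float[] {1.0f, 4.0f}, 0.5);
        System.out.println(booster.boostedScore(0, 3.0));
        System.out.println(booster.boostedScore(1, 3.0));
    }
}
```

When the PageRank values are recomputed, a new `PageRankBooster` can be built from the fresh array and swapped in, so the text index itself is left alone. Whether docIds stay stable between rebuilds is a separate question that this sketch does not address.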
Yes I have considered it for 30 minutes :)
How does one apply that in the real world?
If the only thing I get access to is the actual docId would it not be really
expensive to get the Document itself from the index and later use some field
in it as external lookup in some optimized structure for t
>
> I think we are doing similar things, at least I am trying to implement
> document boosting with pagerank. Having issues with how to apply the scoring of
> specific docs without actually reindexing them. I feel something should be
> done
> at query time which looks at external data but do not know h
Hi.
I think we are doing similar things, at least I am trying to implement
document boosting with pagerank. I am having trouble with how to apply the scoring of
specific docs without actually reindexing them. I feel something should be done
at query time which looks at external data but do not know howto impl
On 4/10/2009 5:13 PM, Steven Bethard wrote:
> On 4/10/2009 12:56 PM, Steven Bethard wrote:
>> I need to have a scoring model of the form:
>>
>> s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
>>
>> where "d" is a document, "q" is a query, "sK" is a scoring function, and
>> "aK" is the exponential
On 4/10/2009 12:56 PM, Steven Bethard wrote:
> I need to have a scoring model of the form:
>
> s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
>
> where "d" is a document, "q" is a query, "sK" is a scoring function, and
> "aK" is the exponential boost factor for that scoring function. As a
> si
On 4/10/2009 1:08 PM, Jack Stahl wrote:
> Perhaps you'd find it easier to implement the equivalent:
>
> log(s1(d, q))*a1 + ... + log(sN(d, q))*aN
Yes, that's fine too - that's actually what I'd be optimizing anyway.
But how would I do that? If I took the query boost route, how do I get a
TermQue
Perhaps you'd find it easier to implement the equivalent:
log(s1(d, q))*a1 + ... + log(sN(d, q))*aN
On Fri, Apr 10, 2009 at 12:56 PM, Steven Bethard wrote:
> I need to have a scoring model of the form:
>
> s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
>
> where "d" is a document, "q" is a que
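To make the equivalence between the product form and the log form concrete, here is a small plain-Java sketch. It only checks the arithmetic and says nothing about how to wire this into Lucene's Query or Similarity classes. The class name `ScoreCombiner` and both method names are made up for this example, and it assumes every sK(d, q) is strictly positive so that the logarithm is defined.

```java
// Hypothetical helper comparing the two scoring forms discussed in the thread.
final class ScoreCombiner {
    private ScoreCombiner() {}

    /** Product form: s1^a1 * s2^a2 * ... * sN^aN. */
    static double product(double[] scores, double[] exponents) {
        requireSameLength(scores, exponents);
        double result = 1.0;
        for (int k = 0; k < scores.length; k++) {
            result *= Math.pow(scores[k], exponents[k]);
        }
        return result;
    }

    /** Log form: log(s1)*a1 + ... + log(sN)*aN. Requires every score > 0. */
    static double logSum(double[] scores, double[] exponents) {
        requireSameLength(scores, exponents);
        double result = 0.0;
        for (int k = 0; k < scores.length; k++) {
            if (scores[k] <= 0.0) {
                throw new IllegalArgumentException("scores must be strictly positive");
            }
            result += Math.log(scores[k]) * exponents[k];
        }
        return result;
    }

    private static void requireSameLength(double[] scores, double[] exponents) {
        if (scores.length != exponents.length) {
            throw new IllegalArgumentException("scores and exponents must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 4.0};
        double[] exponents = {3.0, 0.5};
        // exp(logSum) should match the product up to floating-point error.
        System.out.println(product(scores, exponents));
        System.out.println(Math.exp(logSum(scores, exponents)));
    }
}
```

Because the logarithm is monotonically increasing, sorting documents by the log form gives the same order as sorting by the product form, so either can be used for ranking.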