FYI, I have kept this email from 2011 about poor performance of parsed
words in headline generation. If someone wants to research it, please
do so:
http://www.postgresql.org/message-id/1314117620.3700.12.camel@dragflick
On Wed, Aug 15, 2012 at 11:09:18PM +0530, Sushant Sinha wrote:
I will do the profiling and present the results.
Sushant, do you have any profiling results on this issue from August?
---
On Wed, 2012-08-15 at 12:45
Is this a TODO?
---
On Tue, Aug 23, 2011 at 10:31:42PM -0400, Tom Lane wrote:
Sushant Sinha sushant...@gmail.com writes:
Doesn't this force the headline to be taken from the first N words of
the document, independent
This might indicate that the hlCover() item is resolved.
---
On Wed, Aug 24, 2011 at 10:08:11AM +0530, Sushant Sinha wrote:
Actually, this code seems probably flat-out wrong: won't every
successful call of
Bruce Momjian br...@momjian.us writes:
Is this a TODO?
AFAIR nothing's been done about the speed issue, so yes. I didn't
like the idea of creating a user-visible knob when the speed issue
might be fixable with internal algorithm improvements, but we never
followed up on this in either fashion.
I will do the profiling and present the results.
On Wed, 2012-08-15 at 12:45 -0400, Tom Lane wrote:
Bruce Momjian br...@momjian.us writes:
Is this a TODO?
AFAIR nothing's been done about the speed issue, so yes. I didn't
like the idea of creating a user-visible knob when the speed issue
Given a document and a query, the goal of headline generation is to
produce text excerpts in which the query appears. Currently, headline
generation in Postgres proceeds as follows:
1. Tokenize the documents and obtain the lexemes
2. Decide on lexemes that should be the part of the
Sushant Sinha sushant...@gmail.com writes:
Given a document and a query, the goal of headline generation is to
produce text excerpts in which the query appears.
... right ...
Here is a simple patch that limits the number of words during the
tokenization phase and puts an upper-bound on the
Excerpts from Tom Lane's message of Tue Aug 23 15:59:18 -0300 2011:
Sushant Sinha sushant...@gmail.com writes:
Given a document and a query, the goal of headline generation is to
produce text excerpts in which the query appears.
... right ...
Here is a simple patch that limits the
Here is a simple patch that limits the number of words during the
tokenization phase and puts an upper-bound on the headline generation.
Doesn't this force the headline to be taken from the first N words of
the document, independent of where the match was? That seems rather
unworkable,
Sushant Sinha sushant...@gmail.com writes:
Doesn't this force the headline to be taken from the first N words of
the document, independent of where the match was? That seems rather
unworkable, or at least unhelpful.
In the headline generation function, we don't have any index or knowledge of
Actually, this code seems probably flat-out wrong: won't every
successful call of hlCover() on a given document return exactly the same
q value (end position), namely the last token occurrence in the
document? How is that helpful?
regards, tom lane
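To make the earlier objection about limiting tokenization to the first N words concrete, here is a minimal Python sketch. It is not the PostgreSQL implementation and does not model hlCover(); the names tokenize, naive_headline, window, and max_tokens are hypothetical, and the whitespace tokenizer is only a stand-in for the real text search parser. It shows that capping the token count before looking for a match can yield an excerpt that does not contain the query term at all.

```python
def tokenize(document):
    """Naive whitespace tokenizer; a stand-in for a real text search parser."""
    return document.lower().split()


def naive_headline(document, query_terms, window=3, max_tokens=None):
    """Return a short excerpt around the first query match.

    max_tokens is a hypothetical cap applied during tokenization, loosely
    mirroring the idea of limiting the number of words that get parsed.
    """
    tokens = tokenize(document)
    if max_tokens is not None:
        tokens = tokens[:max_tokens]  # everything past the cap is never examined
    terms = {t.lower() for t in query_terms}
    for i, tok in enumerate(tokens):
        if tok in terms:
            start = max(0, i - window)
            return " ".join(tokens[start:i + window + 1])
    # No match among the examined tokens: fall back to the leading words.
    return " ".join(tokens[:2 * window + 1])


doc = "alpha beta gamma delta epsilon zeta eta theta needle iota kappa"
full = naive_headline(doc, ["needle"])
capped = naive_headline(doc, ["needle"], max_tokens=5)
print(full)    # excerpt built around the match
print(capped)  # excerpt taken from the start; the match lies beyond the cap
```

In this toy setup the capped call falls back to the first words of the document because the only occurrence of the query term sits past the cap, which is the behaviour the objection describes.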
There is a line