Hey Li,
Thanks, great answer. It touched on exactly the points I was interested in.
One last question: once you tweaked it to work in a 'top K' way, what was
the performance impact like?
I've written similar components in the past that iterate over the top
result set docs (on the order of 400-600 top results).
On Tue, Sep 28, 2010 at 8:14 PM, Li Li fancye...@gmail.com wrote:
I think the current implementation is slow because it collapses across
all the hit docs. In our environment, a query takes more than 1s with
collapse and only 200ms-300ms without it. So we
modified it as follows: when a user needs
Hey guys,
Any word on this? Has anyone done any benchmarking or used this in a
production-like environment?
We are considering using this feature on a large scale for deduplication
and were wondering if anyone has some numbers before I go ahead and start
my own series of tests...
Thanks,
Chak
I think the current implementation is slow because it collapses across
all the hit docs. In our environment, a query takes more than 1s with
collapse and only 200ms-300ms without it. So we modified it as follows:
when a user needs the top 100 docs, we collect the top 200 docs and
collapse only within those 200.
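
For readers who want to see the idea concretely, here is a minimal sketch of the "collect the top 2K, collapse within those" approach described above. It is not the actual patch: the function name `collapse_top_k`, the `overfetch` parameter, and the `(doc_id, score, collapse_key)` tuple layout are all assumptions made for illustration, and it is plain Python rather than Solr code.

```python
import heapq
from typing import Hashable, Iterable, List, Tuple

# Hypothetical hit layout: (doc_id, score, collapse_key).
Hit = Tuple[str, float, Hashable]


def collapse_top_k(hits: Iterable[Hit], k: int, overfetch: int = 2) -> List[Hit]:
    """Return up to k hits, at most one per collapse key, best score first."""
    # Step 1: keep only the best k * overfetch hits by score
    # (e.g. 200 candidates when the user asks for the top 100).
    candidates = heapq.nlargest(k * overfetch, hits, key=lambda h: h[1])

    # Step 2: collapse within that window only. nlargest returns hits
    # sorted best-first, so the first hit seen for a key is its best one.
    seen = set()
    collapsed: List[Hit] = []
    for hit in candidates:
        key = hit[2]
        if key in seen:
            continue  # a better-scoring doc with this key was already kept
        seen.add(key)
        collapsed.append(hit)
        if len(collapsed) == k:
            break
    return collapsed
```

The trade-off in this sketch is that the cost depends on the window size rather than on the total number of hits, but if the window contains many duplicates it can return fewer than k docs. A larger `overfetch` reduces that risk at the price of more work per query.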