Re: [Ferret-talk] Category Number Results returned

BlueJay Mon, 10 Jul 2006 13:39:08 -0700

David Balmain wrote:

David

Thanks for your continued help and assistance.

I don't have code at this stage because I started writing it one way and 
realised that the way I was writing it through counts in Ruby would not 
work because of pagination.

A little more background is in order. The user will be presented with a 
pull down menu with 5 selections in a main category. Doing 6 queries 
(one main query and 5 count queries) in this instance is not a problem. 
The problem arises when they select one of these categories.

They will then be presented with up to 5 other category structures. One 
would be new or old, another would be type (up to 5 nodes), another 
would be, for example, book type (such as fiction, non-fiction, 
autobiography etc., up to 20 categories), and another could have up to 40 
categories. The user is free to select any of these category nodes 
because they may be interested in, say, old books and fiction. I will 
therefore have to populate all of the nodes with the number of documents 
in each node. This could leave me spawning 60-odd queries to count 
the number of documents in each node. Subsequent selections of nodes 
would refine the result set down further.

What I really would like to do is 2 or 3 queries. One which does the 
normal search over the document set (collection) and the second to 
populate each node in the classification structure with the number of 
documents that match each node.

It is pretty easy in 2 queries to tell if there are any documents in 
each node but doing a count over all the nodes is more tricky. I was 
originally going to have another table which had a row for each node 
with the name of the node (and structure) in one field and the 
document_ids in another field. For example, [Fishing, "doc1 doc2 doc3 
doc4"], [Fishing/Fiction, "doc2 doc3"], [Fishing/Non Fiction, "doc1"] 
etc. I would then get a result set that provided all the categories that 
had hits against a given query. However, it does not provide the number 
of documents against each node. So I could not populate the pull down 
categories with Fishing (2), Fiction (1), Non Fiction (1) etc.
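
To make that concrete, here is a rough sketch in plain Ruby of how the counts could still be derived from such a node table, assuming the full (unpaginated) list of hit ids from the main search is available in memory. The names `node_docs` and `hits` and the data in them are made up for illustration; none of this is a Ferret API.

```ruby
require 'set'

# Hypothetical node table: category path => ids of the documents filed under it.
node_docs = {
  'Fishing'             => %w[doc1 doc2 doc3 doc4],
  'Fishing/Fiction'     => %w[doc2 doc3],
  'Fishing/Non Fiction' => %w[doc1]
}

# Hypothetical ids matched by the main full-text search (all pages, not just one).
hits = Set.new(%w[doc1 doc3])

# Count, for every node, how many of its documents are also search hits.
node_counts = node_docs.each_with_object({}) do |(node, ids), counts|
  counts[node] = ids.count { |id| hits.include?(id) }
end

node_counts.each { |node, n| puts "#{node} (#{n})" }
```

This only works if the hit list is complete, so it would have to be built before pagination is applied, and it costs one pass over every node's id list.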

Therefore, what I really need is a function that will return the number 
of documents in each node of a given classification structure. An 
addition to the Num_Docs capability already available perhaps.

I could easily produce a results set that would be like this....

Fishing doc1
Fishing doc2
Fishing/Fiction doc3
Fishing/Fiction doc1
Fishing/Non Fiction doc4
etc...

Num_Docs would provide 5 in this instance but what I really want is:
Fishing 2
Fishing/Fiction 2
Fishing/Non Fiction 1
etc...
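
In plain Ruby the grouping I am after would look roughly like the sketch below. It assumes the category/document rows are already in memory as an array called `rows` (a made-up name), which is exactly the part I would like the search engine to do for me instead.

```ruby
# Hypothetical result rows: one [category path, document id] pair per row.
rows = [
  ['Fishing',             'doc1'],
  ['Fishing',             'doc2'],
  ['Fishing/Fiction',     'doc3'],
  ['Fishing/Fiction',     'doc1'],
  ['Fishing/Non Fiction', 'doc4']
]

# Tally rows per category; Hash.new(0) makes missing keys start at zero.
counts = Hash.new(0)
rows.each { |category, _doc_id| counts[category] += 1 }

counts.each { |category, n| puts "#{category} #{n}" }
```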

All that, and done in 1 or 2 queries over and above the original 
search.... Simple eh!

I hope that I have not confused you too much, but this is something that 
I desperately need or my project is kaput!

I found these two threads: 
http://www.mail-archive.com/[email protected]/msg00343.html and

http://www.ruby-forum.com/topic/56232#40931

Do you think that this is the way to go?

Thanks very much.

> On 7/10/06, BlueJay <[EMAIL PROTECTED]> wrote:
>> >     fishing_count = index.search_each("sport AND fishing", :num_docs =>
>> I have several sub categories (taxonomy really) and what I was thinking
>> of doing was this in 2 queries. Index the data as per normal so that you
>> can do the full text search but also index the structure of the taxonomy
>> and have each branch contain the records that contain it.
>> Run one big search over the fulltext to get the list of hits and then
>> use this list as a query against the second index to get all the
>> category bits.
> 
> I'm not sure what you mean by "category bits". Could you possibly
> implement the categories like this:
> 
> sport/
> sport/shooting/
> sport/fishing/
> sport/fishing/fly
> sport/fishing/deep_sea
> etc.
> 
> Then, let's say you have a query in query_str. You can get all results
> in the sport category like this:
> 
>     index.search_each(query_str + " AND category:sport/*") {
>         # ...
>     }
> 
> You can get all results in the fishing category like this:
> 
>     index.search_each(query_str + " AND category:sport/fishing/*") {
>         # ...
>     }
> 
> Am I making sense?
> 
>> This would be a big query though - although it should be quick but I
>> would need to re-index the category bits every time a document was added.
> 
> You've lost me. Could you give some example code?
> 
>> Does this make sense, and would it make sense in Ferret? I have done
>> this before in another search engine that required special category
>> manipulation, but never with Ferret, and I am not sure how to go about
>> doing this in Ferret.
>>
>> I am not sure about your idea around filtering the results
> 
> I'll explain filtering once I understand better what it is you are 
> trying to do.
> 
> Cheers,
> Dave
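
If the categories were stored as paths like the ones quoted above, the counts for a whole branch could in principle be rolled up from the category of each hit. This is only a sketch in plain Ruby, assuming the category path of every matching document has already been collected into an array; `hit_categories` and its contents are hypothetical, not something Ferret returns in this form.

```ruby
# Hypothetical category paths of the documents matched by a search,
# one entry per hit, using the sport/... layout quoted above.
hit_categories = %w[
  sport/fishing/fly
  sport/fishing/deep_sea
  sport/fishing/fly
  sport/shooting
]

# Credit each hit to its own node and to every ancestor node.
rollup = Hash.new(0)
hit_categories.each do |path|
  parts = path.split('/')
  parts.each_index { |i| rollup[parts[0..i].join('/')] += 1 }
end

rollup.sort.each { |node, n| puts "#{node} (#{n})" }
```

Note that this counts a parent as the sum of everything beneath it, which differs from my Fishing example above, where each node only counts the documents filed directly under it.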

-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
