I'm using Nutch 1.0.
My subcollections.xml file is configured like this:

<?xml version="1.0" encoding="UTF-8"?>
<subcollections>
        <subcollection>
                <name>sub1</name>
                <id>sub1</id>
                <whitelist>
                        http://www.apache.org/
                </whitelist>
                <blacklist />
        </subcollection>
        <subcollection>
                <name>sub2</name>
                <id>sub2</id>
                <whitelist>
                        http://www.mysql.com/
                </whitelist>
                <blacklist />
        </subcollection>
        <subcollection>
                <name>sub3</name>
                <id>sub3</id>
                <whitelist>
                        http://www.redhat.com/
                </whitelist>
                <blacklist />
        </subcollection>
</subcollections>


After indexing, and after making sure that the subcollection plugin was activated in nutch-site.xml,
I checked the index with Luke.
The subcollection field was populated as expected, with the values sub1, sub2 and sub3.
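
For reference, activating the plugin in nutch-site.xml usually means adding its id to the plugin.includes property. The fragment below is only a minimal sketch: the plugin list is an assumed, shortened example rather than the exact value from this installation, and it belongs inside the <configuration> element of the file.

```xml
<!-- Illustrative fragment for nutch-site.xml; the plugin list is an assumption. -->
<property>
  <name>plugin.includes</name>
  <!-- "subcollection" is appended to an otherwise typical plugin list -->
  <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|subcollection</value>
</property>
```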
The problem appears when I search for anything restricted to a subcollection:
I get zero results in Luke.
The command line gives the same result:
./bin/nutch org.apache.nutch.searcher.NutchBean "subcollection:sub1 apache"
Total hits: 0
After performing a normal search and following the explain link in the search results, the subcollection content is correct too, but any search using subcollection:sub1 plus text returns no results.
Could this be a bug?




