The problem occurs when a worker sees that no GIs apply to it, and then 
decides that it should do no GI filtering at all. Nodes which match some GIs 
still respect the filter. The problem lies in the NCBI toolbox, not in the pure 
mpiBLAST code: the toolbox assumes that running on a single node with a filter 
that matches none of the GIs is illogical. 

Please confirm that your runs produce this message on STDERR:
BlastCreateVirtualOIDList: Missing oidlists to attach virtual_oidlist

If so, the problem occurs in ncbi/tools/blastool.c at line 2727:
    /* attach mask to appropriate place */
    if (oidlist_forall_rdfp && real_ngis > 0) {
        rdfp_chain->oidlist = virtual_oidlist;
    } else {
        if (mask_rdfp)
            mask_rdfp->oidlist = virtual_oidlist;
        else {
            /* Should never happen */
            ErrPostEx(SEV_ERROR, 0, 0, "BlastCreateVirtualOIDList: Missing "
                    "oidlists to attach virtual_oidlist");
            OIDListFree(virtual_oidlist);
            return NULL;
        }
    }



Unfortunately, simply replacing the /* Should never happen */ block with 
another
             mask_rdfp->oidlist = virtual_oidlist;
  
or 
             mask_rdfp->oidlist = NULL;

seems to result in crashes when there are no GIs for a particular worker.


Presumably the worker can be told to look for this case, and return a correct 
"No results found" response to the master. Doing so is a bit beyond my 
abilities tonight, and before I attempt anything of the sort I'd welcome input 
from others. While the problem is most visible in Jerome's extreme case, this 
could be an issue any time the filter removes every GI present on a single 
worker. That pathological case is certainly rare, but not impossible.
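
To make the idea concrete, here is a minimal sketch of such a worker-side 
check in plain C. It is not mpiBLAST or NCBI toolbox code: the function names 
count_filter_hits and worker_should_search are hypothetical, and a fragment is 
modeled simply as a sorted array of integer GIs, which is an assumption made 
only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for bsearch() over an ascending array of GIs. */
int cmp_gi(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: count how many GIs from the filter list are
 * present in this worker's fragment. fragment_gis must be sorted in
 * ascending order. */
size_t count_filter_hits(const int *filter_gis, size_t n_filter,
                         const int *fragment_gis, size_t n_fragment)
{
    size_t hits = 0;
    size_t i;

    if (n_fragment == 0)
        return 0;
    for (i = 0; i < n_filter; i++) {
        if (bsearch(&filter_gis[i], fragment_gis, n_fragment,
                    sizeof fragment_gis[0], cmp_gi) != NULL)
            hits++;
    }
    return hits;
}

/* Hypothetical decision function. Returns 1 if the worker should run
 * the search, 0 if it should skip the search and report "no results".
 * The point is the middle case: a filter that is present but matches
 * nothing must not be treated as "no filter at all". */
int worker_should_search(int has_filter, size_t hits)
{
    if (!has_filter)
        return 1;       /* no filter given: search the whole fragment */
    return hits > 0;    /* filter given: search only if something matched */
}
```

A worker would call count_filter_hits once per fragment before searching, and 
send an empty result set to the master whenever worker_should_search returns 
0. How that empty result is actually reported to the master is not shown here.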
           



 
--
Mike Cariaso * Bioinformatics Software * http://www.cariaso.com

----- Original Message ----
From: Mike Cariaso <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, March 14, 2007 8:57:46 PM
Subject: Re: [Mpiblast-users] Strange issue in mpiBlast limiting by "Gi"


By having so few GIs, it becomes likely that at least one of the workers is 
given 0 sequences to analyze. This seems like a rather special case, and I 
would imagine it is where the bug hunt should begin. I'll try to do a few tests 
this evening.
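
As a rough back-of-the-envelope check, the sketch below estimates how often a 
fragment ends up with nothing to search. It assumes each filtered GI lands in 
a uniformly random fragment, independently of the others, which may not match 
how the database was really partitioned. Both function names are made up for 
this example.

```c
/* Probability that one given fragment receives none of the filtered
 * GIs, ASSUMING each GI falls into a uniformly random fragment,
 * independently: (1 - 1/n_fragments)^n_gis. */
double prob_fragment_empty(unsigned n_fragments, unsigned n_gis)
{
    double p = 1.0;
    unsigned i;

    if (n_fragments == 0)
        return 0.0;     /* no fragments, nothing to be empty */
    for (i = 0; i < n_gis; i++)
        p *= 1.0 - 1.0 / (double)n_fragments;
    return p;
}

/* Guaranteed lower bound (pigeonhole): with fewer GIs than fragments,
 * at least n_fragments - n_gis fragments contain no filtered GI. */
unsigned min_empty_fragments(unsigned n_fragments, unsigned n_gis)
{
    return n_gis < n_fragments ? n_fragments - n_gis : 0;
}
```

With 7 fragments and a 3-line filter file, min_empty_fragments(7, 3) gives 4, 
so at least 4 of the 7 fragments contain none of the requested sequences no 
matter how the GIs are distributed. Under the uniform assumption, any single 
fragment is empty with probability (6/7)^3, roughly 0.63.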

 
--
Mike Cariaso * Bioinformatics Software * http://www.cariaso.com

----- Original Message ----
From: Jerome <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, March 14, 2007 8:43:23 PM
Subject: Re: [Mpiblast-users] Strange issue in mpiBlast limiting by "Gi"

Hi Mike,
Yes, I understand your remark, and you are right that the number of 
sequence references is usually very large.
But in my special case, the end users are working with bacterial genome 
sequences, which have just one reference (or two, for ribosomal cyclic 
sequences) for the whole genome. It was while searching against just two or 
three specific bacteria that I found this issue.
With other databases such as nt, I have never used a filter of only 2 or 3 GIs.
Please take my question as a desire to understand why this problem occurs.
Best regards.

Mike Cariaso wrote:
> Jerome,
> 
> What you are describing may be true (if so, it's worth pursuing), but 
> your usage sounds very strange to me.
> 
> Having 7 fragments would be reasonable for a small cluster with 7 nodes. 
> But for your GI filter file to have only 7 lines would mean that you 
> are searching only 7 sequences. When I search a database such as 
> NCBI's 'nt', my filterfile usually has more than 4,000,000 GIs.
> 
> The GI filter is used to include/exclude individual sequences when 
> searching your blastable database. This is done with the "-l 
> filterfilename" (lowercase L) command line switch, where filterfilename 
> is (usually) a text file of GI #s, one per line. Typically a blastable 
> database will have thousands to millions of sequences, and therefore 
> thousands to millions of GIs.
> 
> The number of fragments refers to the number of pieces you split your 
> blastable database into. Commonly this will be the same as the 
> number of workers in your cluster, or perhaps a multiple such as 2x or 
> 3x larger.
> 
> In most cases the number of sequences in the database is many thousands 
> of times larger than the number of fragments. Otherwise each worker 
> would have very little work to do, and mpiBLAST probably wouldn't make 
> much sense.
> 
> Does it sound like you and I are describing the same thing?
> 
> --
> Mike Cariaso * Bioinformatics Software * http://www.cariaso.com
> 
> 
> ----- Original Message ----
> From: Jerome <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Wednesday, March 14, 2007 7:55:28 PM
> Subject: [Mpiblast-users] Strange issue in mpiBlast limiting by "Gi"
> 
> Hi all,
> I've just installed mpiBLAST to run on our cluster. It's a very
> fast program, and I thank you for it!
> For the past few days, some users have wanted to limit their searches to
> part of a database of bacterial genomes. I noticed that if the number of
> "gi" entries in the filter file is less than or equal to the number of
> fragments of the initial database, the filter file is ignored.
> I mean that, for example, if I format the database into 7 fragments and
> search with a filter file of at most 7 lines, the filter is not applied
> and the results are the same as if I had not used the filter.
> Could someone help me with this issue?
> 
> -- 
> -- Jérôme
> Le vase donne une forme au vide, et la musique au silence.
>     (Georges Braque)
> 


-- 
-- Jérôme
Dans les situations critiques, quand on parle avec un calibre bien en 
pogne, personne ne conteste plus.
Y'a des statistiques là-dessus.
    (Michel Audiard)

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Mpiblast-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mpiblast-users




