Hi, Brant!

Looking at the source code for gfServer,
it seems to ignore the -mask option for untranslated use.

Even for translated searches, it's probably just using
the mask as a way to keep the speed up -- it still
doesn't provide the kind of masking control
that standalone blat provides.

It does appear that you will need to use standalone blat
if you want to use masking options.  Many users running
untranslated searches don't even bother with explicit masking.
There is still some "masking" that occurs by virtue
of overused tiles being ignored, so they are unable
to initiate a hit.
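
A rough way to picture the overused-tile effect is the Python sketch
below. It is only a conceptual illustration, not blat's actual code:
the function name, the non-overlapping tiling, and the strict ">"
cutoff are assumptions made for the example, and the defaults are
merely meant to echo blat's -tileSize and -repMatch options.

```python
from collections import Counter


def overused_tiles(sequences, tile_size=11, max_count=1024):
    """Return the set of tiles seen more than max_count times.

    Conceptual sketch only: tiles are taken at non-overlapping
    positions, and case is ignored so soft-masking plays no role.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Step by tile_size so the tiles do not overlap (an assumption).
        for start in range(0, len(seq) - tile_size + 1, tile_size):
            counts[seq[start:start + tile_size]] += 1
    # Tiles above the limit would be skipped when seeding hits.
    return {tile for tile, n in counts.items() if n > max_count}
```

A tile returned by this function is one that, in this simplified
picture, could not start an alignment even with no explicit masking.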

Of course, you could hard-mask your target database I suppose,
but that's probably unnecessary.
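
If hard-masking ever does seem worth trying, the idea can be sketched
in a few lines of Python. This assumes the target is a soft-masked
FASTA file in which lowercase means repeat (as with maskOutFa -soft);
the function names are made up for the example, and the result would
still need to be converted with faToTwoBit afterwards.

```python
def hard_mask_line(line):
    """Turn lowercase (soft-masked) bases into N; keep headers as-is."""
    if line.startswith(">"):
        return line
    return "".join("N" if ch.islower() else ch for ch in line)


def hard_mask_fasta(in_path, out_path):
    """Copy a soft-masked FASTA file, hard-masking every sequence line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(hard_mask_line(line))

# Example call (hypothetical file names):
#   hard_mask_fasta("targets.softmask.fa", "targets.hardmask.fa")
```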

It was unclear to me why you said that
you needed gfServer instead of standalone blat
to run many queries.  You can run an enormous
set of queries with standalone blat;
just make a list.  From the blat usage message:

blat database query [-ooc=11.ooc] output.psl

[...]
where:
    database and query are each either a .fa , .nib or .2bit file,
    or a list these files one file name per line.
[...]

You can stick all your queries in one huge .2bit or .fa,
or you can simply create a file that lists all the
sequence-files that you want to query:
  somefile1.2bit
  somefile2.2bit
  someotherfile.fa
  someotherfile2.fa
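
As a small sketch of the list-file approach in Python: the helper
names and the file name queries.lst are invented for the example, and
the only blat options used are the -ooc and -mask ones already
mentioned in this thread.

```python
def format_query_list(query_files):
    """Return list-file text: one sequence file name per line."""
    return "".join(name + "\n" for name in query_files)


def blat_command(database, query_list, output_psl, ooc=None, mask=None):
    """Build the argument list for one standalone blat run."""
    cmd = ["blat", database, query_list]
    if ooc:
        cmd.append("-ooc=" + ooc)
    if mask:
        cmd.append("-mask=" + mask)
    cmd.append(output_psl)
    return cmd

# Example use (hypothetical file names):
#   with open("queries.lst", "w") as out:
#       out.write(format_query_list(["somefile1.2bit", "someotherfile.fa"]))
#   import subprocess
#   subprocess.run(blat_command("database.2bit", "queries.lst",
#                               "output.psl", mask="lower"), check=True)
```

One blat run then handles every query named in the list.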

At UCSC, we mainly use gfServer with hgBlat for online interactive 
queries, and we frequently use standalone blat with no masking options.
Some steps of our automated genbank process do use standalone
blat with masking options.

Thanks for pointing out another quirky difference between blat and 
gfServer/gfClient.

-Galt


On Tue, 14 Jul 2009, Brant Faircloth wrote:

> Hi,
>
> note:  apologies in advance if this gets duplicated.  It didn't post
> after a day, and I figured it may have been blocked due to my pgp sig
> attachment.
>
> First, I just wanted to say thanks for the mailing list and to thank
> everyone for their work on the source tree - it's a great resource
> that I use almost daily!  I've browsed the list for quite some time,
> but have recently run across some strangeness in the behavior of
> gfClient relative to blat.  Likely, the strangeness is of my own
> doing, but I figured I might email to see if that, indeed, was the case.
>
> I'm working from gfClient/Server (v.34x4) and blat (v. 34x4) compiled
> from CVS.  The problem I'm running into deals with alignments starting
> in repeat regions (versus alignments extending over repeats).  Here
> are my gfServer start parameters:
>
> /Users/bcf/bin/i386/gfserver start 127.0.0.1 8888 /Users/bcf/Data/test/SoftMask/*.softmask.2bit -mask
>
> where *.softmask.2bit was created from a fasta file of soft-masked
> sequences (from repeatmasker | `maskOutFa -soft`) using faToTwoBit.
> These targets also contain the query sequence I am demonstrating
> with.  I am running gfServer because the number of queries for what I
> am attempting is large, and I would prefer to avoid reindexing the
> 2bit file with every call to blat.
>
> My query with gfClient is:
>
> /Users/bcf/bin/i386/gfclient -t=DNA -q=DNA -minScore=0 -minIdentity=0 -out=psl 127.0.0.1 8888 / ~/tmp/tmp.fa stdout
>
> where tmp.fa is a single, soft-masked sequence in fasta format.
> tmp.fa has a soft-masked repeat region, extending from position 76-158
> (0-indexed).  The (truncated) gfClient output is:
>
> match	mis-	rep.	N's	Q gap	Q gap	T gap	T gap	strand	Q	Q	Q	Q	T	T	T	T	block	blockSizes	qStarts	tStarts
> 	match	match		count	bases	count	bases		name	size	start	end	name	size	start	end	count
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
> 100	2	0	1	0	0	2	2	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02D3DFI	250	30	135	3	18,4,81,	0,18,22,	30,49,54,
> 99	2	0	1	1	1	3	5	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02DA9YF	222	102	209	4	18,4,68,12,	0,18,22,91,	102,121,126,197,
> 94	2	0	0	1	7	1	9	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02DAKZ3	297	23	128	2	12,84,	0,19,	23,44,
> 100	2	0	1	0	0	4	12	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02DBSW8	222	102	217	5	18,4,17,51,13,	0,18,22,39,90,	102,121,126,144,204,
> 55	1	0	0	0	0	0	0	-	FX5ZTWB02D5UGZ	179	76	132	FX5ZTWB02DBYD5	226	39	95	1	56,	47,	39,
> 67	1	0	0	0	0	0	0	-	FX5ZTWB02D5UGZ	179	76	144	FX5ZTWB02DJ4YU	231	96	164	1	68,	35,	96,
> 100	2	0	1	0	0	2	2	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02DJB25	170	29	134	3	18,4,81,	0,18,22,	29,48,53,
> 100	2	0	1	0	0	2	2	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02DVMEF	168	29	134	3	18,4,81,	0,18,22,	29,48,53,
> 79	0	0	0	1	3	3	19	-	FX5ZTWB02D5UGZ	179	76	158	FX5ZTWB02DWVVC	241	64	162	4	13,28,15,23,	21,34,65,80,	64,94,123,139,
> 94	2	0	0	1	7	1	9	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02EGVMB	247	23	128	2	12,84,	0,19,	23,44,
> 100	2	0	1	0	0	2	2	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02EHBES	338	39	144	3	18,4,81,	0,18,22,	39,58,63,
> 44	0	0	0	0	0	1	1	-	FX5ZTWB02D5UGZ	179	76	120	FX5ZTWB02EOC38	213	66	111	2	28,16,	59,87,	66,95,
> 100	2	0	1	0	0	2	2	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02ETWES	202	29	134	3	18,4,81,	0,18,22,	29,48,53,
> 100	2	0	1	0	0	1	1	-	FX5ZTWB02D5UGZ	179	76	179	FX5ZTWB02EZOW2	208	19	123	2	18,85,	0,18,	19,38,
>
> A blat run of the form:
>
> blat /Users/bcf/Data/test/SoftMask/*.softmask.clean.2bit tmp.fa -mask=lower stdout
>
> returns (full output):
>
> match	mis-	rep.	N's	Q gap	Q gap	T gap	T gap	strand	Q	Q	Q	Q	T	T	T	T	block	blockSizes	qStarts	tStarts
> 	match	match		count	bases	count	bases		name	size	start	end	name	size	start	end	count
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
> 31	2	80	0	0	0	1	24	+	FX5ZTWB02D5UGZ	179	45	158	FX5ZTWB02EZZ23	182	35	172	2	31,82,	45,76,	35,90,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02EMMWO	294	35	67	1	32,	45,	35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02EM5LP	153	35	67	1	32,	45,	35,
> 32	0	33	0	1	1	1	23	+	FX5ZTWB02D5UGZ	179	45	111	FX5ZTWB02ELBHJ	161	34	122	2	32,33,	45,78,	34,89,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02EKORL	159	36	68	1	32,	45,	36,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02EJB29	138	35	67	1	32,	45,	35,
> 66	2	0	0	1	8	2	26	+	FX5ZTWB02D5UGZ	179	0	76	FX5ZTWB02EJ0PM	301	0	94	3	11,26,31,	0,11,45,	0,12,63,
> 68	1	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	0	77	FX5ZTWB02EICDX	381	229	322	2	37,32,	0,45,	229,290,
> 66	2	0	0	1	8	2	25	+	FX5ZTWB02D5UGZ	179	0	76	FX5ZTWB02EH3VT	247	0	93	3	11,26,31,	0,11,45,	0,12,62,
> 62	1	0	0	2	13	2	35	+	FX5ZTWB02D5UGZ	179	0	76	FX5ZTWB02EGNY4	328	0	98	3	24,8,31,	0,29,45,	0,32,67,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02EG2T5	198	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02ECX8Z	224	0	67	2	11,32,	26,45,	0,35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02EC10O	167	35	67	1	32,	45,	35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02EBHWR	212	0	67	2	11,32,	26,45,	0,35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DYAKJ	141	35	67	1	32,	45,	35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DUG6S	181	35	67	1	32,	45,	35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DTP6B	182	35	67	1	32,	45,	35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02DSULL	245	0	67	2	11,32,	26,45,	0,35,
> 44	0	37	0	1	8	2	55	+	FX5ZTWB02D5UGZ	179	26	115	FX5ZTWB02DPKMK	206	0	136	3	11,32,38,	26,45,77,	0,35,98,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DPI46	179	35	67	1	32,	45,	35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02DNZB3	290	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02DMW8R	211	0	67	2	11,32,	26,45,	0,35,
> 40	0	0	0	1	8	2	27	+	FX5ZTWB02D5UGZ	179	26	74	FX5ZTWB02DMQ4E	240	0	67	3	11,5,24,	26,45,50,	0,37,43,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DKKJB	175	35	67	1	32,	45,	35,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02DI8TE	158	35	67	1	32,	45,	35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02DGCU6	275	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02DB9V0	286	0	67	2	11,32,	26,45,	0,35,
> 43	2	78	0	2	9	2	47	+	FX5ZTWB02D5UGZ	179	26	158	FX5ZTWB02D92EW	204	0	170	3	11,32,80,	26,45,78,	0,35,90,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D8YAV	238	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D8RCP	216	0	67	2	11,32,	26,45,	0,35,
> 95	0	82	1	1	1	2	4	+	FX5ZTWB02D5UGZ	179	0	179	FX5ZTWB02D887O	221	0	182	3	11,149,18,	0,11,161,	0,12,164,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D83KC	250	0	67	2	11,32,	26,45,	0,35,
> 96	0	82	1	0	0	0	0	+	FX5ZTWB02D5UGZ	179	0	179	FX5ZTWB02D5UGZ	179	0	179	1	179,	0,	0,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D5NGS	270	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D35EE	194	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D1GE9	269	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D1EQS	198	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02D0168	201	0	67	2	11,32,	26,45,	0,35,
> 30	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	47	77	FX5ZTWB02C9X15	154	39	69	1	30,	47,	39,
> 32	0	0	0	0	0	0	0	+	FX5ZTWB02D5UGZ	179	45	77	FX5ZTWB02C8WKA	194	35	67	1	32,	45,	35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02C7D4Y	222	0	67	2	11,32,	26,45,	0,35,
> 43	0	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	26	77	FX5ZTWB02C6UVQ	263	0	67	2	11,32,	26,45,	0,35,
> 66	2	0	0	1	8	2	25	+	FX5ZTWB02D5UGZ	179	0	76	FX5ZTWB02C5OY2	249	0	93	3	11,26,31,	0,11,45,	0,12,62,
> 66	2	0	0	1	8	1	24	+	FX5ZTWB02D5UGZ	179	0	76	FX5ZTWB02C1LC0	179	0	92	2	37,31,	0,45,	0,61,
>
>
> It looks like blat is treating the masking correctly in alignments -
> there are no alignments starting in the repeat region (76-158) of the
> Query or the Targets.  Alignments across masked regions that begin in
> unmasked regions are treated as expected (i.e., the self-to-self (Q=
> FX5ZTWB02D5UGZ to T= FX5ZTWB02D5UGZ) alignment extends through the
> masked region).
>
> Conversely, in the truncated gfClient output, several of the
> alignments listed have a `Q start` to `Q end` within 76-158, which is
> an unexpected result given the use of the `-mask` flag to start an
> instance of gfServer with the soft-masked, 2bit input file.  After
> double-checking the associated Target sequence (and reverse complement
> of the Target) for masked bases, it appears alignments are started in
> repeat-masked regions of these targets.
>
> I noticed the gfServer help indicated that the mask option is to be
> used with nib files, but I assumed since 2bit files were also a valid
> input option (and can be composed of multiple fastas, which I need),
> the `-mask` option applied, as well.  So, the discrepancy in the
> output from blat versus gfClient is what has me confused.  Again, I
> suspect that I've got something wrong here, that my interpretation of
> the expected behavior is incorrect, or that the help is indeed correct
> that gfServer masking is nib only, but I can't quite put my finger on
> the problem.
>
> Thanks for your time,
> brant
>
> ************************************************
> Brant C. Faircloth
> Dept. of Ecology and Evolutionary Biology
> 621 Charles E. Young Drive South
> University of California
> Los Angeles, CA 90095 USA
>
> rooms:   LSS 4304 and 4315
> email:   [email protected]
> lab:     +1.310.206.2270
> office:  +1.310.206.3083
> mobile:  +1.706.201.6110
> ************************************************
>
> < * )
>  (_ \\
>  _ ||
>
>
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>