Just for kicks, check out this sequence I found in the process (conjecture: maybe when the virus causes its synthesis, it uses up all the cysteines/methionines!):
>sp|Q69566|U88_HHV6U Uncharacterized protein U88 OS=Human herpesvirus 6A (strain Uganda-1102) GN=U88 PE=4 SV=1
MYVSVSVHVSVHVSVRVSVRVSVCVSVRVSVHVSVRVSVSVRVSVRVSVSVRVSVRVSVSVHVSVRVSVRVSVSVRVSVCARVCARVCVCARVCVCARVCVCARVCVCARVCARVCVCACVCVCACLCVCACLCVCACLCVCACLCVCACLCVCACLCVCACLCVCACLCVCVCVCLCVCVCLCVCVCLCVCVCLCVCVCLCVCVCLCVCVCLCVCVCLCVCVCVCVCVCVCVCVCVCVCVCVCLCVCVCLCVCLCVCLCVCVCVCVCLCVCLCVCLCVCVCVCVCLLCMSLCMCMCMCMCMCMCMCMCMSLCMSLCMCMCMCMCMCMCICMCMCICICMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCMCIIEGNK

Maybe it's just a sequencing glitch?

JPK

On Tue, Oct 4, 2011 at 4:05 PM, Jacob Keller <[email protected]> wrote:
> Thanks everybody, I tried using
>
> --toolkit tuebingen mpi
> --Scanprosite
>
> I think my regex syntax was different from the Tuebingen site's, but
> scanprosite worked well and found many hits, although without really
> hitting paydirt. I think both of these programs would do the job well,
> though.
>
> Thanks very much for your speedy help (this BB is truly amazing!),
>
> Jacob
>
> On Tue, Oct 4, 2011 at 3:47 PM, David Briggs <[email protected]> wrote:
>> Hi Jacob,
>> SCAN PROSITE
>> http://prosite.expasy.org/scanprosite/
>> will do precisely what you want.
>> C-X-C-X-C-X-C
>> or
>> C-X-C-X-C
>> would be the pattern using Prosite syntax.
>> Cheers,
>> Dave
>> ============================
>> David C. Briggs PhD
>> Father, Structural Biologist and Sceptic
>> ============================
>> University of Manchester E-mail:
>> [email protected]
>> ============================
>> http://manchester.academia.edu/DavidBriggs (v.sensible)
>> http://xtaldave.wordpress.com/ (sensible)
>> http://xtaldave.posterous.com/ (less sensible)
>> Twitter: @xtaldave
>> Skype: DocDCB
>> ============================
>>
>> On 4 October 2011 21:34, Jacob Keller <[email protected]> wrote:
>>>
>>> Dear Crystallographers,
>>>
>>> I cannot get BLAST to find all proteins with the motif cxcxcxc or at
>>> least cxcxc. It seems to think of "x" as an actual amino acid rather
>>> than a wildcard. There must be some easy way to do this? Ordinarily to
>>> find a short motif, I would just paste the sequence and get the
>>> answer, but here the C's are an absolute requirement and there is no
>>> constraint on the x's except that they be only one residue.
>>>
>>> JPK
>>>
>>> --
>>> *******************************************
>>> Jacob Pearson Keller
>>> Northwestern University
>>> Medical Scientist Training Program
>>> cel: 773.608.9185
>>> email: [email protected]
>>> *******************************************
>
> --
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> cel: 773.608.9185
> email: [email protected]
> *******************************************

--
*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
cel: 773.608.9185
email: [email protected]
*******************************************
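
For anyone who would rather scan sequences locally, here is a minimal sketch (not what ScanProsite or the Tuebingen toolkit do internally) of the cxcxcxc / cxcxc search as a plain regular expression in Python. The function names `prosite_to_regex` and `find_motif` are made up for illustration, and the converter only understands literal residues and the single-residue wildcard `x`, which is all this motif needs. A lookahead is used so that overlapping hits in long C-x-C repeats are all reported.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern such as 'C-x-C-x-C' to a regex.

    Assumption: only literal residues and the 'x' wildcard are supported;
    ranges, character classes and anchors from full PROSITE syntax are not.
    """
    parts = []
    for element in pattern.split("-"):
        if element.lower() == "x":
            parts.append(".")  # exactly one residue of any kind
        else:
            parts.append(re.escape(element.upper()))
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    # Wrapping the motif in a lookahead makes each match zero-width, so the
    # scanner advances one residue at a time and overlapping hits are kept.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MACVCACLCK"  # toy sequence, not taken from any database
    for start, hit in find_motif(toy, "C-x-C-x-C"):
        print(start, hit)
```

Running `find_motif` over each record of a FASTA file would give every protein that has the motif; the positions are 1-based to match the usual residue numbering.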
