Hiram and All,

Granted, the sequences that were obviously not human were not of
interest to the assemblers, but they do represent a wealth of
information on a variety of levels.  So the general consensus is that
these non-human contaminant sequences are unavailable, correct?

As a rule, I keep all of my data regardless of my own interest, as it
is often useful for other reasons much later.  I would be
disappointed if the assemblers of the human genome had trashed
this data.

In the meantime I will do as you suggest, Hiram, and wade through the
old ChrUn for the odd contaminant sequence, but so far (I've analyzed
approximately 1/3 of the total data) they seem to be human-derived
sequences.
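
For anyone following the thread, here is a rough sketch of the
chunk-and-blat check suggested in the quoted message below.  It is only
an illustration under my own assumptions: the function names are
hypothetical, the 1000-base chunk size is arbitrary, and it assumes the
blat output is in the standard tab-separated PSL layout, where the
query name is the 10th column.  It does not run blat itself; it only
prepares the pieces and then lists the pieces with no alignment at all.

```python
from typing import Iterable, Iterator, List, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(parts)
            # Keep only the first word of the header as the record name.
            name, parts = (line[1:].split() or [""])[0], []
        else:
            parts.append(line)
    if name is not None:
        yield name, "".join(parts)


def chunk_sequence(name: str, seq: str, size: int = 1000) -> Iterator[Tuple[str, str]]:
    """Split one sequence into consecutive pieces of at most `size` bases.

    Each piece is named "name:start-end" (0-based start, exclusive end)
    so that a piece can be traced back to its position in chrUn.
    """
    for start in range(0, len(seq), size):
        piece = seq[start:start + size]
        yield f"{name}:{start}-{start + len(piece)}", piece


def psl_query_names(psl_lines: Iterable[str]) -> Set[str]:
    """Collect the query names (PSL column 10) that have at least one hit.

    Header lines are skipped: only rows with 21 or more tab-separated
    fields whose first field is a number are treated as alignments.
    """
    hits = set()
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 21 and fields[0].isdigit():
            hits.add(fields[9])
    return hits


def unmatched_chunks(chunk_names: Iterable[str], psl_lines: Iterable[str]) -> List[str]:
    """Return the chunk names that never appear as a query in the PSL output."""
    hits = psl_query_names(psl_lines)
    return [name for name in chunk_names if name not in hits]
```

The intended use would be to write the pieces from chunk_sequence() to
a FASTA file, blat that file against the current assembly, and pass the
resulting PSL lines to unmatched_chunks().  A piece with no hit is only
a candidate for contamination; it would still need to be checked
against GenBank or similar before drawing any conclusion.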

Thank you again for the assistance,

Jeremy

On Apr 2, 2009, at 10:27 AM, Hiram Clawson wrote:

> Good Morning Jeremy:
>
> I believe you have answered your own question.  The contamination
> sequences have been removed by the assemblers.  They do this by
> checking a sequence in question with the contents of all sequences
> in GenBank.  The chrUn sequences in earlier human assemblies should
> also be free of contamination, or else it would be unknown
> contamination.
> The newer human assemblies are free of chrUn since the sequence has
> either been localized to at least a chromosome, or it was identified
> as contamination and thrown away.  I guess you could take the
> previous chrUn sequences, break them up into small pieces, and then
> blat them against current assemblies.  Bits that do not match could
> be this contamination you are looking for.
>
> The latest assembly, currently under construction here:
>        http://genome-test.cse.ucsc.edu/cgi-bin/hgGateway?db=hg19
> has a number of unplaced and unlocalized bits that normally would
> have been put together into the chrUn.  For hg19 we are not
> going to place them in chrUn.  You will see their names as:
> chrUn_gl000nnn
> take a look here:
>       http://genome-test.cse.ucsc.edu/cgi-bin/hgTracks?db=hg19&chromInfoPage=
>
> --Hiram
>
> Jeremy Ellis wrote:
>> Hiram and All;
>> I appreciate the responses.  I am interested in the bacterial  
>> contamination sequences as indicated here in section VI:
>>> http://www.ncbi.nlm.nih.gov/genome/assembly/assembly.shtml
>> It states:
>> "contamination: All assemblies should be screened for foreign and  
>> vector sequences. The source of these foreign sequences can range  
>> from bacterial genome contamination (due to propagating clones in  
>> bacteria) to contamination from other projects being sequenced at  
>> a particular sequencing center."
>> Not all of these contaminant sequences would be from the bacteria
>> that the clones were propagated in; there are likely also
>> sequences from normal bacterial/organismal flora from the donor
>> human that were cloned and sequenced (not to mention purely
>> random genomic fragments from a wide variety of sources: pollen,
>> water contamination, etc.).
>> I have looked through ChrUn from both hg16 and hg15 (hg17 and 18  
>> do not have the ChrUn data) and there does not appear to be any of  
>> the bacterial contaminant sequences in this data (it looks like it  
>> is information from rare PCR products and other cloning  
>> artifacts).  So, my question is simply, "Where are the non-human  
>> contaminant sequences?".
>> I hope this clarifies my question.
>> Jeremy
>> On Apr 1, 2009, at 4:57 PM, Hiram Clawson wrote:
>>> Good Afternoon Jeremy:
>>>
>>> You may find the following discussion of interest:
>>>
>>> http://www.ncbi.nlm.nih.gov/genome/assembly/assembly.shtml
>>>
>>>
>>>> Jeremy Ellis wrote:
>>>>> Hello all again.  I appreciate the responses to my first
>>>>> question, and they helped.  I have been looking through ChrUn
>>>>> from the earlier assemblies and I now realize that this isn't
>>>>> quite what I expected.  Most of these sequences (so far) appear
>>>>> to be odd human-like sequences due to a variety of probable
>>>>> causes (PCR/cloning artifacts, etc.).  I think that the
>>>>> sequences I am interested in are the ones that might have been
>>>>> thrown out because they appeared to be contaminant sequences
>>>>> from bacteria, fungi, water-borne protozoa, etc.  Would these
>>>>> sequences have long since been disposed of and ignored, or
>>>>> could there still be hope for me in finding a treasure trove of
>>>>> "garbage" sequence?
>>>>>
>>>>> Thank you again for your help!
>>>>>
>>>>> J.
>>>
>> Jeremy Ellis
>> [email protected]
>> 949-824-1223
>> Arora Lab
>> Developmental and Cell Biology
>> University of California, Irvine
>
