Works like a charm now!!! :) On Friday I figured it was a typo somewhere, but I couldn't find the source. I didn't think tag info was case-sensitive.
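For anyone finding this thread later: the point about case sensitivity can be illustrated without BioJava at all. The sketch below is only an illustration under stated assumptions. It uses the JDK's built-in DOM parser rather than the INSDseq parser discussed in this thread, the class and helper names (`OtherSeqIdSketch`, `otherSeqIds`, `findGi`) are made up for the example, and the sample is a trimmed excerpt of the file quoted below with the DOCTYPE left out so that nothing is fetched over the network. It collects the `INSDSeqid` values, picks out the one starting with `gi|`, and shows that a tag name differing only in case matches nothing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Standalone illustration using only the JDK's DOM parser; this is NOT BioJava code.
class OtherSeqIdSketch {

    // Trimmed excerpt of the record quoted in this thread (DOCTYPE omitted on purpose).
    static final String SAMPLE =
        "<INSDSet><INSDSeq>"
        + "<INSDSeq_locus>AY069118</INSDSeq_locus>"
        + "<INSDSeq_other-seqids>"
        + "<INSDSeqid>gb|AY069118.1|</INSDSeqid>"
        + "<INSDSeqid>gi|17861571</INSDSeqid>"
        + "</INSDSeq_other-seqids>"
        + "</INSDSeq></INSDSet>";

    // Hypothetical helper: returns the trimmed text of every element named tagName.
    // XML element names are case-sensitive, so the name must match exactly.
    static List<String> otherSeqIds(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getTextContent().trim());
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse INSDSeq XML", e);
        }
    }

    // Hypothetical helper: returns the number after "gi|", or null if no id has that prefix.
    static String findGi(List<String> ids) {
        for (String id : ids) {
            if (id.startsWith("gi|")) {
                return id.substring("gi|".length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> ids = otherSeqIds(SAMPLE, "INSDSeqid");
        System.out.println(ids);                                      // [gb|AY069118.1|, gi|17861571]
        System.out.println(findGi(ids));                              // 17861571
        System.out.println(otherSeqIds(SAMPLE, "INSDSeqId").size());  // 0 (wrong case, no match)
    }
}
```

With the fixed parser, the same values should be reachable through the `getProperties(INSDseqFormat.Terms.getOtherSeqIdTerm())` call quoted below; the prefix check in `findGi` would apply to whatever strings those notes carry.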
On 6/12/06, Richard Holland <[EMAIL PROTECTED]> wrote:
>
> Typo in code. my fault. Try again!
>
> On Thu, 2006-06-08 at 10:23 -0400, Seth Johnson wrote:
> > I'm still getting an empty array back from this:
> >
> > Note[] myAccs = ((RichAnnotation)rs.getAnnotation()).getProperties(INSDseqFormat.Terms.getOtherSeqIdTerm());
> >
> > Here's the file that I'm parsing:
> > ~~~~~~~~~~~~~~~~~~~~~~
> > <?xml version="1.0"?>
> > <!DOCTYPE INSDSet PUBLIC "-//NCBI//INSD INSDSeq/EN"
> > "http://www.ncbi.nlm.nih.gov/dtd/INSD_INSDSeq.dtd">
> > <INSDSet>
> > <INSDSeq>
> > <INSDSeq_locus>AY069118</INSDSeq_locus>
> > <INSDSeq_length>1502</INSDSeq_length>
> > <INSDSeq_strandedness>single</INSDSeq_strandedness>
> > <INSDSeq_moltype>mRNA</INSDSeq_moltype>
> > <INSDSeq_topology>linear</INSDSeq_topology>
> > <INSDSeq_division>INV</INSDSeq_division>
> > <INSDSeq_update-date>17-DEC-2001</INSDSeq_update-date>
> > <INSDSeq_create-date>15-DEC-2001</INSDSeq_create-date>
> > <INSDSeq_definition>Drosophila melanogaster GH13089 full length
> > cDNA</INSDSeq_definition>
> > <INSDSeq_primary-accession>AY069118</INSDSeq_primary-accession>
> > <INSDSeq_accession-version> AY069118.1</INSDSeq_accession-version>
> > <INSDSeq_other-seqids>
> > <INSDSeqid>gb|AY069118.1|</INSDSeqid>
> > <INSDSeqid>gi|17861571</INSDSeqid>
> > </INSDSeq_other-seqids>
> > <INSDSeq_keywords>
> > <INSDKeyword>FLI_CDNA</INSDKeyword>
> > </INSDSeq_keywords>
> > <INSDSeq_source>Drosophila melanogaster (fruit
> > fly)</INSDSeq_source>
> > <INSDSeq_organism>Drosophila melanogaster</INSDSeq_organism>
> > <INSDSeq_taxonomy>Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta;
> > Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
> > Ephydroidea; Drosophilidae; Drosophila</INSDSeq_taxonomy>
> > <INSDSeq_references>
> > <INSDReference>
> > <INSDReference_reference>1 (bases 1 to
> > 1502)</INSDReference_reference>
> > <INSDReference_position>1..1502</INSDReference_position>
> > <INSDReference_authors>
> > <INSDAuthor>Stapleton,M.</INSDAuthor>
> > <INSDAuthor>Brokstein,P.</INSDAuthor>
> > <INSDAuthor>Hong,L.</INSDAuthor>
> > <INSDAuthor>Agbayani,A.</INSDAuthor>
> > <INSDAuthor>Carlson,J.</INSDAuthor>
> > <INSDAuthor>Champe,M.</INSDAuthor>
> > <INSDAuthor>Chavez,C.</INSDAuthor>
> > <INSDAuthor>Dorsett,V.</INSDAuthor>
> > <INSDAuthor>Farfan,D.</INSDAuthor>
> > <INSDAuthor>Frise,E.</INSDAuthor>
> > <INSDAuthor>George,R.</INSDAuthor>
> > <INSDAuthor>Gonzalez,M.</INSDAuthor>
> > <INSDAuthor>Guarin,H.</INSDAuthor>
> > <INSDAuthor>Li,P.</INSDAuthor>
> > <INSDAuthor>Liao,G.</INSDAuthor>
> > <INSDAuthor>Miranda,A.</INSDAuthor>
> > <INSDAuthor>Mungall,C.J.</INSDAuthor>
> > <INSDAuthor>Nunoo,J.</INSDAuthor>
> > <INSDAuthor>Pacleb,J.</INSDAuthor>
> > <INSDAuthor>Paragas,V.</INSDAuthor>
> > <INSDAuthor>Park,S.</INSDAuthor>
> > <INSDAuthor>Phouanenavong,S.</INSDAuthor>
> > <INSDAuthor>Wan,K.</INSDAuthor>
> > <INSDAuthor>Yu,C.</INSDAuthor>
> > <INSDAuthor>Lewis,S.E.</INSDAuthor>
> > <INSDAuthor>Rubin, G.M.</INSDAuthor>
> > <INSDAuthor>Celniker,S.</INSDAuthor>
> > </INSDReference_authors>
> > <INSDReference_title>Direct Submission</INSDReference_title>
> > <INSDReference_journal>Submitted (10-DEC-2001) Berkeley
> > Drosophila Genome Project, Lawrence Berkeley National Laboratory, One
> > Cyclotron Road, Berkeley, CA 94720, USA</INSDReference_journal>
> > </INSDReference>
> > </INSDSeq_references>
> > <INSDSeq_comment>Sequence submitted by: Berkeley Drosophila Genome
> > Project Lawrence Berkeley National Laboratory Berkeley, CA 94720 This
> > clone was sequenced as part of a high-throughput process to sequence
> > clones from Drosophila Gene Collection 1 (Rubin et al., Science 2000).
> > The sequence has been subjected to integrity checks for sequence
> > accuracy, presence of a polyA tail and contiguity within 100 kb in the
> > genome. Thus we believe the sequence to reflect accurately this
> > particular cDNA clone. However, there are artifacts associated with
> > the generation of cDNA clones that may have not been detected in our
> > initial analyses such as internal priming, priming from contaminating
> > genomic DNA, retained introns due to reverse transcription of
> > unspliced precursor RNAs, and reverse transcriptase errors that result
> > in single base changes. For further information about this sequence,
> > including its location and relationship to other sequences, please
> > visit our Web site ( http://fruitfly.berkeley.edu) or send email to
> > [EMAIL PROTECTED]</INSDSeq_comment>
> > <INSDSeq_feature-table>
> > <INSDFeature>
> > <INSDFeature_key>source</INSDFeature_key>
> > <INSDFeature_location>1..1502</INSDFeature_location>
> > <INSDFeature_intervals>
> > <INSDInterval>
> > <INSDInterval_from>1</INSDInterval_from>
> > <INSDInterval_to>1502</INSDInterval_to>
> > <INSDInterval_accession> AY069118.1</INSDInterval_accession>
> > </INSDInterval>
> > </INSDFeature_intervals>
> > <INSDFeature_quals>
> > <INSDQualifier>
> > <INSDQualifier_name>organism</INSDQualifier_name>
> > <INSDQualifier_value>Drosophila
> > melanogaster</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>mol_type</INSDQualifier_name>
> > <INSDQualifier_value>mRNA</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>strain</INSDQualifier_name>
> > <INSDQualifier_value>y; cn bw sp</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>db_xref</INSDQualifier_name>
> > <INSDQualifier_value>taxon:7227</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>map</INSDQualifier_name>
> > <INSDQualifier_value>39B3-39B3</INSDQualifier_value>
> > </INSDQualifier>
> > </INSDFeature_quals>
> > </INSDFeature>
> > <INSDFeature>
> > <INSDFeature_key>gene</INSDFeature_key>
> > <INSDFeature_location>1..1502</INSDFeature_location>
> > <INSDFeature_intervals>
> > <INSDInterval>
> > <INSDInterval_from>1</INSDInterval_from>
> > <INSDInterval_to>1502</INSDInterval_to>
> > <INSDInterval_accession> AY069118.1</INSDInterval_accession>
> > </INSDInterval>
> > </INSDFeature_intervals>
> > <INSDFeature_quals>
> > <INSDQualifier>
> > <INSDQualifier_name>gene</INSDQualifier_name>
> > <INSDQualifier_value>E2f2</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>note</INSDQualifier_name>
> > <INSDQualifier_value>alignment with genomic scaffold
> > AE003669</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>db_xref</INSDQualifier_name>
> > <INSDQualifier_value>FLYBASE:FBgn0024371</INSDQualifier_value>
> > </INSDQualifier>
> > </INSDFeature_quals>
> > </INSDFeature>
> > <INSDFeature>
> > <INSDFeature_key>CDS</INSDFeature_key>
> > <INSDFeature_location>189..1301</INSDFeature_location>
> > <INSDFeature_intervals>
> > <INSDInterval>
> > <INSDInterval_from>189</INSDInterval_from>
> > <INSDInterval_to>1301</INSDInterval_to>
> > <INSDInterval_accession> AY069118.1</INSDInterval_accession>
> > </INSDInterval>
> > </INSDFeature_intervals>
> > <INSDFeature_quals>
> > <INSDQualifier>
> > <INSDQualifier_name>gene</INSDQualifier_name>
> > <INSDQualifier_value>E2f2</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>note</INSDQualifier_name>
> > <INSDQualifier_value>Longest ORF</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>codon_start</INSDQualifier_name>
> > <INSDQualifier_value>1</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>transl_table</INSDQualifier_name>
> > <INSDQualifier_value>1</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>product</INSDQualifier_name>
> > <INSDQualifier_value>GH13089p</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>protein_id</INSDQualifier_name>
> > <INSDQualifier_value>AAL39263.1</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>db_xref</INSDQualifier_name>
> > <INSDQualifier_value>GI:17861572</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>db_xref</INSDQualifier_name>
> > <INSDQualifier_value>FLYBASE:FBgn0024371</INSDQualifier_value>
> > </INSDQualifier>
> > <INSDQualifier>
> > <INSDQualifier_name>translation</INSDQualifier_name>
> > <INSDQualifier_value>MYKRKTASIVKRDSSAAGTTSSAMMMKVDSAETSVRSQSYESTPVSMDTSPDPPTPIKSPSNSQSQSQPGQQRSVGSLVLLTQKFVDLVKANEGSIDLKAATKILDVQKRRIYDITNVLEGIGLIDKGRHCSLVRWRGGGFNNAKDQENYDLARSRTNHLKMLEDDLDRQLEYAQRNLRYVMQDPSNRSYAYVTRDDLLDIFGDDSVFTIPNYDEEVDIKRNHYELAVSLDNGSAIDIRLVTNQGKSTTNPHDVDGFFDYHRLDTPSPSTSSHSSEDGNAPACAGNVITDEHGYSCNPGMKDEMKLLENELTAKIIFQNYLSGHSLRRFYPDDPNLENPPLLQLNPPQEDFNFALKSDEGICELFDVQCS</INSDQualifier_value>
> > </INSDQualifier>
> > </INSDFeature_quals>
> > </INSDFeature>
> > </INSDSeq_feature-table>
> > <INSDSeq_sequence>AAGAATAGAGGGAGAATGAAAAAAATGACATAAATGGCGGAAAGCAAACCTAGCGCCAACATTCGTATTTTCGTTTAATTTTCGCTCCAAAGTGCAATTAATTCCGGCTTCTTGATCGCTGCATATTGAGTGCAGCCACGCAAAGAGTTACAAGGACAGGAGTATAGTCATCGAGTCGATTGCGGACCATGTACAAGCGCAAAACCGCGAGTATTGTTAAAAGAGACAGCTCCGCAGCGGGCACCACCTCCTCGGCTATGATGATGAAGGTGGATTCGGCTGAGACTTCGGTCCGGTCGCAGAGCTACGAGTCTACACCCGTTAGCATGGACACATCACCGGATCCTCCAACGCCAATCAAGTCTCCGTCGAATTCACAATCGCAATCGCAGCCTGGACAACAGCGCTCCGTGGGCTCACTGGTCCTGCTCACACAGAAGTTTGTGGATCTCGTGAAGGCCAACGAAGGATCCATCGACCTGAAAGCGGCAACCAAAATCTTGGACGTACAGAAGCGCCGAATATACGATATTACCAATGTTTTAGAGGGCATTGGACTAATTGATAAGGGCAGACACTGCTCCCTAGTGCGCTGGCGCGGAGGGGGCTTTAACAATGCCAAGGACCAAGAGAACTACGACCTGGCACGTAGCCGGACTAATCATTTGAAAATGTTGGAGGATGACCTAGACAGGCAACTGGAGTATGCACAGCGCAATCTGCGCTACGTTATGCAGGATCCCTCGAATAGGTCGTATGCATATGTGACACGTGATGATCTGCTGGACATCTTTGGAGATGATTCCGTATTCACAATACCTAATTATGACGAGGAAGTAGATATCAAGCGTAATCATTACGAGCTGGCCGTGTCGCTGGACAATGGCAGCGCAATTGACATTCGCCTGGTGACGAACCAAGGAAAGAGTACTACAAATCCGCACGATGTGGATGGGTTCTTTGACTATCACCGTCTGGACACGCCCTCACCCTCGACGTCGTCGCACTCCAGCGAGGATGGTAACGCTCCAGCATGCGCGGGGAACGTGATCACCGACGAGCACGGTTACTCGTGCAATCCCGGGATGAAAGATGAGATGAAACTTTTGGAGAACGAGCTGACGGCCAAGATAATCTTCCAAAATTATCTGTCCGGTCATTCGCTGCGGCGATTTTATCCCGATGATCCGAATCTAGAAAACCCGCCGCTGCTGCAGCTGAATCCTCCGCAGGAAGACTTCAACTTTGCGTTAAAAAGCGACGAAGGTATTTGCGAGCTGTTTGATGTTCAGTGCTCCTAACTGTGGAAGGGGATGTACACCTTAGGACTATAGCTACACTGCAACTGGCCGCGTGCATTGTGCAAATATTTATGATTAGTACAATTTTGACTTTGGATTTCTCTATATCGTCTAGAAATTTTTAATTAGTGTAATACCTTGTAATTTCGCAAATAACAGCAAAACCAATAAATTCGTAAATGCAAAAAAAAAAAAAAAAAA</INSDSeq_sequence>
> > </INSDSeq>
> > </INSDSet>
> > ~~~~~~~~~~~~~~~~~~~~~~
> >
> > On 6/8/06, Richard Holland <[EMAIL PROTECTED]> wrote:
> > Yesterday I think I said I was going to add other-seqids but I forgot to
> > do it, so I did it just now. Try it and see. Use the new
> > INSDseqFormat.Terms.getOtherSeqIdTerm() term to find them.
> >
> > cheers,
> > Richard
> >
> > On Wed, 2006-06-07 at 19:48 -0400, Seth Johnson wrote:
> > > Hi Richard,
> > >
> > > I still cannot locate the GI number for the main sequence. After I
> > > parse it with readINSDseqDNA, I then use:
> > >
> > > Note[] myAccs = ((RichAnnotation)rs.getAnnotation()).getProperties(Terms.getAdditionalAccessionTerm());
> > >
> > > However, the 'myAccs' appears to be empty. Am I on the wrong track to
> > > get to other-seqids???
> > >
> > > On 6/6/06, Richard Holland <[EMAIL PROTECTED]> wrote:
> > > GenBank has a separate line for GI number, so it can be parsed out
> > > nicely. INSDseq does not, so you have to rely on the other-seqids tag
> > > and hope that one of them is the GI number. However it seems I have not
> > > included that tag in the parser, so I will include it. This will make
> > > the other-seqids values available through the notes with the term
> > > Terms.getAdditionalAccessionTerm(), but getIdentifier() will remain
> > > null.
> > >
> > > For your second question, the tutorial makes the mistake in several
> > > places of saying getNoteSet(Terms.blahblah()). This was shorthand for:
> > >
> > > rs.getAnnotation().getProperty(Terms.blahblah())
> > > (for single values)
> > >
> > > or
> > >
> > > ((RichAnnotation)rs.getAnnotation()).getProperties(Terms.blahblah())
> > > (for multiple values)
> > >
> > > but never got expanded. Maybe someone can fix that one day... :)
> > >
> > > I'm just updating INSDseq to 1.4 now. The guys next door gave me the
> > > details of the changes, and told me that 1.3 is actually no longer
> > > supported by them after Friday this week! So I'll make it 1.4 only.
> > >
> > > cheers,
> > > Richard
> > >
> > --
> > Richard Holland (BioMart Team)
> > EMBL-EBI
> > Wellcome Trust Genome Campus
> > Hinxton
> > Cambridge CB10 1SD
> > UNITED KINGDOM
> > Tel: +44-(0)1223-494416
> >
> > --
> > Best Regards,
> >
> > Seth Johnson
> > Senior Bioinformatics Associate
> >
> > Ph: (202) 470-0900
> > Fx: (775) 251-0358
> --
> Richard Holland (BioMart Team)
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> UNITED KINGDOM
> Tel: +44-(0)1223-494416
>

--
Best Regards,
Seth Johnson
Senior Bioinformatics Associate
Ph: (202) 470-0900
Fx: (775) 251-0358
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
