I'm more interested in the features (regarding protein ID, taxon, xref, product) and in retrieving information about articles (authors, title). I don't look at the sequence data at all. My purpose is to read the GenBank file and retrieve this information so that I can convert it to a semantic RDF file. I'm working on a specific gene at the moment, but it would be interesting to extend this to any GenBank file in the future.
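As a rough illustration of this use case, here is a minimal sketch in plain Java (standard library only; it does not use or reproduce the BioJava API). The class name `GenBankFields` and the method `extract` are hypothetical names chosen for the example. The sketch assumes the usual flat-file layout: REFERENCE sub-keywords such as AUTHORS and TITLE indented by two spaces, continuation lines indented by at least twelve spaces, and feature qualifiers written as `/key="value"`. It ignores locations and sequence data, does not handle doubled quotes inside a value, and pools the values of all records in a multi-record file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava: collects annotation fields
// from GenBank flat-file text and skips locations and sequence data.
class GenBankFields {

    // REFERENCE sub-keywords, e.g. "  AUTHORS   Doe,J. and Roe,R."
    private static final Pattern REF_FIELD =
            Pattern.compile("^  (AUTHORS|TITLE)\\s+(.*)$");

    // Feature qualifiers, e.g. /product="some protein" or /codon_start=1
    private static final Pattern QUALIFIER =
            Pattern.compile("^\\s+/(\\w+)=(.*)$");

    static Map<String, List<String>> extract(String record) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        String openKey = null;   // field whose value may continue on the next line
        boolean quoted = false;  // true while inside an unterminated "..." value

        for (String line : record.split("\\r?\\n")) {
            Matcher ref = REF_FIELD.matcher(line);
            Matcher qual = QUALIFIER.matcher(line);

            if (ref.matches()) {
                // AUTHORS or TITLE: stored under "authors" / "title".
                openKey = ref.group(1).toLowerCase();
                quoted = false;
                add(fields, openKey, ref.group(2).trim());
            } else if (!quoted && qual.matches()) {
                String value = qual.group(2).trim();
                // A quoted value is complete only if the closing quote is on this line.
                boolean closed = !value.startsWith("\"")
                        || (value.length() > 1 && value.endsWith("\""));
                add(fields, qual.group(1), unquote(value));
                openKey = closed ? null : qual.group(1);
                quoted = !closed;
            } else if (openKey != null && line.startsWith("            ")) {
                // Continuation line: append it to the most recent value.
                String text = line.trim();
                boolean closes = quoted && text.endsWith("\"");
                if (closes) {
                    text = text.substring(0, text.length() - 1);
                }
                List<String> values = fields.get(openKey);
                int last = values.size() - 1;
                values.set(last, values.get(last) + " " + text);
                if (closes) {
                    openKey = null;
                    quoted = false;
                }
            } else {
                // Any other line (new keyword, feature key, ...) ends the field.
                openKey = null;
                quoted = false;
            }
        }
        return fields;
    }

    private static void add(Map<String, List<String>> fields, String key, String value) {
        fields.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    private static String unquote(String s) {
        if (s.startsWith("\"")) s = s.substring(1);
        if (s.endsWith("\"")) s = s.substring(0, s.length() - 1);
        return s;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java GenBankFields.java <genbank-file>");
            return;
        }
        String record = new String(Files.readAllBytes(Paths.get(args[0])));
        extract(record).forEach((key, values) -> System.out.println(key + " = " + values));
    }
}
```

The resulting map is keyed by qualifier name (`product`, `protein_id`, `db_xref`, `organism`) plus `authors` and `title`, so the taxon would be the `db_xref` value that starts with `taxon:`. Each entry could then be attached to the RDF model as a property.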
Thanks,
Jean-Charles

> Message of 27/10/10 12:41
> From: "Scooter Willis"
> To: "jc.lucky"
> Cc: "biojava-l lists open-bio org"
> Subject: Re: [Biojava-l] Tr: Retrieve Information from GenBank file
>
> Jean-Charles
>
> I have it on my list to do a GenBank parser but haven't had the time. I
> can't promise anything in the next couple weeks. Can you send some details
> about what a typical use case is for your purpose? Are you trying to get the
> sequence data or are you more interested in the features?
>
> Thanks
>
> Scooter
>
> On Wed, Oct 27, 2010 at 4:11 AM, jc.lucky wrote:
>
> > I tried once again with the new version of BioJava, but without succeeding.
> > Any idea or suggestion?
> >
> > Thanks in advance
> > Regards,
> >
> > Jean-Charles Ferrières
> >
> > > Message of 22/10/10 10:11
> > > From: "jc.lucky"
> > > To: [email protected]
> > > Cc:
> > > Subject: [Biojava-l] Retrieve Information from GenBank file
> > >
> > > Hi
> > >
> > > I'm trying to convert a GenBank file into an RDF file. The gene of
> > > interest can be found at: http://www.ncbi.nlm.nih.gov/protein/284794945
> > >
> > > With the code below I can read the GenBank file, and I manage to retrieve
> > > information and convert it to RDF format. However, I don't succeed in
> > > retrieving some information such as title, protein or product. According to
> > > this page (http://www.biojava.org/wiki/BioJava:BioJavaXDocs#GenBan) it is
> > > possible to do so.
> > > Please help me find what I am doing wrong or what should be done to achieve
> > > my goal.
> > >
> > > //read the GenBank file
> > > public static RichSequenceIterator readFile(String input,
> > >         RichSequenceBuilderFactory seqFactory,
> > >         Namespace ns)
> > >         throws IOException, NoSuchElementException, BioException
> > > {
> > >     ns = null;
> > >     InputStream stream = new FileInputStream(input);
> > >     BufferedReader rdfFile = new BufferedReader(new InputStreamReader(stream));
> > >     RichSequenceIterator seqs = RichSequence.IOTools.readGenbankDNA(rdfFile, ns);
> > >     return seqs;
> > > }
> > >
> > > //Retrieve information and convert it to RDF format
> > > public void writeToRDFFile(RichSequenceIterator rsi, String output)
> > >         throws IOException, NoSuchElementException, BioException {
> > >     //create model for the ontology
> > >     OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM, null);
> > >     OntClass parents;
> > >     String URI = "http://pbr.wur.nl/#";
> > >
> > >     while(rsi.hasNext())
> > >     {
> > >         RichSequence seq = rsi.nextRichSequence();
> > >         String id = seq.getName();
> > >         parents = model.createClass(URI + id);
> > >         Set author = seq.getRankedDocRefs(); //code to clean up Set & convert toString
> > >         String definition = seq.getDescription(); //code to clean up String
> > >         //Add to model
> > >         parents.addProperty(DC.description, definition);
> > >         parents.addProperty(DC.publisher, authors);
> > >         parents.addComment(taxonomy, "EN");
> > >         parents.addProperty(DC.type, organism);
> > >         //print in rdf format
> > >         model.write(out, "RDF/XML");
> > >         out.close(); }
> > > }
> > >
> > > Thanks,
> > > Jean-Charles Ferrières
> > > _______________________________________________
> > > Biojava-l mailing list - [email protected]
> > > http://lists.open-bio.org/mailman/listinfo/biojava-l
