On 18 Mar 2005, at 09:20, Joel Björkman wrote:

Hello!

I'm new to org.biojava.bio.program.das and I have a couple of
questions regarding fetching features and sequences from dazzle
servers...

At the moment I'm only interested in getting the sequence and
annotations from Ensembl's database, which should make the problem
easier.

It's quite obvious when you're browsing
http://servlet.sanger.ac.uk:8080/das/ that Ensembl has a lot of data.

What I'm concerned about is when I look at the different segment sizes
<SEGMENT id="21" size="46944323" subparts="yes"/>
Some of them are really big.

Taking a peek at the demo that ships with biojava-live suggests that
you are supposed to download the entire sequence and then perform
operations on it (e.g. substring).

DASSequenceDB dasDB = new DASSequenceDB(dbURL);
DASSequence dasSeq = (DASSequence) dasDB.getSequence(seqName);
System.out.println("1st 10 bases: " + dasSeq.subStr(1, 10));

Isn't there any way that's more efficient and nicer to the server? Is
there any way to throw requests like this:
http://servlet.sanger.ac.uk:8080/das/ensembl_Homo_sapiens_core_28_35a/dna?segment=21:1,10

Hi,

The "getSequence" call in that program doesn't actually retrieve all the sequence data from the DAS server; it just creates an object which can make calls to the DAS server as necessary.

Currently, the DAS client code hidden behind the Sequence object you get back fetches the sequence data from the DAS server in chunks (50kb chunks, if I remember correctly), so you certainly won't end up pulling down the full sequence of chr21 just to look at a few bases.
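
To make the chunking idea concrete, here is a rough sketch of how a small base range could be mapped onto chunk-aligned "dna" requests. This is not the actual BioJava client code: the class DasChunkSketch, its method names, and the exact chunk size of 50,000 bases are assumptions for illustration only. The URL format follows the dna?segment=21:1,10 style request quoted above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of BioJava: maps a base range onto fixed-size chunks. */
class DasChunkSketch {
    // Assumed chunk size; the real client's value may differ.
    static final int CHUNK_SIZE = 50_000;

    /** Builds a DAS "dna" request URL for a 1-based, inclusive range of a segment. */
    static String dnaUrl(String dataSourceUrl, String segment, int start, int end) {
        String base = dataSourceUrl.endsWith("/") ? dataSourceUrl : dataSourceUrl + "/";
        return base + "dna?segment=" + segment + ":" + start + "," + end;
    }

    /** Lists the chunk-aligned requests needed to cover [start, end] on a segment. */
    static List<String> chunkUrls(String dataSourceUrl, String segment,
                                  int start, int end, int segmentLength) {
        if (start < 1 || end < start || end > segmentLength) {
            throw new IllegalArgumentException("range out of bounds: " + start + "," + end);
        }
        List<String> urls = new ArrayList<>();
        int firstChunk = (start - 1) / CHUNK_SIZE; // zero-based index of the chunk holding 'start'
        int lastChunk = (end - 1) / CHUNK_SIZE;    // zero-based index of the chunk holding 'end'
        for (int c = firstChunk; c <= lastChunk; c++) {
            int chunkStart = c * CHUNK_SIZE + 1;
            // The final chunk of a segment is usually shorter than CHUNK_SIZE.
            int chunkEnd = Math.min((c + 1) * CHUNK_SIZE, segmentLength);
            urls.add(dnaUrl(dataSourceUrl, segment, chunkStart, chunkEnd));
        }
        return urls;
    }

    public static void main(String[] args) {
        String ds = "http://servlet.sanger.ac.uk:8080/das/ensembl_Homo_sapiens_core_28_35a/";
        // Asking for bases 1..10 of chr21 touches only the first chunk.
        for (String url : chunkUrls(ds, "21", 1, 10, 46944323)) {
            System.out.println(url);
        }
    }
}
```

With these assumptions, looking at the first 10 bases of chr21 results in a single request for bases 1 to 50000 rather than all 46944323 bases, and a cache keyed on the chunk index would let later reads in the same region avoid further requests.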

Another issue with DAS is that although the XML documents -- especially the feature tables -- can be pretty huge, they compress down well. Given a suitable server implementation (Dazzle can certainly do this, not sure how many others can), the BioJava DAS client can use the HTTP "Accept-Encoding" mechanism to negotiate GZIP compression of all the DAS XML. This saves a lot of bandwidth.
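
For anyone curious what that negotiation looks like at the HTTP level, here is a minimal sketch using only the standard java.net and java.util.zip classes. It is not the BioJava client's own code; the class GzipDasSketch and its methods are hypothetical names. The client advertises gzip support in the request, then decompresses only if the server's Content-Encoding header says the body really was gzipped. It assumes Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper, not part of BioJava: gzip negotiation for DAS XML. */
class GzipDasSketch {
    /** Returns a stream of plain bytes whether or not the body was gzipped. */
    static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        boolean gzipped = contentEncoding != null
                && contentEncoding.trim().equalsIgnoreCase("gzip");
        return gzipped ? new GZIPInputStream(body) : body;
    }

    /** Opens an HTTP URL, telling the server that a gzipped response is acceptable. */
    static InputStream open(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        // A server without compression support simply answers with no Content-Encoding.
        return decode(conn.getInputStream(), conn.getContentEncoding());
    }

    public static void main(String[] args) throws IOException {
        // In-memory round trip standing in for a server response (no network needed).
        byte[] xml = "<DASDNA/>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(xml);
        }
        try (InputStream in = decode(new ByteArrayInputStream(buf.toByteArray()), "gzip")) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

Because the fallback path returns the body untouched, the same code works against servers that ignore the Accept-Encoding header.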

         Thomas.


