Hi James,
I agree with your comment. Do you think fetching all the data at once
would be a solution?
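What I have in mind is roughly the following. It is only a sketch, not tested against the live server: download the gene coordinates once with a single getBM call (shown commented out), then match each read locally. The small table is a made-up stand-in for that download, and genes_at is a hypothetical helper of mine, not a biomaRt function.

```r
# One-off download (not run here; needs a network connection to BioMart):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# genes <- getBM(attributes = c("hgnc_symbol", "chromosome_name",
#                               "start_position", "end_position"),
#                mart = ensembl)

# Toy stand-in for that table; gene names and coordinates are made up.
genes <- data.frame(
  hgnc_symbol     = c("GENE_A", "GENE_B", "GENE_C"),
  chromosome_name = c("1", "1", "2"),
  start_position  = c(1000, 5000, 1000),
  end_position    = c(2000, 6000, 3000),
  stringsAsFactors = FALSE
)

# For each read, return the symbols of the genes that overlap the
# interval [pos, pos + read_len] on the same chromosome.
genes_at <- function(chr, pos, genes, read_len = 50) {
  pos <- as.numeric(pos)
  lapply(seq_along(chr), function(i) {
    hit <- genes$chromosome_name == chr[i] &
      genes$start_position <= pos[i] + read_len &
      genes$end_position >= pos[i]
    genes$hgnc_symbol[hit]
  })
}

# Three toy reads, given as character columns like t.cpd.
reads_chr <- c("1", "2", "1")
reads_pos <- c("1500", "2990", "3000")
hits <- genes_at(reads_chr, reads_pos, genes)
```

This way each read is matched against its own chromosome and position, and the server is contacted only once.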
Regards,
Ramzi
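P.S. To see which positions fall outside their chromosome, I could run a check like the one below. It is only a sketch: the lengths are the hg19 values as far as I know, I only list the four chromosomes used in the example, and chr_len is a name I made up.

```r
# hg19 chromosome lengths (assumed values) for the chromosomes used below.
chr_len <- c("2" = 243199373, "12" = 133851895,
             "19" = 59128983, "20" = 63025520)

# Four rows taken from t.cpd[1:10, 1:2] (rows 1, 2, 8 and 9).
chr <- c("12", "19", "20", "2")
pos <- as.numeric(c("140059033", "164634640", "136121868", "6255547"))

# TRUE where the position lies past the end of its chromosome.
out_of_range <- pos > chr_len[chr]
```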
James W. MacDonald wrote:
> Hi Ramzi,
>
> That will work in the short term, especially if you don't have
> many queries, but in the long term it is not a reasonable workaround,
> for two reasons. First, you can get intermittent connection errors
> to the Biomart server, which will break your loop somewhere in the
> middle, and you will have to do the whole thing over again.
> Second, people who run public databases do not look kindly
> upon those who abuse their servers (and repeatedly making small
> queries instead of a single large query is considered abuse), and have
> been known to ban the IP addresses of people who do so.
>
> Best,
>
> Jim
>
>
>
> Ramzi TEMANNI wrote:
>> Hi James,
>> Thanks for your reply,
>> As I have short reads of 50 bases, I've modified the code according
>> to your suggestion:
>> gn.m1 <- getBM(attributes = c("hgnc_symbol"),
>>                filters = c("chromosome_name", "start", "end"),
>>                values = list(t.cpd[1:10, 1],
>>                              as.numeric(t.cpd[1:10, 2]),
>>                              as.numeric(t.cpd[1:10, 2]) + 50),
>>                mart = ensembl)
>>
>> But I am still getting more genes than expected:
>> hgnc_symbol
>> 1 UBE2G1
>> 2 DUSP5P
>> 3 HIST3H2BA
>> 4 ZNF847P
>> 5 ACTBP11
>> 6 ZC3H11B
>> ........
>> 8488 PPP2R2C
>> 8489 WFS1
>> 8490 SNORD73A
>> 8491 SNORA24
>> 8492 SNORA26
>>
>> Did I miss something ?
>>
>> Thanks for the remark regarding the positions, you are right! I did
>> not notice that. I have to go back to the bowtie alignment file and
>> see what the reason behind it is. I'm using bowtie with the latest
>> hg19 reference.
>>
>> Thanks again for your comment.
>> Best Regards,
>> Ramzi
>>
>> ----------------------------------------------------------------
>>
>>
>> On Mon, Dec 7, 2009 at 4:20 PM, James W. MacDonald wrote:
>>
>> Hi Ramzi,
>>
>>
>> Ramzi TEMANNI wrote:
>>
>> Hi,
>> I want to extract the gene names, knowing the chromosome and the
>> position for each gene:
>>
>> t.cpd[1:10,1:2]
>>
>> CHR.M1 POS.M1
>> [1,] "12" "140059033"
>> [2,] "19" "164634640"
>> [3,] "10" "32347784"
>> [4,] "11" "30576841"
>> [5,] "2" "86479831"
>> [6,] "12" "237019866"
>> [7,] "4" "76487174"
>> [8,] "20" "136121868"
>> [9,] "2" "6255547"
>> [10,] "1" "67658137"
>>
>> I use the following commands:
>> library(biomaRt)
>> mart = useMart("ensembl")
>> ensembl = useDataset("hsapiens_gene_ensembl", mart = mart)
>> gn.m1<-getBM(attributes= c("hgnc_symbol"),
>> filters=c("chromosome_name","start"),
>> values=list(t.cpd[1:10,1],t.cpd[1:10,2]), mart=ensembl)
>>
>> I expected a list of 10 gene names, but instead I get
>> 8652 genes:
>> hgnc_symbol
>> 1 OR2M1P
>> 2 OR2L1P
>> 3 HSD17B7P1
>> 4 OR14L1P
>> 5 OR2W5
>> 6 VN1R5
>> ......
>> 8649 WFS1
>> 8650 SNORD73A
>> 8651 SNORA24
>> 8652 SNORA26
>>
>> Did I miss something ?
>>
>>
>> Yes. You are giving the start position, but not the end. Without
>> explicitly telling the Biomart server where to stop looking for
>> genes, where do you think it will stop by default?
>>
>> Also, several of your coordinates are nonsensical. For instance,
>> chr12 is only 133851895 bases long, chr20 is 63025520 bases long,
>> etc.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> Thanks in advance for your help
>>
>> Best Regards,
>> Ramzi
>>
>> ----------------------------------------------------------------
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> -- James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826