Hi Arek
The helpdesk team and I have been making the same suggestions you
mention in your email (i.e. encouraging the use of filters, limiting
the number of attributes selected, using the "download results via
email" option, etc.), and I have also implemented 'max select' in
several places in the configuration. I think we are going to have to
look at streamlining the data we provide in some way in the future.
The issue is that the volume of data is growing, especially for
variation, and as the tables get bigger the queries take longer. I
know that the load on the server can sometimes be very high and that
this affects user response times. Have you guys tried partitioning
the data to improve build time and/or result response time, and did
it help?
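To make the question concrete, this is the kind of thing I mean: a
rough, untested sketch assuming a MySQL backend (the table and column
names are invented, ours are different):

import mysql.connector  # assumes the mart lives in MySQL

conn = mysql.connector.connect(host="localhost", user="mart",
                               password="secret",
                               database="variation_mart")
cur = conn.cursor()
# Repartition a (hypothetical) variation main table by chromosome so
# that chromosome-filtered queries scan one partition rather than the
# whole table. Note MySQL requires the partition key to be part of
# every unique key on the table.
cur.execute("""
    ALTER TABLE snp__variation__main
    PARTITION BY KEY (chr_name)
    PARTITIONS 25
""")
conn.close()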
Regards,
Rhoda
On 20 Sep 2011, at 13:21, Arek Kasprzyk wrote:
Hi Rhoda,
(cc'ing users because this can be of interest to others).
There is no active development on 0.7 anymore. However, there are
still some 'generic' tricks you could use to improve your situation:
1. Ask people to go through the 'download via email' route for
heavier queries.
2. Limit attribute combinations that result in many heavy table
joins, either by:
a. using 'max select' when configuring the mart, or
b. simply removing some attributes.
3. Use 'default' filters to limit the queries; a concrete example
follows below.
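As an illustration of 3., a deliberately 'narrow' query through the
0.7 martservice looks roughly like this; the dataset, filter, and
attribute names are just placeholders, and you would point the URL at
your own server:

import urllib.parse
import urllib.request

# One restrictive filter plus a couple of attributes keeps the table
# joins small. All names below are illustrative only.
QUERY = """<?xml version="1.0" encoding="UTF-8"?>
<Query virtualSchemaName="default" formatter="TSV" header="0"
       uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Filter name="chromosome_name" value="21"/>
    <Attribute name="ensembl_gene_id"/>
    <Attribute name="ensembl_transcript_id"/>
  </Dataset>
</Query>"""

data = urllib.parse.urlencode({"query": QUERY}).encode()
with urllib.request.urlopen(
        "http://www.biomart.org/biomart/martservice", data) as resp:
    print(resp.read().decode())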
However, I would start by checking two things:
1. Load on the server. Query performance is hugely affected by it,
and this can be very misleading: if the load is high, even very
'innocent' queries take ages. If this is the case, perhaps you need
more hardware? (A quick way to check is sketched after this list.)
2. The type of heavy query that people run most often. If you could
tell me what these are, perhaps we could come up with a solution that
targets just those queries?
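For 1., even something as simple as recording the load average next
to each query timing will tell you whether slowness tracks the load
or the queries themselves. A minimal sketch (Unix only):

import os
import time

def timed(run_query, *args):
    """Run a query function and report wall time together with the
    1-minute load average, so a slow query can be told apart from a
    merely overloaded box."""
    load1, _, _ = os.getloadavg()  # Unix only
    start = time.time()
    result = run_query(*args)
    elapsed = time.time() - start
    print(f"{elapsed:.1f}s at load {load1:.2f}")
    return result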
a
On Tue, Sep 20, 2011 at 5:47 AM, Rhoda Kinsella <[email protected]>
wrote:
Hi Arek and Junjun
I have a query about BioMart, and perhaps you can give me some advice
about how to solve this or whether something can be added to the code
to rectify it. Basically, we are getting an increasing number of
users reporting that they get only partial result files, or no result
files at all, when they use BioMart, and they complain that there was
no warning or error message. I have asked our web team whether the
cut-off time they set for queries has been changed. This was put in
place some time ago because some queries were taking too long and
killing the servers, or people kept resubmitting the same query over
and over, which froze the servers for everyone else.

I was wondering if you have implemented, or are planning to
implement, some sort of queuing system for queries in the new code,
and whether it would be possible to warn users when they have
received an incomplete file download. I fear that some users are
ploughing ahead with their work without realizing they are missing a
chunk of the data. Is there a way that we can automatically warn
users that they are asking for too much data all at once and ask them
to apply more filters? Is there anything that I can do with our
current 0.7 version to try to deal with this issue? I'm worried
people are going to start using alternatives to BioMart if this
continues. Any help or advice would be greatly appreciated.
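What I am imagining on the client side is something like the sketch
below, where the server stamps complete results and the client
refuses a truncated file. If I remember correctly, the 0.7
martservice honours completionStamp="1" in the Query XML and appends
a [success] marker to the output, but please correct me if not (the
URL is a placeholder for our own server):

import urllib.parse
import urllib.request

MARTSERVICE = "http://www.biomart.org/biomart/martservice"

def fetch_checked(query_xml: str) -> str:
    """POST a Query that sets completionStamp="1" and verify the
    trailing [success] marker; complain loudly if it is missing."""
    data = urllib.parse.urlencode({"query": query_xml}).encode()
    with urllib.request.urlopen(MARTSERVICE, data) as resp:
        body = resp.read().decode()
    if not body.rstrip().endswith("[success]"):
        raise RuntimeError("download looks incomplete; "
                           "re-run the query or add more filters")
    return body.rstrip()[:-len("[success]")]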
Regards
Rhoda
Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.