[GitHub] IanSavchenko opened a new issue #4249: FilterBox visualisation runs one extra query

GitBox Fri, 19 Jan 2018 05:46:43 -0800

IanSavchenko opened a new issue #4249: FilterBox visualisation runs one extra 
query
URL: https://github.com/apache/incubator-superset/issues/4249
 
 
   Make sure these boxes are checked before submitting your issue - thank you!
   
   - [X] I have checked the superset logs for python stacktraces and included 
it here as text if any
   - [X] I have reproduced the issue with at least the latest released version 
of superset
   - [X] I have checked the issue tracker for the same issue and I haven't 
found one similar
   
   
   ### Superset version
   0.22.1
   
   ### Expected results
   For the FilterBox visualization (widget), Superset should run one query per 
filter to get the possible filter values.
   
   ### Actual results
   Superset runs N + 1 queries, where N is the number of filters in the widget 
(so two queries for a widget with a single filter).
   
   ### Steps to reproduce
   Create a FilterBox and run the query. You will see one extra query on the DB 
end. This is easy to reproduce with the default dashboard "World's Banks Data" 
and its existing FilterBox.
   
   
   This is an issue for us because one of our FilterBox queries takes quite a 
long time (around a minute), and that time is effectively doubled because at 
least two queries are run.
   
   I managed to find the source of the issue in `superset/viz.py`, lines 276 
and 278. See the code and my comments:
   
   ``` python
   # The first query is executed here by design,
   # but its results are ignored by the FilterBoxViz subclass method `get_data`.
   df = self.get_df()
   if not self.error_message:
       # N queries are executed here
       data = self.get_data(df)
   ```
   
   Here, in the subclass `FilterBoxViz` (lines 1532-1564):
   ```python
   # df is not used here!
   def get_data(self, df):
       qry = self.query_obj()
       filters = [g for g in self.form_data['groupby']]
       d = {}
       for flt in filters:
           qry['groupby'] = [flt]

           # N "legit" queries are executed in the loop here
           df = super(FilterBoxViz, self).get_df(qry)
           d[flt] = [{
               'id': row[0],
               'text': row[0],
               'filter': flt,
               'metric': row[1]}
               for row in df.itertuples(index=False)
           ]
       return d
   ```
   
   I'm not a Python dev and I'm not really sure how to fix this. I would 
override some other methods in the `FilterBoxViz` subclass, but since `get_df` 
is also used by other methods such as `get_csv` (which must be broken for 
FilterBox as well, btw), I don't know what the right design is. If nobody steps 
in, I will try to make a PR, but this fix should be trivial for those who know 
this code.
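
For illustration only, here is a minimal, self-contained sketch of the N + 1 pattern described above and of one possible way to avoid it. It uses toy stand-in classes, not Superset's real code: `get_payload`, the `run_base_query` flag, the `executed` list, and the class names `CurrentFilterBoxViz` and `PatchedFilterBoxViz` are all hypothetical, and rows are plain tuples instead of a DataFrame. The idea is simply that the subclass opts out of the base query whose result it ignores.

```python
class BaseViz(object):
    """Toy stand-in for the base viz class (hypothetical, not Superset code)."""

    # Hypothetical flag: subclasses that ignore the base result can opt out.
    run_base_query = True

    def __init__(self, groupby):
        self.form_data = {'groupby': list(groupby)}
        self.executed = []  # records the groupby of every simulated query

    def query_obj(self):
        return {'groupby': list(self.form_data['groupby'])}

    def get_df(self, query_obj=None):
        qry = query_obj or self.query_obj()
        self.executed.append(tuple(qry['groupby']))
        # Simulated result: one (value, metric) row per groupby column.
        return [('%s_value' % col, 1) for col in qry['groupby']]

    def get_payload(self):
        # Mirrors the flow in the issue: base query first, then get_data.
        df = self.get_df() if self.run_base_query else None
        return self.get_data(df)

    def get_data(self, df):
        return df


class CurrentFilterBoxViz(BaseViz):
    """Reproduces the reported behavior: the base query result is ignored."""

    def get_data(self, df):
        d = {}
        for flt in self.form_data['groupby']:
            qry = self.query_obj()
            qry['groupby'] = [flt]
            rows = super(CurrentFilterBoxViz, self).get_df(qry)
            d[flt] = [{'id': r[0], 'text': r[0], 'filter': flt, 'metric': r[1]}
                      for r in rows]
        return d


class PatchedFilterBoxViz(CurrentFilterBoxViz):
    """Same per-filter queries, but skips the unused base query."""
    run_base_query = False
```

With two filters, `CurrentFilterBoxViz(['region', 'country']).get_payload()` records three simulated queries in `executed`, while `PatchedFilterBoxViz` records two. Whether a flag like this fits the real class hierarchy, and how `get_csv` should behave for FilterBox, would still need to be decided by someone who knows the codebase.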


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
