[GitHub] michellethomas commented on a change in pull request #5177: Time shift difference

GitBox Fri, 03 Aug 2018 17:20:01 -0700

michellethomas commented on a change in pull request #5177: Time shift difference
URL: https://github.com/apache/incubator-superset/pull/5177#discussion_r207693050


 ##########
 File path: superset/viz.py
 ##########
 @@ -1219,24 +1205,50 @@ def run_extra_queries(self):
             query_object['to_dttm'] -= delta
 
             df2 = self.get_df_payload(query_object).get('df')
-            if df2 is not None:
-                classed = classes[i % len(classes)]
-                i += 1
+            if df2 is not None and DTTM_ALIAS in df2:
                 label = '{} offset'. format(option)
                 df2[DTTM_ALIAS] += delta
                 df2 = self.process_data(df2)
-                self._extra_chart_data.extend(self.to_series(
-                    df2, classed=classed, title_suffix=label))
+                self._extra_chart_data.append((label, df2))
 
     def get_data(self, df):
+        fd = self.form_data
+        comparison_type = fd.get('comparison_type') or 'values'
         df = self.process_data(df)
-        chart_data = self.to_series(df)
 
-        if self._extra_chart_data:
-            chart_data += self._extra_chart_data
-            chart_data = sorted(chart_data, key=lambda x: tuple(x['key']))
+        if comparison_type == 'values':
+            chart_data = self.to_series(df)
+            for i, (label, df2) in enumerate(self._extra_chart_data):
+                chart_data.extend(
+                    self.to_series(
+                        df2, classed='time-shift-{}'.format(i), title_suffix=label))
+        else:
+            chart_data = []
+            for i, (label, df2) in enumerate(self._extra_chart_data):
+                # reindex df2 into the df index
+                combined_index = df.index.union(df2.index)
+                df2 = df2.reindex(combined_index) \
+                    .interpolate(method='time') \
+                    .reindex(df.index)
+
+                if comparison_type == 'absolute':
+                    diff = df - df2
+                elif comparison_type == 'percentage':
+                    diff = (df - df2) / df2
+                elif comparison_type == 'ratio':
+                    diff = df / df2
+                else:
+                    raise Exception(
+                        'Invalid `comparison_type`: {0}'.format(comparison_type))
 
-        return chart_data
+                # remove leading/trailing NaNs from the time shift difference
+                diff = diff[diff.first_valid_index():diff.last_valid_index()]
+
+                chart_data.extend(
+                    self.to_series(
+                        diff, classed='time-shift-{}'.format(i), title_suffix=label))
+
+        return sorted(chart_data, key=lambda x: tuple(x['key']))
 
 Review comment:
   @betodealmeida We've been seeing issues with the stacked line chart where
the lines are not sorted correctly. The stacked chart has `sort_series` set to
true, and that sorting is done before this line runs, so I think the
`sorted(...)` call here overrides it and is causing the issue. Is it necessary
to sort here?
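
A minimal sketch of the suspected interaction, for illustration only. It
assumes `sort_series` orders series by their total, largest first; that
behavior, the helper names, and the sample data are all hypothetical and not
taken from the PR. Only the key-based sort mirrors the last line of the diff.

```python
# Hypothetical series dicts, loosely shaped like the output of to_series():
# each has a 'key' (the series label) and a list of 'values'.
chart_data = [
    {'key': 'alpha', 'values': [1, 2]},
    {'key': 'gamma', 'values': [10, 20]},
    {'key': 'beta', 'values': [5, 5]},
]


def sort_by_total(series):
    """Assumed stand-in for a sort_series-style ordering: largest total first."""
    return sorted(series, key=lambda s: sum(s['values']), reverse=True)


def sort_by_key(series):
    """Mirrors the final sorted(...) call in get_data from the diff."""
    return sorted(series, key=lambda x: tuple(x['key']))


by_total = sort_by_total(chart_data)
by_key = sort_by_key(by_total)

print([s['key'] for s in by_total])  # ['gamma', 'beta', 'alpha']
print([s['key'] for s in by_key])    # ['alpha', 'beta', 'gamma']
```

Under those assumptions, the second sort discards whatever order was
established earlier, which would match the mis-ordered stacked lines
described above.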
