Mforns has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/277215

Change subject: Make reportupdater support removing columns
......................................................................

Make reportupdater support removing columns

Some queries with dynamic columns can benefit from this feature.
After this change, query results no longer need to include every
column present in previous results. The old data is kept intact,
and new rows get None for those 'forgotten' columns.

Bug: T127326
Change-Id: I7df91e1164da6ea2da5d16742259ff48b0891734
---
M reportupdater/writer.py
M test/writer_test.py
2 files changed, 17 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/reportupdater refs/changes/15/277215/1
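
The behaviour described in the commit message can be sketched in
isolation as follows. This is a minimal illustration, not the code
from writer.py: the helper name pad_removed_columns, its signature,
the previous header and the sample rows are all assumptions made for
the example. Only the padding logic mirrors the change.

```python
def pad_removed_columns(header, previous_header, data, is_funnel=False):
    """Append columns found only in previous_header; pad new rows with None.

    Hypothetical standalone helper (not part of reportupdater).
    Mutates header and the rows stored in data in place.
    """
    removed_columns = sorted(set(previous_header) - set(header))
    if removed_columns:
        header.extend(removed_columns)
        for date in data:
            # Funnel reports keep a list of rows per date, others a single row.
            rows = data[date] if is_funnel else [data[date]]
            for row in rows:
                row.extend([None] * len(removed_columns))
    return removed_columns


# Assumed sample data: the new results no longer contain 'val2'.
previous_header = ['date', 'val1', 'val2', 'val3']
header = ['date', 'val1', 'val3']
data = {'2015-01-02': ['2015-01-02', 1, 3]}
removed = pad_removed_columns(header, previous_header, data)
```

Removed columns are appended at the end of the header in sorted order,
so the resulting layout does not depend on set iteration order.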

diff --git a/reportupdater/writer.py b/reportupdater/writer.py
index 530502c..9f24601 100644
--- a/reportupdater/writer.py
+++ b/reportupdater/writer.py
@@ -67,17 +67,24 @@
             else:
                 raise ValueError('Previous results have no header')
 
-        # NOTE: this supports moving columns and adding columns
-        #       it will rewrite the old data accordingly
+        # New results may have a different header than previous results.
+        # They may contain new columns, column order changes, or removal
+        # of some columns. In the latter case, the previous data will be
+        # kept intact and the None value will be assigned to the missing
+        # columns of the new data.
         if header != previous_header:
-            old_columns = set(header).intersection(set(previous_header))
-            removed_columns = list(set(previous_header) - set(header))
-
-            # removed columns are not supported yet
+            # Fill in the values for removed columns.
+            removed_columns = sorted(list(set(previous_header) - set(header)))
             if removed_columns:
-                raise ValueError('Results header is missing ' + str(removed_columns))
+                header.extend(removed_columns)
+                new_data = report.results['data']
+                for date in new_data:
+                    rows = new_data[date] if report.is_funnel else [new_data[date]]
+                    for row in rows:
+                        row.extend([None] * len(removed_columns))
 
             # make a map to use when updating old rows to new rows
+            old_columns = set(header).intersection(set(previous_header))
             new_indexes = {
                 header.index(col): previous_header.index(col)
                 for col in old_columns
diff --git a/test/writer_test.py b/test/writer_test.py
index a10a6d1..87fc02e 100644
--- a/test/writer_test.py
+++ b/test/writer_test.py
@@ -224,8 +224,9 @@
             'header': new_header,
             'data': {new_date : new_row}
         }
-        with self.assertRaises(ValueError):
-            self.writer.update_results(self.report)
+        header, updated_data = self.writer.update_results(self.report)
+        self.assertEqual(header, ['date', 'val1', 'val3', 'val2'])
+        self.assertEqual(updated_data[new_date], [datetime(2015, 1, 2), 1, 3, None])
 
 
     def test_update_results_when_header_has_different_number_of_columns(self):
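
The writer.py hunk builds a new_indexes map from positions in the new
header to positions in the previous header, but does not show how the
map is consumed. The sketch below is one way such a map could be used
to rewrite an old row into the new layout. The function name
remap_old_row, the None fill for newly added columns and the sample
values are assumptions, not code from the change.

```python
def remap_old_row(old_row, header, previous_header):
    """Rearrange a row written under previous_header to match header.

    Hypothetical helper. Columns that exist only in the new header are
    filled with None (an assumption for this sketch).
    """
    old_columns = set(header).intersection(previous_header)
    # Same shape as the map in the diff: new position -> old position.
    new_indexes = {
        header.index(col): previous_header.index(col)
        for col in old_columns
    }
    return [
        old_row[new_indexes[i]] if i in new_indexes else None
        for i in range(len(header))
    ]


# Assumed sample: 'val1' and 'val2' swap places and 'val3' is new.
previous_header = ['date', 'val1', 'val2']
header = ['date', 'val2', 'val1', 'val3']
remapped = remap_old_row(['2015-01-01', 1, 2], header, previous_header)
```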

-- 
To view, visit https://gerrit.wikimedia.org/r/277215
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7df91e1164da6ea2da5d16742259ff48b0891734
Gerrit-PatchSet: 1
Gerrit-Project: analytics/reportupdater
Gerrit-Branch: master
Gerrit-Owner: Mforns <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
