[ 
https://issues.apache.org/jira/browse/BEAM-13948?focusedWorklogId=737675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-737675
 ]

ASF GitHub Bot logged work on BEAM-13948:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Mar/22 17:57
            Start Date: 07/Mar/22 17:57
    Worklog Time Spent: 10m 
      Work Description: yeandy commented on a change in pull request #17026:
URL: https://github.com/apache/beam/pull/17026#discussion_r820960011



##########
File path: sdks/python/apache_beam/dataframe/frames_test.py
##########
@@ -2468,6 +2468,113 @@ def test_split_pat_is_regex(self):
     self.assert_frame_data_equivalent(
         result, s.str.split(r"\.jpg", regex=True, expand=False))
 
+  def test_unstack_pandas_series_not_multiindex(self):
+    # Pandas should throw a ValueError if performing unstack
+    # on a Series without MultiIndex
+    s = pd.Series([1, 2, 3, 4], index=['one', 'two', 'three', 'four'])
+    with self.assertRaises((AttributeError, ValueError)):
+      self._evaluate(lambda s: s.unstack(), s)
+
+  def test_unstack_non_categorical_index(self):
+    index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'), ('two', 'a'),
+                                       ('two', 'b')])
+    index = index.set_levels(
+        index.levels[0].astype(pd.CategoricalDtype(['one', 'two'])), level=0)
+    s = pd.Series(np.arange(1.0, 5.0), index=index)
+    with self.assertRaisesRegex(
+        frame_base.WontImplementError,
+        r"unstack\(\) is only supported on DataFrames if"):
+      self._evaluate(lambda s: s.unstack(level=-1), s)
+
+  def _unstack_get_categorical_index(self):
+    index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'), ('two', 'a'),
+                                       ('two', 'b')])
+    index = index.set_levels(
+        index.levels[0].astype(pd.CategoricalDtype(['one', 'two'])), level=0)
+    index = index.set_levels(
+        index.levels[1].astype(pd.CategoricalDtype(['a', 'b'])), level=1)
+    return index
+
+  def test_unstack_pandas_example1(self):
+    index = self._unstack_get_categorical_index()
+    s = pd.Series(np.arange(1.0, 5.0), index=index)
+    result = self._evaluate(lambda s: s.unstack(level=-1), s)
+    self.assert_frame_data_equivalent(result, s.unstack(level=-1))
+
+  def test_unstack_pandas_example2(self):
+    index = self._unstack_get_categorical_index()
+    s = pd.Series(np.arange(1.0, 5.0), index=index)
+    result = self._evaluate(lambda s: s.unstack(level=0), s)
+    self.assert_frame_data_equivalent(result, s.unstack(level=0))
+
+  @unittest.skipIf(

Review comment:
       I added this because of a bug in older versions of pandas that has
since been [fixed](https://pandas.pydata.org/docs/dev/whatsnew/v1.2.0.html).
That is, the Python 3.6 environment uses `pandas==1.1.5`, which has an indexing
issue that was fixed in `pandas==1.2.0`. So the Python 3.6 tests fail, but
3.7 and 3.8 are fine.
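
For illustration, a version gate of this kind could look roughly like the sketch below. This is not the condition used in the PR (the decorator arguments are cut off in the diff above). The `major_minor` helper, the `PANDAS_VERSION` constant (a stand-in for `pd.__version__` so the sketch runs without pandas installed), the test class name, and the `(1, 2)` threshold are all assumptions.

```python
import re
import unittest

# Hypothetical stand-in for pd.__version__; in a real test module this
# would be read from the installed pandas package.
PANDAS_VERSION = "1.1.5"


def major_minor(version):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


class UnstackVersionGateExample(unittest.TestCase):
    # Skip on pandas releases older than 1.2 (assumed threshold), where
    # the indexing issue described in the comment is still present.
    @unittest.skipIf(
        major_minor(PANDAS_VERSION) < (1, 2),
        "indexing issue in pandas < 1.2.0")
    def test_runs_only_on_newer_pandas(self):
        self.assertTrue(True)
```

Comparing integer tuples rather than raw strings avoids lexicographic surprises such as `"1.10" < "1.2"`.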






Issue Time Tracking
-------------------

    Worklog Id:     (was: 737675)
    Time Spent: 1.5h  (was: 1h 20m)

> Implement DataFrame.unstack() and Series.unstack() for DataFrame API
> --------------------------------------------------------------------
>
>                 Key: BEAM-13948
>                 URL: https://issues.apache.org/jira/browse/BEAM-13948
>             Project: Beam
>          Issue Type: Sub-task
>          Components: dsl-dataframe, sdk-py-core
>            Reporter: Andy Ye
>            Assignee: Andy Ye
>            Priority: P3
>              Labels: dataframe-api
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>



