TheNeuralBit commented on a change in pull request #14362:
URL: https://github.com/apache/beam/pull/14362#discussion_r609079270
##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -486,11 +486,35 @@ def _get_index(self):
first = last = frame_base.wont_implement_method('order-sensitive')
head = tail = frame_base.wont_implement_method('order-sensitive')
+ interpolate = frame_base.wont_implement_method('order-sensitive')
@populate_not_implemented(pd.Series)
@frame_base.DeferredFrame._register_for(pd.Series)
class DeferredSeries(DeferredDataFrameOrSeries):
+ @property
+ def name(self):
+ return self._expr.proxy().name
Review comment:
The `proxy()` for an expression (`self._expr`) should never change.
In general the `_expr` for a `DeferredFrame` does not change, but it can
happen, for example when performing an inplace operation:
https://github.com/apache/beam/blob/a8cd05932bed9b2480316fb8518409636cb2733b/sdks/python/apache_beam/dataframe/frame_base.py#L328-L337
So if you access `name` before and after an inplace operation it will point
to a different proxy.
##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -2164,6 +2186,20 @@ def __init__(self, frame):
def names(self):
return self._frame._expr.proxy().index.names
+ @names.setter
+ def names(self, value):
+ def set_index_names(df):
+ df = df.copy()
+ df.index.names = value
+ return df
Review comment:
This is to avoid mutating the original `DataFrame`, since PCollection
elements are supposed to be immutable.
##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -470,12 +470,12 @@ def mask(self, cond, **kwargs):
def dtype(self):
return self._expr.proxy().dtype
+ isin = frame_base._elementwise_method('isin')
+
@property
def ndim(self):
return self._expr.proxy().ndim
- dtypes = dtype
-
Review comment:
Yeah thats a good question, this is misleading from this diff alone.
This is removing `DeferredDataFrameOrSeries.dtypes`, because Series isn't
supposed to have a `dtypes` accessor (only `dtype`), and we already have an
explicit one in `DeferredDataFrame`:
https://github.com/apache/beam/blob/a8cd05932bed9b2480316fb8518409636cb2733b/sdks/python/apache_beam/dataframe/frames.py#L1130-L1132
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]