TheNeuralBit commented on a change in pull request #15072:
URL: https://github.com/apache/beam/pull/15072#discussion_r658370349
##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -1843,9 +1873,69 @@ def repeat(self, repeats, axis):
f"DeferredSeries (encountered {type(repeats)}).")
+def _justify_str_column(objs, rjust=True):
+  strs = [str(o) for o in objs]
+  maxlen = max(len(s) for s in strs)
+  return [s.rjust(maxlen) if rjust else s.ljust(maxlen) for s in strs]
+
+
+def _ljustify_str_column(objs):
+  strs = [str(o) for o in objs]
+  maxlen = max(len(s) for s in strs)
+  return [s.ljust(maxlen) for s in strs]
+
+
+def _justify_columns_and_transpose(columns, rjust=True):
+  for row in zip(*[_justify_str_column(objs, rjust) for objs in columns]):
+    yield ' '.join(row)
+
+
+
@populate_not_implemented(pd.DataFrame)
@frame_base.DeferredFrame._register_for(pd.DataFrame)
class DeferredDataFrame(DeferredDataFrameOrSeries):
Review comment:
A DataFrame is essentially a set of concatenated Series, but the columns
share a single index, so the index rendering only needs to happen once in the
DataFrame case.
I tried to share as much code as possible by pulling out the justification
logic. It's probably possible to pull out some common logic for rendering the
index as well, though; I'll see what I can come up with there.
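To make the sharing concrete, here is a rough sketch of how one justification helper could serve both the Series and DataFrame cases while rendering the shared index only once. `render_rows` and its `(name, values)` input shape are hypothetical names used for illustration, not code from this PR; `justify_str_column` mirrors the helper in the diff above.

```python
def justify_str_column(objs, rjust=True):
  """Pad every value's str() to the width of the longest one."""
  strs = [str(o) for o in objs]
  maxlen = max(len(s) for s in strs)
  return [s.rjust(maxlen) if rjust else s.ljust(maxlen) for s in strs]


def render_rows(index, columns):
  """Hypothetical helper: justify the shared index once, then each column.

  `index` is a list of labels; `columns` is a list of (name, values) pairs.
  A Series is simply the one-column case.
  """
  # Index labels are left-justified, with a blank header cell on top.
  cells = [justify_str_column([''] + list(index), rjust=False)]
  # Each data column is right-justified under its own name.
  for name, values in columns:
    cells.append(justify_str_column([name] + list(values)))
  # Transpose from columns of cells to rows of text.
  return [' '.join(row) for row in zip(*cells)]
```

Calling `render_rows(['x', 'yy'], [('a', [1, 22]), ('b', [333, 4])])` yields three aligned lines, and passing a single `(name, values)` pair covers the Series case with the same code path.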
WDYT about the general approach of using ":" in the columns and "??" for the
length to indicate this is a deferred object?
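For reference, one way to picture that proposal is sketched below. The exact layout is an assumption made for illustration (the function name `deferred_preview` and the pandas-style footer are hypothetical), not output produced by this PR.

```python
def deferred_preview(column_names):
  """Hypothetical sketch: column names over ':' placeholders, '??' row count."""
  names = [str(c) for c in column_names]
  # No values exist before the pipeline runs, so each column shows its name
  # over a ':' placeholder, right-justified to the name's width.
  cells = [[name, ':'.rjust(max(len(name), 1))] for name in names]
  lines = [' '.join(row) for row in zip(*cells)]
  # The row count is unknown until execution, hence '??'.
  lines.append(f'[?? rows x {len(names)} columns]')
  return '\n'.join(lines)
```

With `deferred_preview(['a', 'bcd'])` the header row is followed by a row of `:` placeholders and a footer reporting `??` rows, which signals at a glance that the frame has not been computed.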