[GitHub] [beam] Edusanc95 commented on a change in pull request #15450: [BEAM-12701] Added extra parameter in to_csv for DeferredFrame to name the PTransform label

GitBox Sat, 11 Sep 2021 09:34:40 -0700


Edusanc95 commented on a change in pull request #15450:
URL: https://github.com/apache/beam/pull/15450#discussion_r706633103




##########
File path: sdks/python/apache_beam/dataframe/io.py
##########
@@ -74,16 +74,17 @@ def read_csv(path, *args, splittable=False, **kwargs):
       splitter=_CsvSplitter(args, kwargs) if splittable else None)
 
 
-def _as_pc(df):
+def _as_pc(df, label=None):
   from apache_beam.dataframe import convert  # avoid circular import
   # TODO(roberwb): Amortize the computation for multiple writes?
-  return convert.to_pcollection(df, yield_elements='pandas')
+  return convert.to_pcollection(df, yield_elements='pandas', label=label)
 
 
 @frame_base.with_docs_from(pd.DataFrame)
-def to_csv(df, path, *args, **kwargs):
-
-  return _as_pc(df) | _WriteToPandas(
+def to_csv(df, path, transform_label=None, *args, **kwargs):
+  label_pc = f"{transform_label} - ToPCollection" if transform_label else "ToPCollection(df)"
+  label_pd = f"{transform_label} - ToPandasDataFrame" if transform_label else "ToPandasDataFrame(df)"

Review comment:
       Hello! I agree with your remarks. I just pushed a commit that includes 
this change as well as the linter fix.
   
   The message is slightly different: `WriteToPandas(df) - {path}` instead of 
`{path} - WriteToPandas(df)`. At a glance, I think it makes more sense to see 
the operation being performed first, followed by the specific file it 
targets.
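
   As a rough illustration of that ordering (a minimal sketch only: the helper 
name `write_label` and the example paths are hypothetical and not taken from 
the PR), the label could be built like this, so that each output path yields a 
distinct label that starts with the operation name:

```python
def write_label(path, op="WriteToPandas(df)"):
    """Hypothetical helper: operation first, then the target path."""
    return f"{op} - {path}"


# Two writes to different files get two different labels,
# both of which start with the operation being performed.
labels = [write_label(p) for p in ("out/a.csv", "out/b.csv")]
for label in labels:
    print(label)
```

   With this order, labels for several writes sort and group by operation, 
and the path is what tells them apart.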




