rezarokni commented on a change in pull request #15585:
URL: https://github.com/apache/beam/pull/15585#discussion_r721811343



##########
File path: sdks/python/apache_beam/transforms/core.py
##########
@@ -1222,6 +1225,67 @@ def __init__(self, fn, *args, **kwargs):
     from apache_beam.runners.common import DoFnSignature
     self._signature = DoFnSignature(self.fn)
 
+  def with_dead_letters(
+      self,
+      main_tag='good',
+      dead_letter_tag='bad',
+      *,
+      exc_class=Exception,
+      partial=False,
+      use_subprocess=False,
+      threshold=.999,
+      threshold_windowing=None):
+    """Automatically provides a dead letter output for skipping bad records.
+
+    This returns a tagged output with two PCollections, the first being the
+    results of successfully processing the input PCollection, and the second
+    being the set of bad records (those which threw exceptions during
+    processing) along with information about the errors raised.
+
+    For example, one would write::
+
+        good, bad = Map(maybe_error_raising_function).with_dead_letters()
+
+    and `good` will be a PCollection of mapped records and `bad` will contain
+    those that raised exceptions.
+
+
+    Args:
+      main_tag: tag to be used for the main (good) output of the DoFn,
+          useful to avoid possible conflicts if this DoFn already produces
+          multiple outputs.  Optional, defaults to 'good'.
+      dead_letter_tag: tag to be used for the bad records, useful to avoid
+          possible conflicts if this DoFn already produces multiple outputs.
+          Optional, defaults to 'bad'.
+      exc_class: An exception class, or tuple of exception classes, to catch.
+          Optional, defaults to 'Exception'.
+      partial: Whether to emit outputs as they're produced (which could result
+          in partial outputs for a ParDo or FlatMap that throws an error part
+          way through execution) or buffer all outputs until successful
+          processing of the entire element. Optional, defaults to False.
+      use_subprocess: Whether to execute the DoFn logic in a subprocess. This
+          allows one to recover from errors that crash the process (e.g. from
+          an underlying C/C++ library), but is slower as elements and results

Review comment:
       Nice. What about adding the first part of your comment to the doc?
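
       For readers of this thread, here is a minimal pure-Python sketch of the semantics the docstring describes for `exc_class` and `partial`. It is not the Beam implementation and does not use the Beam API: `process_with_dead_letters` and `explode` are hypothetical names made up for illustration, and the `(element, exception)` shape of the bad records is an assumption.

```python
def process_with_dead_letters(fn, elements, exc_class=Exception, partial=False):
    """Hypothetical helper: run a FlatMap-style fn and split results.

    Returns (good, bad), where good holds the outputs and bad holds
    (element, exception) pairs for elements whose processing raised.
    """
    good, bad = [], []
    for element in elements:
        if partial:
            # Emit outputs as they are produced; an error part way through
            # leaves the earlier outputs of this element in `good`.
            try:
                for out in fn(element):
                    good.append(out)
            except exc_class as exn:
                bad.append((element, exn))
        else:
            # Buffer outputs and commit them only if the whole element
            # is processed successfully.
            buffered = []
            try:
                for out in fn(element):
                    buffered.append(out)
            except exc_class as exn:
                bad.append((element, exn))
            else:
                good.extend(buffered)
    return good, bad


def explode(n):
    # Yields 0..n-1, but raises once it reaches 2.
    for i in range(n):
        if i == 2:
            raise ValueError("cannot handle %d" % i)
        yield i


# Element 2 succeeds; element 3 fails part way through.
good, bad = process_with_dead_letters(explode, [2, 3])
good_partial, bad_partial = process_with_dead_letters(
    explode, [2, 3], partial=True)
```

       With `partial=False` the failing element contributes nothing to `good`; with `partial=True` the outputs it produced before the error are kept. In both cases the failing element ends up in `bad`.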



