[
https://issues.apache.org/jira/browse/BEAM-12729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434136#comment-17434136
]
Dylan Hercher commented on BEAM-12729:
--------------------------------------
Yes it is; note also that the DataflowTemplates end up using a manually written
version of this. The choice between a DLQ and plain logging is one we discussed
initially for this feature.
The question was: if we pushed the files to a DLQ PCollection, how would it be
used? Since the file is simply not readable, it is going to require user
intervention and a manual fix and re-queue. The decision for now was that it is
just as easy to create a metric that monitors for this log message as it is to
write code to handle the file (which would likely do the same thing).
Open to other thoughts here. In theory you could also route the data to a
separate collection by extending the error handler.
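To make the "log and count instead of rethrow" idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Beam API: the class name, the `readAll` method, and the `unreadableFiles` counter are all hypothetical stand-ins (the counter plays the role a Beam Metrics counter would, which monitoring could alert on), and the catch clause stands in for catching an AvroRuntimeException on a corrupt file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical sketch of the "log and continue" pattern discussed above;
// not the actual ReadFileRangesFn implementation.
public class SuppressingReader {
    private static final Logger LOG =
            Logger.getLogger(SuppressingReader.class.getName());

    // Counter a monitoring system could alert on, standing in for a
    // Beam Metrics counter incremented per unreadable file.
    final AtomicLong unreadableFiles = new AtomicLong();

    // Reads each file with the supplied reader function. Unreadable files
    // are logged and counted rather than rethrown, so processing continues
    // past them instead of failing the whole (streaming) job.
    public List<String> readAll(List<String> files, Function<String, String> reader) {
        List<String> out = new ArrayList<>();
        for (String file : files) {
            try {
                out.add(reader.apply(file));
            } catch (RuntimeException e) { // e.g. an AvroRuntimeException for a corrupt file
                unreadableFiles.incrementAndGet();
                LOG.severe("Skipping unreadable file " + file + ": " + e.getMessage());
            }
        }
        return out;
    }
}
```

The trade-off this illustrates: since a corrupt file never becomes readable on retry, emitting it to a DLQ adds plumbing without enabling automatic recovery, whereas a log line plus a counter gives operators the same signal for manual intervention.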
> Suppress Avro Runtime Exceptions for Streaming
> -----------------------------------------------
>
> Key: BEAM-12729
> URL: https://issues.apache.org/jira/browse/BEAM-12729
> Project: Beam
> Issue Type: Improvement
> Components: io-java-avro
> Reporter: Dylan Hercher
> Priority: P3
> Labels: streaming
> Original Estimate: 2h
> Time Spent: 40m
> Remaining Estimate: 1h 20m
>
> For streaming pipelines, the current design of ReadFileRangesFn continually
> re-throws any unrecoverable error until the pipeline is torn down.
>
> These invalid file errors do not have a resolvable solution and should be
> logged as errors for the files in question to allow the pipeline to continue
> progressing.
>
> A file which cannot be read will never recover under a dead-letter-queue
> design. Since no recovery is possible, we can simply log the errored file and
> continue processing.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)