o-nikolas commented on code in PR #30756:
URL: https://github.com/apache/airflow/pull/30756#discussion_r1173032048
##########
airflow/providers/amazon/aws/hooks/logs.py:
##########
@@ -25,6 +25,13 @@
from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
+# From AWS team the correct way to check for end of stream is that if the
value of nextForwardToken is same
+# in subsequent calls
+# The issue with this approach is, it can take a huge amount of time (e.g. 20
seconds) to retrieve logs
+# As an intermediate solution, we decided to stop fetching logs if 3
consecutive responses are empty
Review Comment:
```suggestion
# Guidance received from the AWS team regarding the correct way to check for
the end of a stream is that the value of the nextForwardToken is the same
# in subsequent calls.
# The issue with this approach is, it can take a huge amount of time (e.g.
20 seconds) to retrieve logs using this approach.
# As an intermediate solution, we decided to stop fetching logs if 3
consecutive responses are empty.
```
Some suggestions for grammar/punctuation. Feel free to use or not :) Note
that the first line may now need to be shortened a bit and some of it moved
down to the next line, but I didn't want to do that here and make the suggested
diff look more confusing.
##########
airflow/providers/amazon/aws/hooks/logs.py:
##########
@@ -96,7 +104,15 @@ def get_log_events(
yield from events
- if next_token != response["nextForwardToken"]:
+ if "nextForwardToken" not in response:
+ num_consecutive_empty_response += 1
Review Comment:
Looking at the code before, we were always having a `nextForwardToken` in
the response (otherwise we would have been getting a key error exceptions). So
I think this will rarely be true. I think what we should consider "empty" is a
response with zero events, so I'd use the condition of `if not event_count:
...` to incr `num_consecutive_empty_response` instead. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]