perry2of5 commented on PR #41142:
URL: https://github.com/apache/airflow/pull/41142#issuecomment-2265890223

    Oh, I see what you are getting at. I'll research this, as I'm not sure what 
the answer is.
   
       On Friday, August 2, 2024 at 10:14:24 AM PDT, Jarek Potiuk ***@***.***> 
wrote:  
    
    
   
   
   @potiuk commented on this pull request.
   
   In airflow/providers/microsoft/azure/operators/container_instances.py:
   > @@ -86,6 +86,12 @@ class AzureContainerInstancesOperator(BaseOperator):
        :param container_timeout: max time allowed for the execution of
            the container instance.
        :param tags: azure tags as dict of str:str
   +    :param xcom_all: Control if logs are pushed to XCOM similarly to how 
DockerOperator does.
   
   
   You can see the logic controlling this around line 316 of the operator in 
the PR.
   
   
   Yes, I see the [-1]. But does that mean the whole log is kept in memory 
before that operation runs? Or is it somehow optimized under the hood to 
retrieve only the last line when it is needed? I am afraid that if you have a 
1 GB log (which is not uncommon), you will download the whole log from the 
remote service just to print one line. This is not only slow, delaying 
completion of the task (sometimes by minutes) in a way that is quite 
unexpected; if the log is loaded entirely in memory, the task's memory will 
also grow by at least 1 GB, only for the purpose of pushing one line to XCom.
   
   Do you know if this is, or can be, optimized, @perry2of5?
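
   To make the memory concern concrete, here is a minimal sketch in plain 
Python. It does not use the Azure SDK or the operator's actual code; 
`fake_log_stream` is a hypothetical stand-in for a log source that can yield 
lines lazily. Whether the remote service can stream or tail logs is exactly 
the open question above, so this only illustrates the difference between 
materializing every line and retaining just the most recent one.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def last_line_eager(full_log: str) -> Optional[str]:
    """Split the whole log into a list of lines, then take the final one.

    The full string and the list of all lines are both held in memory.
    """
    lines = full_log.splitlines()
    return lines[-1] if lines else None


def last_line_streaming(lines: Iterable[str]) -> Optional[str]:
    """Consume lines one at a time, retaining only the most recent one.

    A deque with maxlen=1 discards each older line as a new one arrives,
    so memory use stays bounded by the longest single line.
    """
    tail = deque(lines, maxlen=1)
    return tail[0].rstrip("\n") if tail else None


def fake_log_stream(n: int) -> Iterator[str]:
    """Hypothetical log source that yields lines lazily (an assumption)."""
    for i in range(n):
        yield f"log line {i}\n"
```

   Note that the streaming variant still reads every line from the source, so 
it bounds memory but not download time; avoiding the download would need 
server-side support for returning only the tail of the log.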
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
