GuzikJakub commented on issue #4708: [AIRFLOW-3888] HA for metastore connection
URL: https://github.com/apache/airflow/pull/4708#issuecomment-463994831
 
 
   > Why did you create a new way to define multiple connections instead of 
expanding the old one? Why is your solution better? Could you complete the 
documentation?
   
   My solution is somewhat better because it checks that the metastore 
connection actually works while the task is running, so the task does not have 
to be retried when it hits an offline connection. 
Secondly, I do not pick a connection at random. I try the connections one after 
another, in the order they were entered, so the task is certain to get a 
working connection as long as at least one of them is online.
   I ran the following test with the existing mechanism: I allowed one 
additional attempt for a task that checks whether a partition exists in Hive, 
and I configured 2 metastore connections (one online, the other offline). Over 
15 trials, only 75% of the tasks succeeded, and more than 50% succeeded only on 
the second attempt. With my solution, the task succeeds on the first attempt 
every time (unless all connections are offline). This can be an advantage when 
the connection is used in the middle of a computation, for example in a 
PythonOperator, because the earlier queries do not have to be re-executed.
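   The difference between the two strategies can be illustrated with a minimal 
sketch. This is not the code from the pull request: `connect_first_available`, 
`AllConnectionsFailed`, `fake_connect` and the host names are hypothetical 
names chosen for the example, and the real hook may detect a dead connection 
differently. The sketch only shows the idea of trying the configured 
connections in order and returning the first one that works.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllConnectionsFailed(Exception):
    """Raised when every configured connection has been tried and failed."""


def connect_first_available(uris: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each URI in the order given and return the first working client.

    `connect` is a caller-supplied factory (an assumption of this sketch)
    that raises OSError, or a subclass such as ConnectionError, when the
    target is unreachable.
    """
    errors: List[Tuple[str, OSError]] = []
    for uri in uris:
        try:
            return connect(uri)
        except OSError as exc:
            # Remember the failure and fall through to the next connection.
            errors.append((uri, exc))
    raise AllConnectionsFailed(errors)


def fake_connect(uri: str) -> str:
    """Stand-in for a real metastore client factory (hypothetical hosts)."""
    if uri == "thrift://offline-host:9083":
        raise ConnectionRefusedError(uri)
    return "client for " + uri


client = connect_first_available(
    ["thrift://offline-host:9083", "thrift://online-host:9083"],
    fake_connect,
)
print(client)
```

   Because the fallback happens inside a single attempt, the task itself does 
not need to be retried, which is the behaviour described above.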

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
