[GitHub] [airflow] pateash commented on a change in pull request #15574: #12401 - Duplicating connections from UI

2021-05-10 Thread GitBox


pateash commented on a change in pull request #15574:
URL: https://github.com/apache/airflow/pull/15574#discussion_r629893010



##
File path: tests/www/views/test_views_connection.py
##
@@ -54,3 +54,46 @@ def test_prefill_form_null_extra():
 
     cmv = ConnectionModelView()
     cmv.prefill_form(form=mock_form, pk=1)
+
+
+def test_duplicate_connection(admin_client):
+    """Test Duplicate multiple connection with suffix"""
+    conn1 = Connection(
+        conn_id='test_duplicate_gcp_connection',
+        conn_type='Google Cloud',
+        description='Google Cloud Connection',
+    )
+    conn2 = Connection(
+        conn_id='test_duplicate_mysql_connection',
+        conn_type='FTP',
+        description='MongoDB2',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    conn3 = Connection(
+        conn_id='test_duplicate_postgres_connection_copy1',
+        conn_type='FTP',
+        description='Postgres',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    with create_session() as session:
+        session.query(Connection).delete()
+        session.add_all([conn1, conn2, conn3])
+        session.commit()
+
+    mock_form = mock.Mock()
+    mock_form.data = {"action": "mulduplicate", "rowid": [conn1.id, conn3.id]}
+    resp = admin_client.post('/connection/action_post', data=mock_form.data, follow_redirects=True)

Review comment:
   agree




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #15574: #12401 - Duplicating connections from UI

2021-05-10 Thread GitBox


uranusjr commented on a change in pull request #15574:
URL: https://github.com/apache/airflow/pull/15574#discussion_r629886591



##
File path: tests/www/views/test_views_connection.py
##
@@ -54,3 +54,46 @@ def test_prefill_form_null_extra():
 
     cmv = ConnectionModelView()
     cmv.prefill_form(form=mock_form, pk=1)
+
+
+def test_duplicate_connection(admin_client):
+    """Test Duplicate multiple connection with suffix"""
+    conn1 = Connection(
+        conn_id='test_duplicate_gcp_connection',
+        conn_type='Google Cloud',
+        description='Google Cloud Connection',
+    )
+    conn2 = Connection(
+        conn_id='test_duplicate_mysql_connection',
+        conn_type='FTP',
+        description='MongoDB2',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    conn3 = Connection(
+        conn_id='test_duplicate_postgres_connection_copy1',
+        conn_type='FTP',
+        description='Postgres',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    with create_session() as session:
+        session.query(Connection).delete()
+        session.add_all([conn1, conn2, conn3])
+        session.commit()
+
+    mock_form = mock.Mock()
+    mock_form.data = {"action": "mulduplicate", "rowid": [conn1.id, conn3.id]}
+    resp = admin_client.post('/connection/action_post', data=mock_form.data, follow_redirects=True)

Review comment:
   Is `mock_form` needed? It doesn’t seem to be used anywhere.
   
   It seems only this is needed:
   
   ```python
   data = {"action": "mulduplicate", "rowid": [conn1.id, conn3.id]}
   resp = admin_client.post('/connection/action_post', data=data, follow_redirects=True)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] uranusjr commented on a change in pull request #15574: #12401 - Duplicating connections from UI

2021-05-10 Thread GitBox


uranusjr commented on a change in pull request #15574:
URL: https://github.com/apache/airflow/pull/15574#discussion_r629885608



##
File path: tests/www/views/test_views_connection.py
##
@@ -54,3 +54,46 @@ def test_prefill_form_null_extra():
 
     cmv = ConnectionModelView()
     cmv.prefill_form(form=mock_form, pk=1)
+
+
+def test_duplicate_connection(admin_client):
+    """Test Duplicate multiple connection with suffix"""
+    conn1 = Connection(
+        conn_id='test_duplicate_gcp_connection',
+        conn_type='Google Cloud',
+        description='Google Cloud Connection',
+    )
+    conn2 = Connection(
+        conn_id='test_duplicate_mysql_connection',
+        conn_type='FTP',
+        description='MongoDB2',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    conn3 = Connection(
+        conn_id='test_duplicate_postgres_connection_copy1',
+        conn_type='FTP',
+        description='Postgres',
+        host='localhost',
+        schema='airflow',
+        port=3306,
+    )
+    with create_session() as session:
+        session.query(Connection).delete()
+        session.add_all([conn1, conn2, conn3])
+        session.commit()

Review comment:
   This part above should be a fixture since it's test setup, not the actual test.
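   
   For illustration, a minimal sketch of the kind of fixture meant here (names and scope are assumptions, not this PR's actual code):
   
   ```python
   import pytest
   
   from airflow.models import Connection
   from airflow.utils.session import create_session
   
   
   @pytest.fixture
   def duplicate_connections():
       """Set up the connections under test and clean them up afterwards."""
       conns = [
           Connection(conn_id='test_duplicate_gcp_connection', conn_type='Google Cloud'),
           Connection(conn_id='test_duplicate_postgres_connection_copy1', conn_type='FTP'),
       ]
       with create_session() as session:
           session.query(Connection).delete()
           session.add_all(conns)
           session.commit()
       yield conns
       with create_session() as session:
           session.query(Connection).delete()
   ```
   
   The test itself would then only exercise the duplicate action.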




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edwinkong commented on issue #15773: [SambaHook] Can not extract domain from the username

2021-05-10 Thread GitBox


edwinkong commented on issue #15773:
URL: https://github.com/apache/airflow/issues/15773#issuecomment-837911288


   I am sorry for not checking the upstream before opening an issue here. The upstream project `pysmbclient` was last updated 9 years ago, so it is very likely obsolete. The `NT_STATUS_BAD_NETWORK_NAME` error was raised by a faulty `subprocess` call:
   ```
   smbclient [SHARE] -I [IP] -A [AUTH_FILE] -c ls "[PATH]"
   ```
   The double quotes do not include the `ls`, so the command fails.
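   Presumably the intended quoting wraps the whole `-c` argument, something like this (a guess at what pysmbclient meant to generate, not a verified fix):
   ```
   smbclient [SHARE] -I [IP] -A [AUTH_FILE] -c 'ls "[PATH]"'
   ```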
   
   I will create my own Samba hook instead. Moreover, `SambaHook` cannot even list a directory, so a deprecation warning should probably be displayed when the package is installed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] pateash commented on pull request #15574: #12401 - Duplicating connections from UI

2021-05-10 Thread GitBox


pateash commented on pull request #15574:
URL: https://github.com/apache/airflow/pull/15574#issuecomment-837897954


   @ryanahamilton @ashb 
   Could you please review and let me know if any more changes are required?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] pateash commented on issue #12401: Duplicate connections UI

2021-05-10 Thread GitBox


pateash commented on issue #12401:
URL: https://github.com/apache/airflow/issues/12401#issuecomment-837896853


   Implemented multi-select and duplication.
   
   
![](https://user-images.githubusercontent.com/16856802/116422002-f5d88f00-a85c-11eb-99d5-e9648a6b8368.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #15773: [SambaHook] Can not extract domain from the username

2021-05-10 Thread GitBox


boring-cyborg[bot] commented on issue #15773:
URL: https://github.com/apache/airflow/issues/15773#issuecomment-837807022


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] edwinkong opened a new issue #15773: [SambaHook] Can not extract domain from the username

2021-05-10 Thread GitBox


edwinkong opened a new issue #15773:
URL: https://github.com/apache/airflow/issues/15773


   
   
   
   
   **Apache Airflow version**:
   2.0.1
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**: docker-compose + custom built docker image
   
   **What happened**:
   
   The `airflow.providers.samba.hooks.samba.SambaHook` does not accept a `domain` argument, and the domain automatically defaults to the server IP address in the upstream `pysmbclient`.
   If the DOMAIN/USERNAME format is supplied, it fails with an `NT_STATUS_BAD_NETWORK_NAME` error. If only the USERNAME is supplied, the login fails with an `NT_STATUS_LOGON_FAILURE` error.
   
   **What you expected to happen**:
   The domain should be extracted from the username when the user supplies `SambaHook.conn.login` in the DOMAIN/USERNAME format.
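   A minimal sketch of the kind of extraction being requested (illustrative only, not the provider's actual code):
   
   ```python
   def split_domain(login: str):
       """Split 'DOMAIN/USERNAME' into (domain, username); domain is None if absent."""
       domain, sep, username = login.rpartition('/')
       return (domain if sep else None), username
   
   # split_domain('MYDOMAIN/alice') -> ('MYDOMAIN', 'alice')
   # split_domain('alice')          -> (None, 'alice')
   ```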
   
   **How to reproduce it**:
   Connect to a Samba share that uses domain authentication with `SambaHook`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] vikramcse commented on issue #15524: Improve test coverage of task_command

2021-05-10 Thread GitBox


vikramcse commented on issue #15524:
URL: https://github.com/apache/airflow/issues/15524#issuecomment-837783667


   Raised a PR to improve the test coverage of the task_command.py file:
   https://github.com/apache/airflow/pull/15760


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish edited a comment on pull request #14521: Add Asana Provider

2021-05-10 Thread GitBox


dstandish edited a comment on pull request #14521:
URL: https://github.com/apache/airflow/pull/14521#issuecomment-837774480


   It seems that the recent change to `get_all_changes_for_package` in docs generation may not handle the "new provider" case?
   
   ```
   File "/opt/airflow/dev/provider_packages/prepare_provider_packages.py", line 1464, in get_all_changes_for_package
     return True, array_of_changes[0], changes_table
   IndexError: list index out of range
   ```
   
   
https://github.com/apache/airflow/pull/14521/checks?check_run_id=2551909965#step:6:894
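   
   If so, a guard along these lines might cover a provider with no released changes yet (a sketch only, assuming the surrounding `(success, latest_change, changes_table)` return convention):
   
   ```python
   if not array_of_changes:
       # brand-new provider: no previous release, hence no "latest change"
       return False, None, changes_table
   return True, array_of_changes[0], changes_table
   ```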


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish commented on pull request #14521: Add Asana Provider

2021-05-10 Thread GitBox


dstandish commented on pull request #14521:
URL: https://github.com/apache/airflow/pull/14521#issuecomment-837774480


   It seems that the recent change to `get_all_changes_for_package` in docs generation may not handle the "new provider" case?
   
   ```
   File "/opt/airflow/dev/provider_packages/prepare_provider_packages.py", line 1464, in get_all_changes_for_package
     return True, array_of_changes[0], changes_table
   IndexError: list index out of range
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish opened a new pull request #15772: Fix OdbcHook handling of port

2021-05-10 Thread GitBox


dstandish opened a new pull request #15772:
URL: https://github.com/apache/airflow/pull/15772


   When a port was set on the connection, it was not added to the connection string. This PR fixes that.
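   
   For illustration, a sketch of the intended behavior (not the PR's actual diff; `server_component` is a hypothetical helper):
   
   ```python
   from typing import Optional
   
   def server_component(host: str, port: Optional[int]) -> str:
       # common MS SQL ODBC convention: the port is appended as SERVER=host,port
       return f'SERVER={host},{port}' if port else f'SERVER={host}'
   ```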
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on a change in pull request #15726: Add more integration tests to the kubernetes executor

2021-05-10 Thread GitBox


ephraimbuddy commented on a change in pull request #15726:
URL: https://github.com/apache/airflow/pull/15726#discussion_r629804088



##
File path: kubernetes_tests/test_kubernetes_executor.py
##
@@ -266,3 +266,65 @@ def test_integration_run_dag_with_scheduler_failure(self):
         )
 
         assert self._num_pods_in_namespace('test-namespace') == 0, "failed to delete pods in other namespace"
+
+    def test_integration_run_dag_with_running_task_pod_kill(self):

Review comment:
   Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] EunSeop commented on issue #15196: ExternalTaskSensor not working python3.6 & airflow2.0.1

2021-05-10 Thread GitBox


EunSeop commented on issue #15196:
URL: https://github.com/apache/airflow/issues/15196#issuecomment-837608306


   @DenisOgr I never saw that message. Airflow 2.0 does not perfectly support Python 3.6, so I decided to upgrade the Python version using pyenv. Please check your Airflow and Python versions.
   Also, to make the ExternalTaskSensor run: if we use `schedule_interval='@weekly'`, then `start_date` should be more than 7 days in the past for the runs to start immediately. Alternatively, the Airflow dags CLI can be used to trigger the 2 DAGs at the same time.
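   
   For example, a sketch of the timing constraint (dates are illustrative):
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
   
   # With @weekly, the first run covers [start_date, start_date + 7 days) and is
   # only scheduled once that interval has fully passed, so a start_date more than
   # 7 days in the past lets the DAG run immediately after it is unpaused.
   dag = DAG(
       dag_id='weekly_example',
       schedule_interval='@weekly',
       start_date=datetime(2021, 4, 1),
   )
   ```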


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #15602: Update KylinHook docstring

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #15602:
URL: https://github.com/apache/airflow/pull/15602#discussion_r629777234



##
File path: airflow/providers/apache/kylin/hooks/kylin.py
##
@@ -26,6 +26,8 @@
 
 class KylinHook(BaseHook):
     """
+    Interact with Kylin to run CubeSource commands and get job status.
+
     :param kylin_conn_id: The connection id as configured in Airflow administration.

Review comment:
   ```suggestion
   Interact with Kylin to run CubeSource commands and get job status.
   
   :param kylin_conn_id: The connection id as configured in Airflow administration.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #15719: Refactor tests/www/test_views.py (Part 2/2)

2021-05-10 Thread GitBox


kaxil commented on pull request #15719:
URL: https://github.com/apache/airflow/pull/15719#issuecomment-837578992


   @uranusjr There are some conflicts -- can you fix them, please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #15695: Add integration test for other executors in kubernetes cluster

2021-05-10 Thread GitBox


kaxil commented on pull request #15695:
URL: https://github.com/apache/airflow/pull/15695#issuecomment-837577849


   @ephraimbuddy We should now also integrate this into CI to run the Helm Chart integration tests with the Celery and Local Executors too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ferruzzi commented on pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


ferruzzi commented on pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#issuecomment-837577270


   I'm not a Committer so I can't merge it, but other than my comments about style consistency - which is admittedly subjective - it looks good to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (34cf525 -> cbc3cb8)

2021-05-10 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 34cf525  Clean up unnecessary Airflow config in helm chart (#15729)
 add cbc3cb8  Add integration test for other executors in kubernetes 
cluster (#15695)

No new revisions were added by this update.

Summary of changes:
 TESTING.rst|  23 +++
 .../{test_kubernetes_executor.py => test_base.py}  |  69 +--
 kubernetes_tests/test_kubernetes_executor.py   | 206 ++---
 kubernetes_tests/test_other_executors.py   |  89 +
 4 files changed, 129 insertions(+), 258 deletions(-)
 copy kubernetes_tests/{test_kubernetes_executor.py => test_base.py} (80%)
 create mode 100644 kubernetes_tests/test_other_executors.py


[GitHub] [airflow] kaxil merged pull request #15695: Add integration test for other executors in kubernetes cluster

2021-05-10 Thread GitBox


kaxil merged pull request #15695:
URL: https://github.com/apache/airflow/pull/15695


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #15726: Add more integration tests to the kubernetes executor

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #15726:
URL: https://github.com/apache/airflow/pull/15726#discussion_r629773272



##
File path: kubernetes_tests/test_kubernetes_executor.py
##
@@ -266,3 +266,65 @@ def test_integration_run_dag_with_scheduler_failure(self):
         )
 
         assert self._num_pods_in_namespace('test-namespace') == 0, "failed to delete pods in other namespace"
+
+    def test_integration_run_dag_with_running_task_pod_kill(self):

Review comment:
   Can you add comments / docstrings to say what this test is actually testing, please?

##
File path: kubernetes_tests/test_kubernetes_executor.py
##
@@ -266,3 +266,65 @@ def test_integration_run_dag_with_scheduler_failure(self):
         )
 
         assert self._num_pods_in_namespace('test-namespace') == 0, "failed to delete pods in other namespace"
+
+    def test_integration_run_dag_with_running_task_pod_kill(self):
+        host = KUBERNETES_HOST_PORT
+        dag_id = 'example_kubernetes_executor_config'
+        pod_name = 'examplekubernetesexecutorconfigstarttask'
+        dag_run_id, execution_date = self.start_dag(dag_id=dag_id, host=host)
+        self._delete_task_pod(pod_name)
+        self.monitor_task(
+            host=host,
+            dag_run_id=dag_run_id,
+            dag_id=dag_id,
+            task_id='start_task',
+            expected_final_state='failed',
+            timeout=300,
+        )
+        self.monitor_task(
+            host=host,
+            dag_run_id=dag_run_id,
+            dag_id=dag_id,
+            task_id='other_namespace_task',
+            expected_final_state='upstream_failed',
+            timeout=300,
+        )
+        self.ensure_dag_expected_state(
+            host=host,
+            execution_date=execution_date,
+            dag_id=dag_id,
+            expected_final_state='failed',
+            timeout=300,
+        )
+        assert self._num_pods_in_namespace('test-namespace') == 0, "failed to delete pods in other namespace"
+
+    def test_integration_run_dag_with_container_creating_task_pod_kill(self):

Review comment:
   same: Can you add comments / docstrings to say what this test is 
actually testing, please




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629762975



##
File path: airflow/providers/amazon/aws/operators/s3_list.py
##
@@ -65,7 +67,7 @@ class S3ListOperator(BaseOperator):
 )
 """
 
-    template_fields: Iterable[str] = ('bucket', 'prefix', 'delimiter')
+    template_fields: Iterable[str] = ('bucket', 'prefix', 'delimiter' , 'aws_conn_id')

Review comment:
   Well, it was committed by someone else before, so I can't comment on that, but I agree we should standardize type hinting. Maybe I can work on that as part of a separate PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (3edcd12 -> 34cf525)

2021-05-10 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 3edcd12  Update croniter to 1.0.x series (#15769)
 add 34cf525  Clean up unnecessary Airflow config in helm chart (#15729)

No new revisions were added by this update.

Summary of changes:
 chart/values.yaml | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)


[GitHub] [airflow] kaxil merged pull request #15729: Clean up unnecessary Airflow config in helm chart

2021-05-10 Thread GitBox


kaxil merged pull request #15729:
URL: https://github.com/apache/airflow/pull/15729


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629770582



##
File path: airflow/providers/amazon/aws/operators/s3_delete_objects.py
##
@@ -45,6 +45,8 @@ class S3DeleteObjectsOperator(BaseOperator):
 :type prefix: str
 :param aws_conn_id: Connection id of the S3 connection to use
 :type aws_conn_id: str
+    (Adding `aws_conn_id` param to template_fields so that it can be overriden using jinja template from Dags
+     This feature can be useful if user wants to update/override the aws_conn_id for some kind of Dag Isolation etc)

Review comment:
   yes you are right , i forgotten in first commit then later realized and 
added as part of the second commit..

##
File path: airflow/providers/amazon/aws/operators/emr_create_job_flow.py
##
@@ -40,7 +42,7 @@ class EmrCreateJobFlowOperator(BaseOperator):
 :type region_name: Optional[str]
 """
 
-    template_fields = ['job_flow_overrides']
+    template_fields = ['job_flow_overrides' , 'aws_conn_id']

Review comment:
   fixed and commited back.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (25caeda -> 3edcd12)

2021-05-10 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 25caeda  Remove unused dependency (#15762)
 add 3edcd12  Update croniter to 1.0.x series (#15769)

No new revisions were added by this update.

Summary of changes:
 setup.cfg | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[GitHub] [airflow] kaxil merged pull request #15769: Update croniter to 1.0.x series

2021-05-10 Thread GitBox


kaxil merged pull request #15769:
URL: https://github.com/apache/airflow/pull/15769


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629762975



##
File path: airflow/providers/amazon/aws/operators/s3_list.py
##
@@ -65,7 +67,7 @@ class S3ListOperator(BaseOperator):
 )
 """
 
-    template_fields: Iterable[str] = ('bucket', 'prefix', 'delimiter')
+    template_fields: Iterable[str] = ('bucket', 'prefix', 'delimiter' , 'aws_conn_id')

Review comment:
   Well, it was done by someone else before, so I can't comment on that, but I agree we should standardize type hinting. Maybe I can work on that as part of a separate PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629762502



##
File path: airflow/providers/amazon/aws/operators/s3_bucket.py
##
@@ -37,12 +37,16 @@ class S3CreateBucketOperator(BaseOperator):
 running Airflow in a distributed manner and aws_conn_id is None or
 empty, then default boto3 configuration would be used (and must be
 maintained on each worker node).
+

Review comment:
   It's a good idea to give some info as part of the comments so that users are aware of why `aws_conn_id` was added to `template_fields` and what other benefits it could have.
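   
   For example, a hedged sketch of the kind of per-DAG override this enables (the connection id pattern is made up):
   
   ```python
   from airflow.providers.amazon.aws.operators.s3_bucket import S3CreateBucketOperator
   
   # assuming aws_conn_id is in template_fields, it is rendered at runtime
   create_bucket = S3CreateBucketOperator(
       task_id='create_bucket',
       bucket_name='my-team-bucket',
       aws_conn_id='aws_{{ dag.dag_id }}',  # hypothetical per-DAG connection naming
   )
   ```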




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629761961



##
File path: airflow/providers/amazon/aws/operators/cloud_formation.py
##
@@ -67,9 +67,11 @@ class CloudFormationDeleteStackOperator(BaseOperator):
 :type params: dict
 :param aws_conn_id: aws connection to uses
 :type aws_conn_id: str
+     (Adding this param to template_fields so that it can be overriden using jinja template from Dags
+     This feature can be useful if user wants to update/override the aws_conn_id for some kind of Dag Isolation etc)
     """
 
-    template_fields: List[str] = ['stack_name']
+    template_fields: List[str] = ['stack_name' , 'aws_conn_id']

Review comment:
   We can add it, but I wanted to make sure this PR gets merged, as it covers most of the popular operators. I can open another PR for the remaining, less-used operators.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629760514



##
File path: airflow/providers/amazon/aws/operators/emr_modify_cluster.py
##
@@ -32,6 +32,8 @@ class EmrModifyClusterOperator(BaseOperator):
 :type step_concurrency_level: int
 :param aws_conn_id: aws connection to uses
 :type aws_conn_id: str
+    (Adding `aws_conn_id` param to template_fields so that it can be overriden using jinja template from Dags
+     This feature can be useful if user wants to update/override the aws_conn_id for some kind of Dag Isolation etc)

Review comment:
   It's added.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] khilawar4 commented on a change in pull request #15727: templatize_awsconnectionid

2021-05-10 Thread GitBox


khilawar4 commented on a change in pull request #15727:
URL: https://github.com/apache/airflow/pull/15727#discussion_r629760349



##
File path: airflow/providers/amazon/aws/operators/cloud_formation.py
##
@@ -67,9 +67,11 @@ class CloudFormationDeleteStackOperator(BaseOperator):
 :type params: dict
 :param aws_conn_id: aws connection to uses
 :type aws_conn_id: str
+     (Adding this param to template_fields so that it can be overriden using jinja template from Dags
+     This feature can be useful if user wants to update/override the aws_conn_id for some kind of Dag Isolation etc)
     """
 
-    template_fields: List[str] = ['stack_name']
+    template_fields: List[str] = ['stack_name' , 'aws_conn_id']

Review comment:
   It's added.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #13832: removing try-catch block to fix timeout exception getting ignored in aws batch operator

2021-05-10 Thread GitBox


github-actions[bot] commented on pull request #13832:
URL: https://github.com/apache/airflow/pull/13832#issuecomment-837524537


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #15012: PostgreSQL operator returns query result

2021-05-10 Thread GitBox


github-actions[bot] commented on pull request #15012:
URL: https://github.com/apache/airflow/pull/15012#issuecomment-837524479


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed in 5 days if no further activity occurs. 
Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil opened a new issue #15771: Automate docs for `values.yaml` via pre-commit config and break them into logical groups

2021-05-10 Thread GitBox


kaxil opened a new issue #15771:
URL: https://github.com/apache/airflow/issues/15771


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Smithx10 commented on issue #15724: mysql cluster import_error waits for a lock

2021-05-10 Thread GitBox


Smithx10 commented on issue #15724:
URL: https://github.com/apache/airflow/issues/15724#issuecomment-837479909


   @potiuk,
   
   I made a bit of progress today. I forked the percona-healthcheck script, ported it to Go, and added Raft. It still needs to be cleaned up (logging, config, documentation), but we will be testing it this week to see how it works.
   
   In summary, this health checker runs on port 9200 and only tells HAProxy to route to a synced MySQL node that is also the Raft-elected leader. If a MySQL node is unhealthy, the Raft cluster leader will initiate a transfer. This should ensure we don't write to non-synced nodes or to more than one node at a time.
   
   I attempted using stick-tables, backup, and a few other HAProxy settings, but ran into an issue where I would need to use a VIP so that the routing table would be the same across nodes and we would always route the same way.
   
   
https://github.com/olafz/percona-clustercheck/compare/master...Smithx10:golang-raft?expand=1
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil closed issue #14302: Helm Chart: Review Architecture

2021-05-10 Thread GitBox


kaxil closed issue #14302:
URL: https://github.com/apache/airflow/issues/14302


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish opened a new issue #15770: google provider requires leveldb in host

2021-05-10 Thread GitBox


dstandish opened a new issue #15770:
URL: https://github.com/apache/airflow/issues/15770


   The google provider now requires plyvel, which is used by the leveldb hook.
   
   To install plyvel, the user needs the leveldb headers installed on the system; otherwise the install will fail:
   
   ```
 plyvel/_plyvel.cpp:632:10: fatal error: 'leveldb/db.h' file not found
 #include "leveldb/db.h"
  ^~
 1 error generated.
 error: command 'gcc' failed with exit status 1
 
 ERROR: Failed building wheel for plyvel
   ```
   
   The average GCP user may not use leveldb, and it seems a little burdensome to require its installation just to use GCP hooks.
   
   Perhaps there is a way to make this dependency optional?
   
   Or perhaps there is a way to install only certain components of a provider?
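   
   For reference, the usual pattern for a soft dependency is an import guard in the hook module (a sketch, not the provider's current code):
   
   ```python
   try:
       import plyvel
   except ImportError:
       plyvel = None  # leveldb support is disabled unless plyvel is installed
   
   def _require_plyvel():
       if plyvel is None:
           raise ImportError('plyvel is required for the LevelDB hook: pip install plyvel')
   ```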
   
   cc: @mik-laj 
   related: https://github.com/apache/airflow/pull/14105


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629703321



##
File path: 
airflow/migrations/versions/83f031fd9f1c_improve_mssql_compatibility.py
##
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""improve mssql compatibility
+
+Revision ID: 83f031fd9f1c
+Revises: a13f7613ad25
+Create Date: 2021-04-06 12:22:02.197726
+
+"""
+
+from collections import defaultdict
+
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy.dialects import mssql
+
+# revision identifiers, used by Alembic.
+revision = '83f031fd9f1c'
+down_revision = 'a13f7613ad25'
+branch_labels = None
+depends_on = None
+
+
+def is_table_empty(conn, table_name):
+"""
+This function checks if the mssql table is empty
+:param conn: sql connection object

Review comment:
   ```suggestion
   This function checks if the MS SQL table is empty
   
   :param conn: SQL connection object
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629701355



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.
+    :param conn: sql connection object
+    :param table_name: table name
+    :return: a dictionary of ((constraint name, constraint type), column name) of table
+    :rtype: defaultdict(list)
+    """

Review comment:
   ```suggestion
   This function return primary and unique constraint
   along with column name. Some tables like `task_instance`
   is missing the primary key constraint name and the name is
   auto-generated by the SQL server. so this function helps to
   retrieve any primary or unique constraint name.
   
   :param conn: sql connection object
   :param table_name: table name
   :return: a dictionary of ((constraint name, constraint type), column name) of table
   :rtype: defaultdict(list)
   """
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629692470



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.
+    :param conn: sql connection object

Review comment:
   ```suggestion
   retrieve any primary or unique constraint name.
   
   :param conn: sql connection object
   ```

##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.

Review comment:
   ```suggestion
   This function return primary and unique constraint
   along with column name. Some tables like `task_instance`
   is missing the primary key constraint name and the name is
   auto-generated by the SQL server. so this function helps to
   retrieve any primary or unique constraint name.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629695358



##
File path: 
airflow/migrations/versions/e9304a3141f0_make_xcom_pkey_columns_non_nullable.py
##
@@ -0,0 +1,76 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""make xcom pkey columns non-nullable
+
+Revision ID: e9304a3141f0
+Revises: 83f031fd9f1c
+Create Date: 2021-04-06 13:22:02.197726
+
+"""
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy.dialects import mssql, mysql
+
+from airflow.models.base import COLLATION_ARGS
+
+# revision identifiers, used by Alembic.
+revision = 'e9304a3141f0'
+down_revision = '83f031fd9f1c'
+branch_labels = None
+depends_on = None
+
+
+def _use_date_time2(conn):
+    result = conn.execute(
+        """SELECT CASE WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion'))
+        like '8%' THEN '2000' WHEN CONVERT(VARCHAR(128), SERVERPROPERTY ('productversion'))
+        like '9%' THEN '2005' ELSE '2005Plus' END AS MajorVersion"""
+    ).fetchone()
+    mssql_version = result[0]
+    return mssql_version not in ("2000", "2005")
+
+
+def _get_timestamp(conn):
+    dialect_name = conn.dialect.name
+    if dialect_name == "mssql":
+        return mssql.DATETIME2(precision=6) if _use_date_time2(conn) else mssql.DATETIME
+    elif dialect_name != "mysql":
+        return sa.TIMESTAMP(timezone=True)
+    else:
+        return mysql.TIMESTAMP(fsp=6, timezone=True)

Review comment:
   ```suggestion
   elif dialect_name == "mysql":
   return mysql.TIMESTAMP(fsp=6, timezone=True)
   else:
   return sa.TIMESTAMP(timezone=True)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #14105: Add Google leveldb hook (#13109)

2021-05-10 Thread GitBox


mik-laj commented on pull request #14105:
URL: https://github.com/apache/airflow/pull/14105#issuecomment-837368681


   @dstandish can you create an issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629693212



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.
+    :param conn: sql connection object
+    :param table_name: table name
+    :return: a dictionary of ((constraint name, constraint type), column name) of table
+    :rtype: defaultdict(list)
+    """
+    query = """SELECT tc.CONSTRAINT_NAME , tc.CONSTRAINT_TYPE, ccu.COLUMN_NAME
+     FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
+     JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS ccu ON ccu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
+     WHERE tc.TABLE_NAME = '{table_name}' AND
+     (tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or UPPER(tc.CONSTRAINT_TYPE) = 'UNIQUE')
+    """.format(
+        table_name=table_name
+    )
+    result = conn.execute(query).fetchall()
+    constraint_dict = defaultdict(list)
+    for constraint, constraint_type, column in result:
+        constraint_dict[(constraint, constraint_type)].append(column)
+    return constraint_dict
+
+
+def drop_column_constraints(operator, column_name, constraint_dict):
+    """
+    Drop a primary key or unique constraint
+    :param operator: batch_alter_table for the table
+    :param constraint_dict: a dictionary of ((constraint name, constraint type), column name) of table
+    """
+    for constraint, columns in constraint_dict.items():
+        if column_name in columns:
+            if constraint[1].lower().startswith("primary"):
+                operator.drop_constraint(constraint[0], type_='primary')
+            elif constraint[1].lower().startswith("unique"):
+                operator.drop_constraint(constraint[0], type_='unique')
+
+
+def create_constraints(operator, column_name, constraint_dict):
+    """
+    Create a primary key or unique constraint
+    :param operator: batch_alter_table for the table

Review comment:
   ```suggestion
   Create a primary key or unique constraint
   
   :param operator: batch_alter_table for the table
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629693129



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.
+    :param conn: sql connection object
+    :param table_name: table name
+    :return: a dictionary of ((constraint name, constraint type), column name) of table
+    :rtype: defaultdict(list)
+    """
+    query = """SELECT tc.CONSTRAINT_NAME , tc.CONSTRAINT_TYPE, ccu.COLUMN_NAME
+     FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
+     JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS ccu ON ccu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
+     WHERE tc.TABLE_NAME = '{table_name}' AND
+     (tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or UPPER(tc.CONSTRAINT_TYPE) = 'UNIQUE')
+    """.format(
+        table_name=table_name
+    )
+    result = conn.execute(query).fetchall()
+    constraint_dict = defaultdict(list)
+    for constraint, constraint_type, column in result:
+        constraint_dict[(constraint, constraint_type)].append(column)
+    return constraint_dict
+
+
+def drop_column_constraints(operator, column_name, constraint_dict):
+    """
+    Drop a primary key or unique constraint
+    :param operator: batch_alter_table for the table

Review comment:
   ```suggestion
   Drop a primary key or unique constraint
   
   :param operator: batch_alter_table for the table
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629693037



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.

Review comment:
   ```suggestion
   This function return primary and unique constraint
   along with column name. Some tables like `task_instance`
   is missing the primary key constraint name and the name is
   auto-generated by the SQL server. so this function helps to
   retrieve any primary or unique constraint name.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629692470



##
File path: 
airflow/migrations/versions/bbf4a7ad0465_remove_id_column_from_xcom.py
##
@@ -35,6 +37,61 @@
 depends_on = None
 
 
+def get_table_constraints(conn, table_name):
+    """
+    This function return primary and unique constraint
+    along with column name. some tables like task_instance
+    is missing primary key constraint name and the name is
+    auto-generated by sql server. so this function helps to
+    retrieve any primary or unique constraint name.
+    :param conn: sql connection object

Review comment:
   ```suggestion
   retrieve any primary or unique constraint name.
   
   :param conn: sql connection object
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629692289



##
File path: 
airflow/migrations/versions/98271e7606e2_add_scheduling_decision_to_dagrun_and_.py
##
@@ -35,12 +35,32 @@
 depends_on = None
 
 
+def __use_date_time2(conn):

Review comment:
   This is still unresolved 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629691901



##
File path: 
airflow/migrations/versions/83f031fd9f1c_improve_mssql_compatibility.py
##
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""improve mssql compatibility
+
+Revision ID: 83f031fd9f1c
+Revises: a13f7613ad25
+Create Date: 2021-04-06 12:22:02.197726
+
+"""
+
+from collections import defaultdict
+
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy.dialects import mssql
+
+# revision identifiers, used by Alembic.
+revision = '83f031fd9f1c'
+down_revision = 'a13f7613ad25'
+branch_labels = None
+depends_on = None
+
+
+def is_table_empty(conn, table_name):
+"""
+This function checks if the mssql table is empty
+:param conn: sql connection object
+:param table_name: table name
+:return: Booelan indicating if the table is present
+"""
+return conn.execute(f'select TOP 1 * from {table_name}').first() is None
+
+
+def get_table_constraints(conn, table_name):
+"""
+This function return primary and unique constraint
+along with column name. some tables like task_instance
+is missing primary key constraint name and the name is
+auto-generated by sql server. so this function helps to
+retrieve any primary or unique constraint name.
+
+:param conn: sql connection object
+:param table_name: table name
+:return: a dictionary of ((constraint name, constraint type), column name) 
of table
+:rtype: defaultdict(list)
+"""
+query = """SELECT tc.CONSTRAINT_NAME , tc.CONSTRAINT_TYPE, ccu.COLUMN_NAME
+ FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
+ JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS ccu ON 
ccu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
+ WHERE tc.TABLE_NAME = '{table_name}' AND
+ (tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or UPPER(tc.CONSTRAINT_TYPE) = 
'UNIQUE')
+""".format(
+table_name=table_name
+)
+result = conn.execute(query).fetchall()
+constraint_dict = defaultdict(list)
+for constraint, constraint_type, column in result:
+constraint_dict[(constraint, constraint_type)].append(column)
+return constraint_dict
+
+
+def drop_column_constraints(operator, column_name, constraint_dict):
+"""
+Drop a primary key or unique constraint
+
+:param operator: batch_alter_table for the table
+:param constraint_dict: a dictionary of ((constraint name, constraint 
type), column name) of table
+"""
+for constraint, columns in constraint_dict.items():
+if column_name in columns:
+if constraint[1].lower().startswith("primary"):
+operator.drop_constraint(constraint[0], type_='primary')
+elif constraint[1].lower().startswith("unique"):
+operator.drop_constraint(constraint[0], type_='unique')
+
+
+def create_constraints(operator, column_name, constraint_dict):
+"""
+Create a primary key or unique constraint
+
+:param operator: batch_alter_table for the table
+:param constraint_dict: a dictionary of ((constraint name, constraint 
type), column name) of table
+"""
+for constraint, columns in constraint_dict.items():
+if column_name in columns:
+if constraint[1].lower().startswith("primary"):
+operator.create_primary_key(constraint_name=constraint[0], 
columns=columns)
+elif constraint[1].lower().startswith("unique"):
+
operator.create_unique_constraint(constraint_name=constraint[0], 
columns=columns)
+
+
+def _use_date_time2(conn):
+result = conn.execute(
+"""SELECT CASE WHEN CONVERT(VARCHAR(128), SERVERPROPERTY 
('productversion'))
+like '8%' THEN '2000' WHEN CONVERT(VARCHAR(128), SERVERPROPERTY 
('productversion'))
+like '9%' THEN '2005' ELSE '2005Plus' END AS MajorVersion"""
+).fetchone()
+mssql_version = result[0]
+return mssql_version not in ("2000", "2005")
+
+
+def _is_timestamp(conn, table_name, column_name):
+query = f"""SELECT
+TYPE_NAME(C.USER_TYPE_ID) AS DATA_TYPE
+FROM SYS.COLUMNS C
+JOIN SYS.TYPES T
+ON C.USER_TYPE_ID=T.USER_TYPE_ID
+
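
For context, a minimal sketch of how helpers like these are typically combined inside such a migration (the table, column, and target type below are illustrative assumptions, not code from the PR):

```python
from alembic import op
from sqlalchemy.dialects import mssql


def upgrade():
    conn = op.get_bind()
    # Remember which primary/unique constraints touch the column, drop them,
    # alter the column type, then recreate the same constraints.
    constraint_dict = get_table_constraints(conn, 'task_instance')
    with op.batch_alter_table('task_instance') as batch_op:
        drop_column_constraints(batch_op, 'execution_date', constraint_dict)
        batch_op.alter_column(column_name='execution_date', type_=mssql.DATETIME2(precision=6))
        create_constraints(batch_op, 'execution_date', constraint_dict)
```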

[GitHub] [airflow] kaxil commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629691560



##
File path: airflow/migrations/versions/83f031fd9f1c_improve_mssql_compatibility.py
##
@@ -0,0 +1,256 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""improve mssql compatibility
+
+Revision ID: 83f031fd9f1c
+Revises: a13f7613ad25
+Create Date: 2021-04-06 12:22:02.197726
+
+"""
+
+from collections import defaultdict
+
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy.dialects import mssql
+
+# revision identifiers, used by Alembic.
+revision = '83f031fd9f1c'
+down_revision = 'a13f7613ad25'
+branch_labels = None
+depends_on = None
+
+
+def is_table_empty(conn, table_name):
+    """
+    This function checks if the mssql table is empty
+
+    :param conn: sql connection object
+    :param table_name: table name
+    :return: Boolean indicating if the table is empty
+    """
+    return conn.execute(f'select TOP 1 * from {table_name}').first() is None
+
+
+def get_table_constraints(conn, table_name):
+    """
+    This function returns primary and unique constraints
+    along with the column names. Some tables, like task_instance,
+    are missing a primary key constraint name; the name is
+    auto-generated by SQL Server, so this function helps to
+    retrieve any primary or unique constraint name.
+
+    :param conn: sql connection object
+    :param table_name: table name
+    :return: a dictionary of ((constraint name, constraint type), column name) of table
+    :rtype: defaultdict(list)
+    """
+    query = """SELECT tc.CONSTRAINT_NAME, tc.CONSTRAINT_TYPE, ccu.COLUMN_NAME
+     FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
+     JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS ccu ON ccu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
+     WHERE tc.TABLE_NAME = '{table_name}' AND
+     (tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or UPPER(tc.CONSTRAINT_TYPE) = 'UNIQUE')
+    """.format(
+        table_name=table_name
+    )
+    result = conn.execute(query).fetchall()
+    constraint_dict = defaultdict(list)
+    for constraint, constraint_type, column in result:
+        constraint_dict[(constraint, constraint_type)].append(column)
+    return constraint_dict
+
+
+def drop_column_constraints(operator, column_name, constraint_dict):
+    """
+    Drop a primary key or unique constraint
+
+    :param operator: batch_alter_table for the table
+    :param column_name: column covered by the constraints
+    :param constraint_dict: a dictionary of ((constraint name, constraint type), column name) of table
+    """
+    for constraint, columns in constraint_dict.items():
+        if column_name in columns:
+            if constraint[1].lower().startswith("primary"):
+                operator.drop_constraint(constraint[0], type_='primary')
+            elif constraint[1].lower().startswith("unique"):
+                operator.drop_constraint(constraint[0], type_='unique')
+
+
+def create_constraints(operator, column_name, constraint_dict):
+    """
+    Create a primary key or unique constraint
+
+    :param operator: batch_alter_table for the table
+    :param column_name: column covered by the constraints
+    :param constraint_dict: a dictionary of ((constraint name, constraint type), column name) of table
+    """
+    for constraint, columns in constraint_dict.items():
+        if column_name in columns:
+            if constraint[1].lower().startswith("primary"):
+                operator.create_primary_key(constraint_name=constraint[0], columns=columns)
+            elif constraint[1].lower().startswith("unique"):
+                operator.create_unique_constraint(constraint_name=constraint[0], columns=columns)
+
+
+def _use_date_time2(conn):
+    result = conn.execute(
+        """SELECT CASE WHEN CONVERT(VARCHAR(128), SERVERPROPERTY('productversion'))
+        like '8%' THEN '2000' WHEN CONVERT(VARCHAR(128), SERVERPROPERTY('productversion'))
+        like '9%' THEN '2005' ELSE '2005Plus' END AS MajorVersion"""
+    ).fetchone()
+    mssql_version = result[0]
+    return mssql_version not in ("2000", "2005")
+
+
+def _is_timestamp(conn, table_name, column_name):
+    query = f"""SELECT
+    TYPE_NAME(C.USER_TYPE_ID) AS DATA_TYPE
+    FROM SYS.COLUMNS C
+    JOIN SYS.TYPES T
+    ON C.USER_TYPE_ID=T.USER_TYPE_ID
+

[GitHub] [airflow] dstandish edited a comment on pull request #14105: Add Google leveldb hook (#13109)

2021-05-10 Thread GitBox


dstandish edited a comment on pull request #14105:
URL: https://github.com/apache/airflow/pull/14105#issuecomment-837350318


   with this change it seems leveldb is a required system dependency for the 
google provider. 
   
   it doesn't seem right to require users to install leveldb on their machines 
just to use GCP hooks.  anyone have any thoughts on this?
   
   i noticed because i got a build error when trying to install this provider (in a virtualenv on mac), which installed plyvel, which depends on the leveldb headers
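   
   (As an aside, a common way to avoid the hard dependency, sketched here as an assumption rather than what the provider actually does: guard the import so the hook only fails when it is actually used.)
   
   ```python
   try:
       import plyvel  # needs the leveldb C headers at build time
       HAS_LEVELDB = True
   except ImportError:
       plyvel = None
       HAS_LEVELDB = False
   
   
   class LevelDBHook:
       """Hypothetical hook that fails lazily, only when actually used."""
   
       def get_conn(self, name: str = "/tmp/testdb/"):
           if not HAS_LEVELDB:
               raise ImportError("plyvel is not installed; install it to use this hook")
           return plyvel.DB(name, create_if_missing=True)
   ```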


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish edited a comment on pull request #14105: Add Google leveldb hook (#13109)

2021-05-10 Thread GitBox


dstandish edited a comment on pull request #14105:
URL: https://github.com/apache/airflow/pull/14105#issuecomment-837350318


   with this change it seems leveldb is a required system dependency for the 
google provider. 
   
   it doesn't seem right to require users to install leveldb on their machines 
just to use GCP hooks.  anyone have any thoughts on this?
   
   i noticed because i got a build error (when installing in a virtualenv on mac) when trying to install this provider, which installed plyvel, which depends on the leveldb headers


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] malthe opened a new pull request #15769: Update croniter to 1.0.x series

2021-05-10 Thread GitBox


malthe opened a new pull request #15769:
URL: https://github.com/apache/airflow/pull/15769


   
   
   The [croniter](https://pypi.org/project/croniter/#changelog) project transitioned to SemVer 2 as part of its recent move to the 1.0.x series.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #15755: Save pod name to xcom for KubernetesPodOperator

2021-05-10 Thread GitBox


mik-laj commented on pull request #15755:
URL: https://github.com/apache/airflow/pull/15755#issuecomment-837350803


   Tests failed. Can you take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish commented on pull request #14105: Add Google leveldb hook (#13109)

2021-05-10 Thread GitBox


dstandish commented on pull request #14105:
URL: https://github.com/apache/airflow/pull/14105#issuecomment-837350318


   with this change it seems leveldb is a required system dependency for the 
google provider. 
   
   it doesn't seem right to require users to install leveldb on their machines 
just to use GCP hooks.  anyone have any thoughts on this?
   
   i noticed because i got a build error when trying to install this provider, which installed plyvel, which depends on the leveldb headers


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang edited a comment on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang edited a comment on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable isn't written to the task log file but still prints to the terminal. This only happens when the task is executed on a Unix OS, because it will try to `fork` the process instead of doing `Popen` (`Popen` works fine). The reason is what I described in the PR description. (Just updated the description with more details, so hopefully it's clearer.)
   
   I used this command to test the output, as I believe this is the command 
executed by the executor (please do correct me if I'm wrong)
   
   ```bash
   airflow tasks run --local ...
   ```
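   
   For anyone reproducing this, a minimal sketch of the general redirection pattern under discussion (our own illustration, not the PR's actual code): a file-like wrapper that forwards the forked child's stdout to a logger.
   
   ```python
   import logging
   import os
   import sys
   
   
   class LoggerStream:
       """File-like object that forwards writes to a logger."""
   
       def __init__(self, logger, level=logging.INFO):
           self.logger = logger
           self.level = level
   
       def write(self, message):
           if message.strip():
               self.logger.log(self.level, message.rstrip())
   
       def flush(self):  # required by the file protocol; nothing to buffer
           pass
   
   
   logging.basicConfig(filename="task.log", level=logging.INFO)
   log = logging.getLogger("airflow.task")
   
   pid = os.fork()
   if pid == 0:
       # Child: without this assignment, print() goes to the inherited
       # terminal instead of the task log handler.
       sys.stdout = LoggerStream(log)
       print("captured by the task logger")
       os._exit(0)
   os.waitpid(pid, 0)
   ```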
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #15768: PythonVirtualenvOperator fails with error from pip execution: Can not perform a '--user' install.

2021-05-10 Thread GitBox


boring-cyborg[bot] commented on issue #15768:
URL: https://github.com/apache/airflow/issues/15768#issuecomment-837296054


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] NickPancakes opened a new issue #15768: PythonVirtualenvOperator fails with error from pip execution: Can not perform a '--user' install.

2021-05-10 Thread GitBox


NickPancakes opened a new issue #15768:
URL: https://github.com/apache/airflow/issues/15768


   **Apache Airflow version**: 2.0.2
   
   **Environment**:
   
   - **Hardware configuration**: Macbook Pro 2017
   - **OS**:  macOS X Catalina  10.15.7
   - **Kernel**: Darwin 19.6.0
   - **Others**: Docker 20.10.6, docker-compose 1.29.1
   
   **What happened**:
   
   Running the demo `example_python_operator` dag fails on the 
`virtualenv_python` step. The call to pip via subprocess fails:
   `subprocess.CalledProcessError: Command '['/tmp/venvt3_qnug6/bin/pip', 
'install', 'colorama==0.4.0']' returned non-zero exit status 1.`
   
   The error coming from pip is: `ERROR: Can not perform a '--user' install. 
User site-packages are not visible in this virtualenv.`
   
   **What you expected to happen**:
   
   The call to pip succeeds, and the colorama dependency is installed into the 
virtualenv without attempting to install to user packages. The 
`example_python_operator` dag execution succeeds.
   
   **How to reproduce it**:
   
   Set up airflow 2.0.2 in docker as detailed in the Quickstart guide: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html
   Once running, enable and manually trigger the `example_python_operator` dag via the web UI.
   The dag will fail at the `virtualenv_python` task.
   
   **Anything else we need to know**:
   
   This is not a problem with the Airflow 2.0.1 docker-compose setup. I'm fairly certain it is due to the addition of the `PIP_USER` environment variable being set to `true` in this PR: https://github.com/apache/airflow/pull/14125
   
   My proposed solution would be to prepend `PIP_USER=false` to the 
construction of the call to pip within `utils/python_virtualenv.py` here: 
https://github.com/apache/airflow/blob/25caeda58b50eae6ef425a52e794504bc63855d1/airflow/utils/python_virtualenv.py#L30
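   
   A minimal sketch of that proposal (the helper name here is ours, not the actual function in `python_virtualenv.py`):
   
   ```python
   import os
   import subprocess
   
   
   def install_requirement(venv_pip: str, requirement: str) -> None:
       # Override the image-level PIP_USER=true so pip targets the
       # virtualenv itself instead of attempting a --user install.
       env = {**os.environ, "PIP_USER": "false"}
       subprocess.run([venv_pip, "install", requirement], env=env, check=True)
   
   
   install_requirement("/tmp/venvt3_qnug6/bin/pip", "colorama==0.4.0")
   ```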


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang edited a comment on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang edited a comment on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable isn't written to the task log file but still prints to the terminal. This only happens when the task is executed on a Unix OS, because it will try to `fork` the process instead of doing `Popen` (`Popen` works fine). The reason is what I described in the PR description.
   
   I used this command to test the output, as I believe this is the command 
executed by the executor (please do correct me if I'm wrong)
   
   ```bash
   airflow tasks run --local ...
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang edited a comment on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang edited a comment on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable isn't written to the task log file but still prints to the terminal. This only happens when the task is executed on a Unix OS, because it will try to `fork` the process instead of doing `Popen` (`Popen` works fine).
   
   I used this command to test the output, as I believe this is the command 
executed by the executor (please do correct me if I'm wrong)
   
   ```bash
   airflow tasks run --local ...
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang edited a comment on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang edited a comment on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable cannot be written to the task log file but still prints to the terminal. This only happens when the task is executed on a Unix OS, because it will try to `fork` the process instead of doing `Popen` (`Popen` works fine).
   
   I used this command to test the output, as I believe this is the command 
executed by the executor (please do correct me if I'm wrong)
   
   ```bash
   airflow tasks run --local ...
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang edited a comment on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang edited a comment on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable cannot be written to the task log file but still prints to the terminal. The command that I used to test the output is:
   
   ```bash
   airflow tasks run --local ...
   ```
   Correct me if I'm wrong, but I believe this is the command executed by the executor.
   
   This only happens when the task is executed on a Unix OS, because it will try to `fork` the process.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] malthe commented on a change in pull request #15581: Add bindvars Xcom result to Oracle operator

2021-05-10 Thread GitBox


malthe commented on a change in pull request #15581:
URL: https://github.com/apache/airflow/pull/15581#discussion_r629657710



##
File path: airflow/providers/oracle/operators/oracle.py
##
@@ -62,4 +68,30 @@ def __init__(
     def execute(self, context) -> None:
         self.log.info('Executing: %s', self.sql)
         hook = OracleHook(oracle_conn_id=self.oracle_conn_id)
-        hook.run(self.sql, autocommit=self.autocommit, parameters=self.parameters)
+
+        def handler(cur):
+            bindvars = cur.bindvars
+
+            if isinstance(bindvars, list):
+                bindvars = [v.getvalue() for v in bindvars]
+            elif isinstance(bindvars, dict):
+                bindvars = {n: v.getvalue() for (n, v) in bindvars.items()}
+            else:
+                raise TypeError(bindvars)
+
+            return {
+                "bindvars": bindvars,
+                "rows": cur.fetchall(),
+            }
+
+        kwargs = {
+            "autocommit": self.autocommit,
+            "parameters": self.parameters,
+        }
+
+        # For backwards compatibility, if the hook implementation does not

Review comment:
   @potiuk this is how I ended up providing compatibility with the older 
hook interface.
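   
   A rough sketch of the idea (our reading of the truncated diff above, not necessarily the exact code): feature-detect whether the installed hook's `run` accepts a result handler before passing one.
   
   ```python
   import inspect
   
   
   def run_with_optional_handler(hook, sql, handler, **kwargs):
       """Call hook.run with a handler only if the signature supports it."""
       if "handler" in inspect.signature(hook.run).parameters:
           return hook.run(sql, handler=handler, **kwargs)
       hook.run(sql, **kwargs)
       return None
   ```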




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] xinbinhuang commented on pull request #15743: Redirect forked process output to logger

2021-05-10 Thread GitBox


xinbinhuang commented on pull request #15743:
URL: https://github.com/apache/airflow/pull/15743#issuecomment-837266936


   > I'm not yet convinced this is an actual problem in a task run by airflow 
-- I haven't been able to reproduce this with a python operator
   
   Can you reproduce the example in the original issue? I can confirm that on my end, the output from the subprocess spawned by the python callable prints to the terminal but is not written to the task log file.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] SamWheating commented on issue #15759: flask error: ValueError: instance() only accepts datetime objects.

2021-05-10 Thread GitBox


SamWheating commented on issue #15759:
URL: https://github.com/apache/airflow/issues/15759#issuecomment-837261174


   Fixed in https://github.com/apache/airflow/pull/14416


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] lsowen commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

2021-05-10 Thread GitBox


lsowen commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-837226456


   Is this or #14265 a candidate for 
https://github.com/apache/airflow/milestone/21?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dvd9604 opened a new issue #15767: Webserver changes the base path for the log on rerun of task 404

2021-05-10 Thread GitBox


dvd9604 opened a new issue #15767:
URL: https://github.com/apache/airflow/issues/15767


   
   **Apache Airflow version**: 2.0.0
   
   **Environment**:
   - **OS** (e.g. from /etc/os-release): RHEL7
   - **Python**: 3.8.6
   - **Executor**: CeleryExecutor
   - **Workers**: 4 worker nodes
   - **logs dir**: /opt/airflow/logs/
   
   
   **What happened**:
   
   Scenario: Assume there are four workers labeled `host_a`, `host_b`, `host_c`, and `host_d`. The webserver and scheduler are running on another host, `host_z`. When a task is run for the first time, it executes on `host_a` and the log file is created locally on that host. If the same task is cleared and run again, it may execute on another host such as `host_b`. When navigating back to `1.log` after looking at `2.log`, the webserver replaces the hostname `host_a` with `host_b`, leading to the log file not being found.
   
   1. Task runs for the first time; `1.log` is created on `host_a`
   ```
   *** Log file does not exist: 
/opt/airflow/logs/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/1.log
   *** Fetching from: 
http://hosta.domain.com:8793/log/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/1.log
   
   ```
   
   2. Manually clear and run the task again; `2.log` is created on `host_b`
   
   
   ```
   *** Log file does not exist: 
/opt/airflow/logs/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/2.log
   *** Fetching from: 
http://hostb.domain.com:8793/log/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/2.log
    rest of log
   ```
   
   
   3. Navigate back to `1.log`. The host has been replaced from `host_a` to `host_b` (**hostb.domain.com:8793**)
   
   ```
   *** Log file does not exist: 
/opt/airflow/logs/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/1.log
   *** Fetching from: 
http://hostb.domain.com:8793/log/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/1.log
   *** Failed to fetch log file from worker. 404 Client Error: NOT FOUND for 
url: 
http://hosta.domain.com:8793/log/dag_foo/task_bar/2021-05-10T15:54:44.662671+00:00/1.log
   ```
   
   
   **What you expected to happen**:
   The webserver should remember where each task attempt ran.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] boring-cyborg[bot] commented on issue #15767: Webserver changes the base path for the log on rerun of task 404

2021-05-10 Thread GitBox


boring-cyborg[bot] commented on issue #15767:
URL: https://github.com/apache/airflow/issues/15767#issuecomment-837160531


   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (dab10d9 -> 25caeda)

2021-05-10 Thread ryanahamilton
This is an automated email from the ASF dual-hosted git repository.

ryanahamilton pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from dab10d9  MongoToS3Operator failed when running with a single query 
(not aggregate pipeline) (#15680)
 add 25caeda  Remove unused dependency (#15762)

No new revisions were added by this update.

Summary of changes:
 airflow/www/package.json |  1 -
 airflow/www/yarn.lock| 32 +---
 2 files changed, 1 insertion(+), 32 deletions(-)


[GitHub] [airflow] ryanahamilton merged pull request #15762: Remove unused dependency

2021-05-10 Thread GitBox


ryanahamilton merged pull request #15762:
URL: https://github.com/apache/airflow/pull/15762


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy closed pull request #15726: Add more integration tests to the kubernetes executor

2021-05-10 Thread GitBox


ephraimbuddy closed pull request #15726:
URL: https://github.com/apache/airflow/pull/15726


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dstandish commented on issue #15016: OdbcHook string values in connect_kwargs dict converts to None

2021-05-10 Thread GitBox


dstandish commented on issue #15016:
URL: https://github.com/apache/airflow/issues/15016#issuecomment-837110088


   > Why is clean_bool being called in the first place for a user-provided 
dictionary? I'm not sure how this is necessary because the user can provide a 
literal boolean value in the dictionary if needed, no? If in the event that a 
driver needs to take a case-sensitive boolean string for some parameter, then 
clean_bool would make it impossible to provide such a value.
   
   TLDR I agree we should remove `clean_bool`
   
   As to why it was put in in the first place, here's what I think happened.
   
   Observe what happens with airflow's connection URI format when we try to 
pass a boolean value:
   
   ```python
   >>> c = Connection(conn_id='hello', uri='hello://?my_val=True')
   >>> c.extra_dejson
   {'my_val': 'True'}
   ```
   
   It's impossible to produce a json object with boolean values.
   
   So when you are using top-level key-value pairs in conn `extra`, it makes sense in some cases to cast to bool.
   
   I suspect maybe initially in the development of this hook the connect kwargs 
were top level within `extra`, where doing this cast would make sense.
   
   But when dealing with nested json, the boolean vs. string issue becomes 
irrelevant and you have new problems to solve.  Namely, at the time this hook 
was merged, airflow's conn URI did not support nested json.  So this hook did 
not actually allow for storage of `connect_kwargs` in extra when using the URI 
format.  For that, you'd have to add a json.loads-if-str conversion 
[here](https://github.com/apache/airflow/blob/2bee173ef71ab67a288a5513df37914e3dd4a6ce/airflow/providers/odbc/hooks/odbc.py#L165).
   But now that we have [support for arbitrary json in conn 
extra](https://github.com/apache/airflow/pull/15100), there's no need for such 
a conversion.
   
   So since `connect_kwargs` is nested json, there's no valid reason for 
converting to bool, and I suspect this was just an oversight, and accordingly, 
even though it could be fixed, it is best to remove.
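   
   To illustrate the difference (a sketch using the post-#15100 JSON extra): nested values keep their JSON types, so `connect_kwargs` can carry real booleans with no cast.
   
   ```python
   import json
   
   from airflow.models.connection import Connection
   
   c = Connection(
       conn_id="my_odbc",
       conn_type="odbc",
       extra=json.dumps({"connect_kwargs": {"autocommit": True, "ansi": False}}),
   )
   # The booleans survive the round trip, unlike URI query parameters.
   assert c.extra_dejson["connect_kwargs"]["autocommit"] is True
   ```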
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] aneesh-joseph commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


aneesh-joseph commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629543337



##
File path: airflow/www/security.py
##
@@ -347,19 +347,23 @@ def can_read_dag(self, dag_id, user=None) -> bool:
         if not user:
             user = g.user
         prefixed_dag_id = self.prefixed_dag_id(dag_id)
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_READ, prefixed_dag_id)
+        return (
+            self._has_view_access(user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_READ, prefixed_dag_id)
+            or False
+        )
 
     def can_edit_dag(self, dag_id, user=None) -> bool:
         """Determines whether a user has DAG edit access."""
         if not user:
             user = g.user
         prefixed_dag_id = self.prefixed_dag_id(dag_id)
 
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_EDIT, prefixed_dag_id)
+        return (
+            self._has_view_access(user, permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_EDIT, prefixed_dag_id)
+            or False

Review comment:
   sure, done 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jedcunningham closed pull request #15729: Clean up unnecessary Airflow config in helm chart

2021-05-10 Thread GitBox


jedcunningham closed pull request #15729:
URL: https://github.com/apache/airflow/pull/15729


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ConstantinoSchillebeeckx opened a new issue #15766: execution_timeout not working

2021-05-10 Thread GitBox


ConstantinoSchillebeeckx opened a new issue #15766:
URL: https://github.com/apache/airflow/issues/15766


   
   **Apache Airflow version**: 2.0.2
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: AWS
   
   **What happened**:
   
   I have a custom operator that reads a SQL file and simply executes it using the MySQL hook; the operator works just fine, though we're having an issue (not related to Airflow) with one of the queries that will intermittently take very long. As a temporary workaround, we're setting the `execution_timeout` to a small number (10s) and then using the `on_retry_callback` to do cleanup.
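   
   For reference, a stripped-down sketch of that workaround (illustrative names only, with a PythonOperator standing in for our custom operator):
   
   ```python
   import time
   from datetime import datetime, timedelta
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   
   
   def slow_query():
       time.sleep(60)  # stand-in for the intermittently slow SQL statement
   
   
   def cleanup(context):
       print("cleanup after timeout, try number:", context["ti"].try_number)
   
   
   with DAG("timeout_demo", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
       PythonOperator(
           task_id="slow_task",
           python_callable=slow_query,
           execution_timeout=timedelta(seconds=10),  # raises AirflowTaskTimeout
           retries=2,
           on_retry_callback=cleanup,
       )
   ```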
   
   When this process works as expected, our log looks like:
   ```
   [2021-05-08 19:48:20,771] {base_task_runner.py:62} DEBUG - Planning to run 
as the  user
   ...
   ...
   [2021-05-08 19:48:35,385] {timeout.py:36} ERROR - Process timed out, PID: 
10082
   [2021-05-08 19:48:35,385] {taskinstance.py:595} DEBUG - Refreshing 
TaskInstance  
from DB
   [2021-05-08 19:48:35,391] {taskinstance.py:630} DEBUG - Refreshed 
TaskInstance 
   [2021-05-08 19:48:35,392] {taskinstance.py:1482} ERROR - Task failed with 
exception
   Traceback (most recent call last):
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1138, in _run_raw_task
   self._prepare_and_execute_task_with_callbacks(context, task)
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1311, in _prepare_and_execute_task_with_callbacks
   result = self._execute_task(context, task_copy)
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1336, in _execute_task
   result = task_copy.execute(context=context)
 File "/usr/local/airflow/pipelines/operators/sql_file_operator.py", line 
45, in execute
   self.writer._hook.run(self.transform)
 File "/usr/local/lib/python3.7/site-packages/airflow/hooks/dbapi.py", line 
184, in run
   cur.execute(sql_statement)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
255, in execute
   self.errorhandler(self, exc, value)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
   raise errorvalue
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
252, in execute
   res = self._query(query)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
378, in _query
   db.query(q)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
280, in query
   _mysql.connection.query(self, query)
 File "/usr/local/lib/python3.7/site-packages/airflow/utils/timeout.py", 
line 37, in handle_timeout
   raise AirflowTaskTimeout(self.error_message)
   airflow.exceptions.AirflowTaskTimeout: Timeout, PID: 10082
   [2021-05-08 19:48:35,393] {taskinstance.py:1891} DEBUG - Task Duration set 
to 14.59135
   ```
   * notice the task duration (15s) is close to our timeout duration (10s)
   
   However, sometimes the timeout doesn't seem to kill the task on time:
   ```
   [2021-05-10 01:49:33,536] {base_task_runner.py:62} DEBUG - Planning to run 
as the  user
   ...
   ...
   [2021-05-10 05:08:40,874] {timeout.py:36} ERROR - Process timed out, PID: 
15864
   [2021-05-10 05:08:40,874] {taskinstance.py:595} DEBUG - Refreshing 
TaskInstance  
from DB
   [2021-05-10 05:08:40,880] {taskinstance.py:630} DEBUG - Refreshed 
TaskInstance 
   [2021-05-10 05:08:40,881] {taskinstance.py:1482} ERROR - Task failed with 
exception
   Traceback (most recent call last):
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1138, in _run_raw_task
   self._prepare_and_execute_task_with_callbacks(context, task)
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1311, in _prepare_and_execute_task_with_callbacks
   result = self._execute_task(context, task_copy)
 File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1336, in _execute_task
   result = task_copy.execute(context=context)
 File "/usr/local/airflow/pipelines/operators/sql_file_operator.py", line 
45, in execute
   self.writer._hook.run(self.transform)
 File "/usr/local/lib/python3.7/site-packages/airflow/hooks/dbapi.py", line 
184, in run
   cur.execute(sql_statement)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
255, in execute
   self.errorhandler(self, exc, value)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
   raise errorvalue
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
252, in execute
   res = self._query(query)
 File "/usr/local/lib/python3.7/site-packages/MySQLdb/cursors.py", line 
378, in _query
   db.query(q)
 File "/usr/local/lib/pytho

[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342007#comment-17342007
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

ashb edited a comment on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836943220


   Not specifically - we don't plan at the bug level, just high level features.
   
   So either someone has to tackle this themselves (which we will help you do), 
or engage in some kind of commercial relationship with a committer/company 
employing committers to get this bug fixed in a fixed time frame.
   
   That said, I remember a change that was committed a few days ago (for something else, not specifically this bug) that may fix this in 2.1.0 when it's out, which should be in a "days to weeks" timeframe.
   
   Edit: the change in taskinstance.py in 
https://github.com/apache/airflow/commit/817b599234dca050438ee04bc6944d32bc032694
 is what I was thinking of


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> --
>
> Key: AIRFLOW-4922
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.3
>Reporter: Andrew Harmon
>Assignee: wanghong-T
>Priority: Major
>
> Sometimes when a task fails, the log shows the following
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb edited a comment on pull request #6722: [AIRFLOW-4922]Fix task get log by Web UI

2021-05-10 Thread GitBox


ashb edited a comment on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836943220


   Not specifically - we don't plan at the bug level, just high level features.
   
   So either someone has to tackle this themselves (which we will help you do), 
or engage in some kind of commercial relationship with a committer/company 
employing committers to get this bug fixed in a fixed time frame.
   
   That said, I remember a change that was committed a few days ago (for something else, not specifically this bug) that may fix this in 2.1.0 when it's out, which should be in a "days to weeks" timeframe.
   
   Edit: the change in taskinstance.py in 
https://github.com/apache/airflow/commit/817b599234dca050438ee04bc6944d32bc032694
 is what I was thinking of


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #15762: Remove unused dependency

2021-05-10 Thread GitBox


github-actions[bot] commented on pull request #15762:
URL: https://github.com/apache/airflow/pull/15762#issuecomment-836949243


   The PR is likely OK to be merged with just subset of tests for default 
Python and Database versions without running the full matrix of tests, because 
it does not modify the core of Airflow. If the committers decide that the full 
tests matrix is needed, they will add the label 'full tests needed'. Then you 
should rebase to the latest master or amend the last commit of the PR, and push 
it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342004#comment-17342004
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

ashb commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836943220


   Not specifically - we don't plan at the bug level, just high level features.
   
   So either someone has to tackle this themselves (which we will help you do), 
or engage in some kind of commercial relationship with a committer/company 
employing committers to get this bug fixed in a fixed time frame.
   
   That said, I remember a change that was committed a few days ago (for something else, not specifically this bug) that may fix this in 2.1.0 when it's out, which should be in a "days to weeks" timeframe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> --
>
> Key: AIRFLOW-4922
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.3
>Reporter: Andrew Harmon
>Assignee: wanghong-T
>Priority: Major
>
> Sometimes when a task fails, the log shows the following
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb commented on pull request #6722: [AIRFLOW-4922]Fix task get log by Web UI

2021-05-10 Thread GitBox


ashb commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836943220


   Not specifically - we don't plan at the bug level, just high level features.
   
   So either someone has to tackle this themselves (which we will help you do), 
or engage in some kind of commercial relationship with a committer/company 
employing committers to get this bug fixed in a fixed time frame.
   
   That said, I remember a change that was committed a few days ago (for something else, not specifically this bug) that may fix this in 2.1.0 when it's out, which should be in a "days to weeks" timeframe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] john-jac commented on issue #15306: Support Serialized DAGs on CLI Commands

2021-05-10 Thread GitBox


john-jac commented on issue #15306:
URL: https://github.com/apache/airflow/issues/15306#issuecomment-836941555


   > list_dags is fine -- backfill won't work on serialized dags.
   
   @kaxil could you provide details as to why backfill can't run on serialized 
dags?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #15755: Save pod name to xcom for KubernetesPodOperator

2021-05-10 Thread GitBox


mik-laj commented on a change in pull request #15755:
URL: https://github.com/apache/airflow/pull/15755#discussion_r629510571



##
File path: airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
##
@@ -338,6 +338,7 @@ def execute(self, context) -> Optional[str]:
             )
 
         self.pod = self.create_pod_request_obj()
+        pod_name = self.pod.metadata.name

Review comment:
   ```suggestion
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #15755: Save pod name to xcom for KubernetesPodOperator

2021-05-10 Thread GitBox


mik-laj commented on a change in pull request #15755:
URL: https://github.com/apache/airflow/pull/15755#discussion_r629510178



##
File path: airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
##
@@ -367,8 +366,10 @@ def execute(self, context) -> Optional[str]:
             self.log.info("creating pod with labels %s and launcher %s", labels, launcher)
             final_state, _, result = self.create_new_pod_for_operator(labels, launcher)
             if final_state != State.SUCCESS:
-                status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
-                raise AirflowException(f'Pod {self.pod.metadata.name} returned a failure: {status}')
+                status = self.client.read_namespaced_pod(pod_name, self.namespace)
+                raise AirflowException(f'Pod {pod_name} returned a failure: {status}')

Review comment:
   ```suggestion
               status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
               raise AirflowException(f'Pod {self.pod.metadata.name} returned a failure: {status}')
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #15755: Save pod name to xcom for KubernetesPodOperator

2021-05-10 Thread GitBox


mik-laj commented on a change in pull request #15755:
URL: https://github.com/apache/airflow/pull/15755#discussion_r629510342



##
File path: airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
##
@@ -367,8 +366,10 @@ def execute(self, context) -> Optional[str]:
             self.log.info("creating pod with labels %s and launcher %s", labels, launcher)
             final_state, _, result = self.create_new_pod_for_operator(labels, launcher)
             if final_state != State.SUCCESS:
-                status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
-                raise AirflowException(f'Pod {self.pod.metadata.name} returned a failure: {status}')
+                status = self.client.read_namespaced_pod(pod_name, self.namespace)
+                raise AirflowException(f'Pod {pod_name} returned a failure: {status}')
+            context['task_instance'].xcom_push(key='pod_name', value=pod_name)

Review comment:
   ```suggestion
           context['task_instance'].xcom_push(key='pod_name', value=self.pod.metadata.name)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #15755: Save pod name to xcom for KubernetesPodOperator

2021-05-10 Thread GitBox


mik-laj commented on a change in pull request #15755:
URL: https://github.com/apache/airflow/pull/15755#discussion_r629509871



##
File path: airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
##
@@ -367,8 +366,10 @@ def execute(self, context) -> Optional[str]:
             self.log.info("creating pod with labels %s and launcher %s", labels, launcher)
             final_state, _, result = self.create_new_pod_for_operator(labels, launcher)
             if final_state != State.SUCCESS:
-                status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
-                raise AirflowException(f'Pod {self.pod.metadata.name} returned a failure: {status}')
+                status = self.client.read_namespaced_pod(pod_name, self.namespace)
+                raise AirflowException(f'Pod {pod_name} returned a failure: {status}')

Review comment:
   `pod_name` != `self.pod.metadata.name` in all cases. When we reattach to an existing task, we have the old name. See:
   https://github.com/apache/airflow/blob/0ec79d036b32c0c437b38972f9ecb6be64638b9b/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py#L403




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


ashb commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629501511



##
File path: airflow/www/security.py
##
@@ -347,19 +347,23 @@ def can_read_dag(self, dag_id, user=None) -> bool:
         if not user:
             user = g.user
         prefixed_dag_id = self.prefixed_dag_id(dag_id)
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_READ, prefixed_dag_id)
+        return (
+            self._has_view_access(user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_READ, prefixed_dag_id)
+            or False
+        )
 
     def can_edit_dag(self, dag_id, user=None) -> bool:
         """Determines whether a user has DAG edit access."""
         if not user:
             user = g.user
         prefixed_dag_id = self.prefixed_dag_id(dag_id)
 
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_EDIT, prefixed_dag_id)
+        return (
+            self._has_view_access(user, permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_EDIT, prefixed_dag_id)
+            or False

Review comment:
   ```python
   def _has_view_access(self, user, action, resource):
       return bool(super()._has_view_access(user, action, resource))
   ```
   
   would probably be a better fix (along with a comment saying why we need it) rather than having to change every place we use this.
##
File path: airflow/www/security.py
##
@@ -348,19 +348,21 @@ def can_read_dag(self, dag_id, user=None) -> bool:
         if not user:
             user = g.user
         dag_resource_name = permissions.resource_name_for_dag(dag_id)
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_READ, dag_resource_name)
+        return bool(
+            self._has_view_access(user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_READ, dag_resource_name)
+        )

Review comment:
   Ah, see https://github.com/apache/airflow/pull/9973#discussion_r629501511




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ashb commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


ashb commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629500122



##
File path: airflow/www/security.py
##
@@ -348,19 +348,21 @@ def can_read_dag(self, dag_id, user=None) -> bool:
         if not user:
             user = g.user
         dag_resource_name = permissions.resource_name_for_dag(dag_id)
-        return self._has_view_access(
-            user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG
-        ) or self._has_view_access(user, permissions.ACTION_CAN_READ, dag_resource_name)
+        return bool(
+            self._has_view_access(user, permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)
+            or self._has_view_access(user, permissions.ACTION_CAN_READ, dag_resource_name)
+        )

Review comment:
   And why is this a problem? `None` evaluates as falsy in a boolean context.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] jedcunningham commented on a change in pull request #15729: Clean up unnecessary Airflow config in helm chart

2021-05-10 Thread GitBox


jedcunningham commented on a change in pull request #15729:
URL: https://github.com/apache/airflow/pull/15729#discussion_r629486438



##
File path: chart/values.yaml
##
@@ -914,10 +912,6 @@ config:
   elasticsearch:
     json_format: 'True'
     log_id_template: "{dag_id}_{task_id}_{execution_date}_{try_number}"
-  elasticsearch_configs:
-    max_retries: 3
-    timeout: 30
-    retry_timeout: 'True'

Review comment:
   Thanks. I'll probably document that better in the default airflow config in a follow-up PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ChethanUK opened a new pull request #15765: Introducing BaseCustomK8sApp Sensor and Improving the spark driver logs

2021-05-10 Thread GitBox


ChethanUK opened a new pull request #15765:
URL: https://github.com/apache/airflow/pull/15765


   Introducing a BaseCustomK8sApp sensor to make it easier to add more sensors, like a Flink cluster sensor.
   Also improves the Spark sensor logs to include the name of the app and the namespace in which it's currently deployed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] PeterLambe commented on issue #11901: DAGs remain in the UI after renaming the dag_id in the same python file

2021-05-10 Thread GitBox


PeterLambe commented on issue #11901:
URL: https://github.com/apache/airflow/issues/11901#issuecomment-836848483


   Confirming that pressing "Delete DAG" on the old DAG will remove it from the 
metadata tables properly 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


github-actions[bot] commented on pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#issuecomment-836846212


   [The Workflow run](https://github.com/apache/airflow/actions/runs/828512188) 
is cancelling this PR. Building images for the PR has failed. Follow the 
workflow link to check the reason.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ephraimbuddy commented on a change in pull request #15726: Add more integration tests to the kubernetes executor

2021-05-10 Thread GitBox


ephraimbuddy commented on a change in pull request #15726:
URL: https://github.com/apache/airflow/pull/15726#discussion_r629455917



##
File path: kubernetes_tests/test_kubernetes_executor.py
##
@@ -173,41 +190,24 @@ def start_dag(self, dag_id, host):
 print(f"Calling [start_dag]#2 {post_string}")
 # Trigger a new dagrun
 result = self.session.post(post_string, json={})
-try:
-result_json = result.json()
-except ValueError:
-result_json = str(result)
+result_json = result.json()
 print(f"Received [start_dag]#2 {result_json}")
 assert result.status_code == 200, f"Could not trigger a DAG-run: {result_json}"
-
+execution_date = result_json['execution_date']
+dag_run_id = result_json['dag_run_id']
 time.sleep(1)
-
 get_string = f'http://{host}/api/v1/dags/{dag_id}/dagRuns'
 print(f"Calling [start_dag]#3 {get_string}")
 result = self.session.get(get_string)
 assert result.status_code == 200, f"Could not get DAGRuns: {result.json()}"
 result_json = result.json()
 print(f"Received: [start_dag]#3 {result_json}")
-return result_json
-
-def start_job_in_kubernetes(self, dag_id, host):
-result_json = self.start_dag(dag_id=dag_id, host=host)
-dag_runs = result_json['dag_runs']
-assert len(dag_runs) > 0
-execution_date = None
-dag_run_id = None
-for dag_run in dag_runs:
-if dag_run['dag_id'] == dag_id:
-execution_date = dag_run['execution_date']
-dag_run_id = dag_run['dag_run_id']
-break
-assert execution_date is not None, f"No execution_date can be found for the dag with {dag_id}"

Review comment:
   This is not needed, and it gives the wrong dag_run_id and execution date. Since all it does is fetch the dag_run_id and execution date, I removed it and instead take the exact dag_run_id and execution date from the response at the time we trigger the DagRun (see the sketch below).
   This code block tricked me into thinking there was an issue, hence my earlier commits.
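   A minimal sketch of the point, assuming the stable REST API response shape for a triggered DagRun (host and dag_id are placeholders):
   
   ```python
   # The POST that triggers a DagRun already returns that run's identifiers,
   # so a follow-up GET over all DagRuns -- which can match a different run
   # for the same dag_id -- is unnecessary.
   import requests
   
   host, dag_id = "localhost:8080", "example_dag"  # placeholder values
   session = requests.Session()
   resp = session.post(f"http://{host}/api/v1/dags/{dag_id}/dagRuns", json={})
   run = resp.json()
   execution_date = run["execution_date"]  # exact values for the run just created
   dag_run_id = run["dag_run_id"]
   ```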




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] maryampashmi commented on issue #15451: No module named 'airflow.providers.google.common.hooks.leveldb'

2021-05-10 Thread GitBox


maryampashmi commented on issue #15451:
URL: https://github.com/apache/airflow/issues/15451#issuecomment-836841894


   Using --upgrade solved the error; I haven't seen it since. Thanks for your help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] github-actions[bot] commented on pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


github-actions[bot] commented on pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#issuecomment-836816009


   [The Workflow run](https://github.com/apache/airflow/actions/runs/828407904) 
is cancelling this PR. Building images for the PR has failed. Follow the 
workflow link to check the reason.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #15729: Clean up unnecessary Airflow config in helm chart

2021-05-10 Thread GitBox


kaxil commented on a change in pull request #15729:
URL: https://github.com/apache/airflow/pull/15729#discussion_r629432849



##
File path: chart/values.yaml
##
@@ -914,10 +912,6 @@ config:
   elasticsearch:
 json_format: 'True'
 log_id_template: "{dag_id}_{task_id}_{execution_date}_{try_number}"
-  elasticsearch_configs:
-max_retries: 3
-timeout: 30
-retry_timeout: 'True'

Review comment:
   This is valid: 
   
   
https://airflow.apache.org/docs/apache-airflow-providers-elasticsearch/stable/logging.html#writing-logs-to-elasticsearch-over-tls
   
   This line: 
https://github.com/apache/airflow/blob/657384615fafc060f9e2ed925017306705770355/airflow/providers/elasticsearch/log/es_task_handler.py#L76
 reads all the configs passed to `elasticsearch_configs` and passes them as kwargs to the ES client:
   
   
https://github.com/apache/airflow/blob/657384615fafc060f9e2ed925017306705770355/airflow/providers/elasticsearch/log/es_task_handler.py#L89
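   In short, a simplified sketch of that pattern (not the handler's exact code; the host below is a placeholder):
   
   ```python
   # Every key under [elasticsearch_configs] in airflow.cfg becomes a keyword
   # argument on the Elasticsearch client constructor.
   import elasticsearch
   from airflow.configuration import conf
   
   es_kwargs = conf.getsection("elasticsearch_configs") or {}
   es_client = elasticsearch.Elasticsearch(["http://localhost:9200"], **es_kwargs)
   ```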




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341943#comment-17341943
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

pelaprat commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836791611


   @ashb  – sorry to pick on you; any sense of where this bug fits into 
development priorities?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> --
>
> Key: AIRFLOW-4922
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.3
>Reporter: Andrew Harmon
>Assignee: wanghong-T
>Priority: Major
>
> Sometimes when a task fails, the log show the following
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] pelaprat commented on pull request #6722: [AIRFLOW-4922]Fix task get log by Web UI

2021-05-10 Thread GitBox


pelaprat commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-836791611


   @ashb  – sorry to pick on you; any sense of where this bug fits into 
development priorities?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] aneesh-joseph commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


aneesh-joseph commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629416272



##
File path: 
airflow/migrations/versions/83f031fd9f1c_improve_mssql_compatibility.py
##
@@ -0,0 +1,262 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""improve mssql compatibility
+
+Revision ID: 83f031fd9f1c
+Revises: a13f7613ad25
+Create Date: 2021-04-06 12:22:02.197726
+
+"""
+
+from collections import defaultdict
+
+import sqlalchemy as sa
+from alembic import op
+from sqlalchemy.dialects import mssql
+
+# revision identifiers, used by Alembic.
+revision = '83f031fd9f1c'
+down_revision = 'a13f7613ad25'
+branch_labels = None
+depends_on = None
+
+
+def is_table_empty(conn, table_name):
+"""
+This function checks if the mssql table is empty
+:param conn: sql connection object
+:param table_name: table name
+:return: Boolean indicating if the table is empty
+"""
+return conn.execute(f'select TOP 1 * from {table_name}').first() is None
+
+
+def get_table_constraints(conn, table_name):
+"""
+This function returns primary and unique constraints
+along with the column names. Some tables, like task_instance,
+are missing a primary key constraint name, and the name is
+auto-generated by SQL Server, so this function helps to
+retrieve any primary or unique constraint name.
+
+:param conn: sql connection object
+:param table_name: table name
+:return: a dictionary of ((constraint name, constraint type), column name) of the table
+:rtype: defaultdict(list)
+"""
+query = """SELECT tc.CONSTRAINT_NAME , tc.CONSTRAINT_TYPE, ccu.COLUMN_NAME
+ FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
+ JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE AS ccu ON ccu.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
+ WHERE tc.TABLE_NAME = '{table_name}' AND
+ (tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or UPPER(tc.CONSTRAINT_TYPE) = 'UNIQUE')
+""".format(
+table_name=table_name
+)
+result = conn.execute(query).fetchall()
+constraint_dict = defaultdict(list)
+for constraint, constraint_type, column in result:
+constraint_dict[(constraint, constraint_type)].append(column)
+return constraint_dict
+
+
+def drop_column_constraints(operator, column_name, constraint_dict):
+"""
+Drop a primary key or unique constraint
+
+:param operator: batch_alter_table for the table
+:param constraint_dict: a dictionary of ((constraint name, constraint type), column name) of the table
+"""
+for constraint, columns in constraint_dict.items():
+if column_name in columns:
+if constraint[1].lower().startswith("primary"):
+operator.drop_constraint(constraint[0], type_='primary')
+elif constraint[1].lower().startswith("unique"):
+operator.drop_constraint(constraint[0], type_='unique')
+
+
+def create_constraints(operator, column_name, constraint_dict):
+"""
+Create a primary key or unique constraint
+
+:param operator: batch_alter_table for the table
+:param constraint_dict: a dictionary of ((constraint name, constraint type), column name) of the table
+"""
+for constraint, columns in constraint_dict.items():
+if column_name in columns:
+if constraint[1].lower().startswith("primary"):
+operator.create_primary_key(constraint_name=constraint[0], columns=columns)
+elif constraint[1].lower().startswith("unique"):
+operator.create_unique_constraint(constraint_name=constraint[0], columns=columns)
+
+
+def _use_date_time2(conn):
+result = conn.execute(
+"""SELECT CASE WHEN CONVERT(VARCHAR(128), SERVERPROPERTY 
('productversion'))
+like '8%' THEN '2000' WHEN CONVERT(VARCHAR(128), SERVERPROPERTY 
('productversion'))
+like '9%' THEN '2005' ELSE '2005Plus' END AS MajorVersion"""
+).fetchone()
+mssql_version = result[0]
+return mssql_version not in ("2000", "2005")
+
+
+def _is_timestamp(conn, table_name, column_name):
+query = f"""SELECT
+TYPE_NAME(C.USER_TYPE_ID) AS DATA_TYPE
+FROM SYS.COLUMNS C
+JOIN SYS.TYPES T
+ON C.USER_TYPE_ID=T.USER_TYPE_
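
   Taken together, these helpers support a drop-alter-recreate pattern around a column change. A hedged usage sketch (the table, column, and target type below are illustrative assumptions, not the migration's actual steps):
   
   ```python
   # Capture a column's primary/unique constraints, drop them, alter the
   # column type, then recreate the constraints from the captured metadata.
   conn = op.get_bind()
   constraint_dict = get_table_constraints(conn, "task_reschedule")
   with op.batch_alter_table("task_reschedule") as batch_op:
       drop_column_constraints(batch_op, "execution_date", constraint_dict)
       batch_op.alter_column(
           column_name="execution_date", type_=mssql.DATETIME2(precision=6), nullable=False
       )
       create_constraints(batch_op, "execution_date", constraint_dict)
   ```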

[GitHub] [airflow] aneesh-joseph commented on a change in pull request #9973: Improve compatibility with mssql

2021-05-10 Thread GitBox


aneesh-joseph commented on a change in pull request #9973:
URL: https://github.com/apache/airflow/pull/9973#discussion_r629415571



##
File path: airflow/sensors/smart_sensor.py
##
@@ -391,13 +391,21 @@ def _update_ti_hostname(self, sensor_works, session=None):
 :param session: The sqlalchemy session.
 """
 TI = TaskInstance
-ti_keys = [(x.dag_id, x.task_id, x.execution_date) for x in sensor_works]
 
 def update_ti_hostname_with_count(count, ti_keys):
 # Using or_ instead of in_ here to prevent from full table scan.
 tis = (
 session.query(TI)
-.filter(or_(tuple_(TI.dag_id, TI.task_id, TI.execution_date) == ti_key for ti_key in ti_keys))
+.filter(
+or_(
+and_(
+TI.dag_id == ti_key.dag_id,
+TI.task_id == ti_key.task_id,
+TI.execution_date == ti_key.execution_date,
+)
+for ti_key in ti_keys
+)
+)

Review comment:
   done, restricted this change to mssql only
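   A hedged sketch of that gating, reusing the `TI`/`ti_keys` names from the diff above (the exact dialect check is an assumption, and `ti_keys` is assumed to hold named tuples so both branches can use it):
   
   ```python
   # mssql cannot compare against a tuple_ construct, so expand the composite
   # key into and_/or_ there and keep the tuple_ comparison everywhere else.
   from sqlalchemy import and_, or_, tuple_
   
   if session.get_bind().dialect.name == "mssql":
       keys_filter = or_(
           and_(
               TI.dag_id == ti_key.dag_id,
               TI.task_id == ti_key.task_id,
               TI.execution_date == ti_key.execution_date,
           )
           for ti_key in ti_keys
       )
   else:
       keys_filter = or_(
           tuple_(TI.dag_id, TI.task_id, TI.execution_date) == ti_key
           for ti_key in ti_keys
       )
   tis = session.query(TI).filter(keys_filter)
   ```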

##
File path: airflow/models/serialized_dag.py
##
@@ -313,7 +315,9 @@ def get_dag_dependencies(cls, session: Session = None) -> 
Dict[str, List['DagDep
 if session.bind.dialect.name in ["sqlite", "mysql"]:
 for row in session.query(cls.dag_id, func.json_extract(cls.data, "$.dag.dag_dependencies")).all():
 dependencies[row[0]] = [DagDependency(**d) for d in json.loads(row[1])]
-
+elif session.bind.dialect.name in ["mssql"]:

Review comment:
   done

##
File path: airflow/models/dag.py
##
@@ -896,7 +897,7 @@ def get_num_active_runs(self, external_trigger=None, session=None):
 )
 
 if external_trigger is not None:
-query = query.filter(DagRun.external_trigger == external_trigger)
+query = query.filter(DagRun.external_trigger == expression.literal(external_trigger))

Review comment:
   done
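   For illustration, a minimal standalone sketch of why the wrapper matters (the table definition below is assumed):
   
   ```python
   # expression.literal() yields a bind parameter, so the dialect decides how
   # to render the boolean value instead of emitting a bare literal that
   # SQL Server would reject.
   from sqlalchemy import Boolean, Column, Integer, MetaData, Table
   from sqlalchemy.sql import expression
   
   dag_run = Table(
       "dag_run", MetaData(),
       Column("id", Integer, primary_key=True),
       Column("external_trigger", Boolean),
   )
   clause = dag_run.c.external_trigger == expression.literal(True)
   print(clause)  # dag_run.external_trigger = :param_1 (bound, not a literal)
   ```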




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



