potiuk commented on a change in pull request #5426: [AIRFLOW-4765] Fix 
DataProcPigOperator execute method
URL: https://github.com/apache/airflow/pull/5426#discussion_r294321244
 
 

 ##########
 File path: airflow/contrib/example_dags/example_gcp_dataproc_pig_operator.py
 ##########
 @@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Example Airflow DAG for Google Dataproc PigOperator
 
 Review comment:
   We have system tests that start from an empty GCP project, create a 
cluster, run the tests on it, and then tear it down so that we do not have to 
pay for it. All of this is fully automated in our Cloud Build testing. 
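   
   The create/run/tear-down cycle described above can be sketched roughly as 
follows. This is only an illustration using the Python standard library: 
`create_cluster`, `delete_cluster`, `run_example_dag` and `temporary_cluster` 
are hypothetical placeholders, not Airflow or GCP APIs. 

```python
from contextlib import contextmanager

# Records what happened, so the order of the steps can be inspected.
events = []


def create_cluster(name):
    # Hypothetical stand-in: a real system test would create a GCP cluster here.
    events.append(("create", name))
    return name


def delete_cluster(name):
    # Hypothetical stand-in: a real system test would delete the cluster here.
    events.append(("delete", name))


@contextmanager
def temporary_cluster(name):
    """Create a cluster and always delete it, even if the test body fails."""
    cluster = create_cluster(name)
    try:
        yield cluster
    finally:
        delete_cluster(cluster)


def run_example_dag(cluster):
    # Hypothetical stand-in for running the example DAG against the cluster.
    events.append(("run", cluster))


def system_test(name="test-cluster"):
    with temporary_cluster(name) as cluster:
        run_example_dag(cluster)
    return events
```

   The `finally` clause is what keeps a failed test from leaving a paid 
cluster running. 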
   
   In this particular case we will use that example DAG to document all three 
operators: Create, Delete, and the DataprocOperator itself. So all three are in 
fact interesting. We had decided not to do it just yet, but we can do it now. 
See, for example, the similar example DAGs we have here: 
   
   
https://github.com/apache/airflow/blob/master/airflow/contrib/example_dags/example_gcp_compute.py
 - we have documentation START/END markers around every operator, and the 
marked snippets are then used to build nice, correct documentation here: 
https://airflow.readthedocs.io/en/latest/howto/operator/gcp/compute.html  (we 
know it is correct because we system test it). 
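   
   To make the START/END idea concrete, here is a minimal sketch of how marked 
regions could be pulled out of an example DAG file. It is an illustration only: 
the marker pattern, the snippet names, and the `extract_snippets` helper are 
assumptions, and the placeholder strings stand in for real operator 
definitions. Documentation tools offer the same kind of extraction, for 
instance Sphinx's `literalinclude` directive with its `start-after` and 
`end-before` options. 

```python
import re

# Marker format assumed for illustration, e.g. "# [START some_name]".
MARKER = re.compile(r"#\s*\[(START|END)\s+([\w-]+)\]")


def extract_snippets(source):
    """Return {snippet_name: code} for every START/END-delimited region."""
    snippets = {}
    open_name = None
    buffer = []
    for line in source.splitlines():
        match = MARKER.search(line)
        if match:
            kind, name = match.groups()
            if kind == "START":
                # Begin collecting lines for a new snippet.
                open_name, buffer = name, []
            elif name == open_name:
                # Matching END marker: store the collected lines.
                snippets[name] = "\n".join(buffer)
                open_name = None
            continue
        if open_name is not None:
            buffer.append(line)
    return snippets


# Hypothetical example DAG content; the snippet names are made up.
EXAMPLE_DAG = '''\
# [START howto_operator_example_create]
create_cluster = "placeholder for a create-cluster operator"
# [END howto_operator_example_create]

# [START howto_operator_example_delete]
delete_cluster = "placeholder for a delete-cluster operator"
# [END howto_operator_example_delete]
'''

snippets = extract_snippets(EXAMPLE_DAG)
```

   Because the documentation shows exactly the lines that the system tests 
execute, the published snippets cannot drift away from working code. 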
   
   I am going to come back to AIP-4, which is exactly about making this 
approach (having runnable DAGs as system tests) easy for others to use as well 
(https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-4+Support+for+System+Tests+for+external+systems).
 Once AIP-10 and AIP-7 are done, it is very high on my list.
   
   I agree it should possibly not be in "example_dags". "testable_dags" is 
certainly a much better name, but that is a bigger change that should be 
part of getting rid of contrib, IMHO. 
   
   Maybe as an intermediate step we can simply add the missing documentation 
and use the example DAGs in the same way as we did with the other operators? 
That way you can see how useful it is to have them.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
