ashb commented on a change in pull request #6515: [AIRFLOW-XXX] GSoD: How to 
make DAGs production ready
URL: https://github.com/apache/airflow/pull/6515#discussion_r349798847
 
 

 ##########
 File path: docs/best-practices.rst
 ##########
 @@ -0,0 +1,296 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Best Practices
+==============
+
+Running Airflow in production is seamless. It comes bundled with all the 
plugins and configs
+necessary to run most of the DAGs. However, you can come across certain 
pitfalls, which can cause occasional errors.
+Let's take a look at what you need to do at various stages to avoid these 
pitfalls, starting from writing the DAG 
+to the actual deployment in the production environment.
+
+
+Writing a DAG
+^^^^^^^^^^^^^^
+Creating a new DAG in Airflow is quite simple. However, there are many things 
that you need to take care of
+to ensure the DAG run or failure does not produce unexpected results.
+
+Creating a task
+---------------
+
+You should treat tasks in Airflow equivalent to transactions in a database. It 
implies that you should never produce
+incomplete results from your tasks. An example is not to produce incomplete 
data in ``HDFS`` or ``S3`` at the end of a task.
+
+Airflow retries a task if it fails. Thus, the tasks should produce the same 
outcome on every re-run.
 
 Review comment:
   Can, if configured. Default is to not retry though.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to