Owen-CH-Leung commented on issue #39323:
URL: https://github.com/apache/airflow/issues/39323#issuecomment-2298184697

   In case it may help: I'm using a simple curl command to upload the raw JSON logs to Elasticsearch (instead of Filebeat):
   
   ```
   #!/bin/bash

   while IFS= read -r line
   do
     curl -X POST "http://localhost:9200/airflow-dag/_doc/" -H 'Content-Type: application/json' -d "$line"
   done < [your DAG task log file]
   ```
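
   If the log file is large, posting one document per curl call can be slow. Below is a minimal sketch of the same idea using Elasticsearch's `_bulk` API (this is an assumption on my side, not part of the original setup; it reuses the `airflow-dag` index and takes the log file path as the first script argument):

   ```
   #!/bin/bash
   # Hypothetical variant of the loop above: build one newline-delimited bulk request
   # (an {"index":{}} action line followed by each raw JSON log line) and send it to
   # the _bulk endpoint in a single call.
   BULK_FILE=$(mktemp)
   while IFS= read -r line
   do
     printf '{"index":{}}\n%s\n' "$line" >> "$BULK_FILE"
   done < "$1"   # your DAG task log file
   curl -X POST "http://localhost:9200/airflow-dag/_bulk" \
     -H 'Content-Type: application/x-ndjson' \
     --data-binary "@$BULK_FILE"
   rm -f "$BULK_FILE"
   ```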
   
   
   Some important configs in `airflow.cfg`:
   
   ```
   [logging]
   remote_logging = True
   remote_log_conn_id = elasticsearch_default

   [elasticsearch]
   # Elasticsearch host
   host = [your ES host]
   json_format = True
   offset_field = log
   ```
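
   `remote_log_conn_id` refers to an Airflow connection. A minimal sketch of creating it from the CLI, assuming Elasticsearch runs on `localhost:9200` (adjust host/port/credentials for your cluster):

   ```
   # Hypothetical example: create the elasticsearch_default connection referenced by
   # remote_log_conn_id, assuming a local Elasticsearch on port 9200.
   airflow connections add 'elasticsearch_default' \
     --conn-type 'elasticsearch' \
     --conn-host 'localhost' \
     --conn-port '9200'
   ```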
   
   The raw task log file should look like the following (one JSON object per line):
   ```
   {"asctime": "2024-08-20T06:52:42.617+0000", "filename": "local_task_job_runner.py", "lineno": 123, "levelname": "INFO", "message": "::group::Pre task execution logs", "log": 1724136762617922560, "dag_id": "elasticsearch_sql_dag", "task_id": "run_es_query", "execution_date": "2024_08_20T06_52_41_314339", "try_number": "1", "log_id": "elasticsearch_sql_dag-run_es_query-manual__2024-08-20T06:52:41.314339+00:00--1-1"}
   ```
   
   If you are using a tool like Filebeat, be aware that not every key in your raw log file will necessarily end up as a field on the document in ES. Whatever tool you use to write logs to ES, make sure that a single document in ES looks like this:
   
   ```
   {
     "_index": "airflow-dag",
     "_id": "o_ysbpEBEcTOdHp1B7dC",
     "_score": 1,
     "_source": {
       "asctime": "2024-08-20T07:19:54.376+0000",
       "filename": "logging_mixin.py",
       "lineno": 190,
       "levelname": "INFO",
       "message": "row: ['docker-cluster', '.alerts-observability.slo.alerts-default', 'VIEW', 'ALIAS']",
       "log": 1724138394376478200,
       "dag_id": "elasticsearch_sql_dag",
       "task_id": "run_es_query",
       "execution_date": "2024_08_20T07_19_51_268943",
       "try_number": "1",
       "log_id": "elasticsearch_sql_dag-run_es_query-manual__2024-08-20T07:19:51.268943+00:00--1-1"
     }
   }
   ```
   
   Make sure the documents have the `log` and `log_id` fields.
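
   As a hypothetical sanity check (not part of the original setup), you can search the index by `log_id` using the value from the example document above and confirm the hits carry both fields, since `log_id` is what Airflow uses to look up a task attempt's logs:

   ```
   # Hypothetical sanity check: search by log_id (value taken from the example document above).
   curl -s "http://localhost:9200/airflow-dag/_search" \
     -H 'Content-Type: application/json' \
     -d '{"query": {"match_phrase": {"log_id": "elasticsearch_sql_dag-run_es_query-manual__2024-08-20T07:19:51.268943+00:00--1-1"}}}'
   ```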
   
   

