[
https://issues.apache.org/jira/browse/AIRFLOW-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16853642#comment-16853642
]
ASF GitHub Bot commented on AIRFLOW-4363:
-----------------------------------------
benbenbang commented on pull request #5356: [AIRFLOW-4363] Fix Json encoding
error when retrieving `status` from cli in docker operator
URL: https://github.com/apache/airflow/pull/5356
Make sure you have checked _all_ steps below.
### Jira
- [x] My PR addresses the following [Airflow
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
- https://issues.apache.org/jira/browse/AIRFLOW-XXX
- In case you are fixing a typo in the documentation you can prepend your
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
- In case you are proposing a fundamental code change, you need to create
an Airflow Improvement Proposal
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
- In case you are adding a dependency, check if the license complies with
the [ASF 3rd Party License
Policy](https://www.apache.org/legal/resolved.html#category-x).
[Update]: AIRFLOW-4363
### Description
- [x] Here are some details about my PR, including screenshots of any UI
changes:
Issue: When using the `docker_operator` on macOS 10.14.4, I ran into a
`json.JSONDecodeError`. After investigating, I found that some log messages
are not properly separated: a single message can contain `\n`, when it should
have been split into two or more separate messages.
What my PR does: it wraps the JSON parsing in a try/except. If a
`json.JSONDecodeError` is raised, we either split the payload on `\n` and log
each `status` we find, or just print the whole payload, instead of throwing an
error over a non-critical issue like this.
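The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `extract_statuses` is a hypothetical helper name, and the sketch assumes each chunk of pull output is either one JSON document or several documents separated by newlines.

```python
import json


def extract_statuses(line):
    """Return the `status` strings found in one chunk of pull output.

    Hypothetical helper for illustration; not part of Airflow or docker-py.
    """
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    try:
        # Common case: the chunk holds exactly one JSON document.
        documents = [json.loads(line)]
    except json.JSONDecodeError:
        documents = []
        # Fallback: the chunk holds several documents separated by newlines.
        for part in line.splitlines():
            part = part.strip()
            if not part:
                continue
            try:
                documents.append(json.loads(part))
            except json.JSONDecodeError:
                # Last resort: keep the raw text rather than failing the task.
                documents.append({"status": part})
    return [
        doc["status"]
        for doc in documents
        if isinstance(doc, dict) and "status" in doc
    ]
```

With this shape, a chunk such as `'{"status": "a"}\n{"status": "b"}'` yields two statuses to log instead of raising, and text that is not JSON at all is passed through unchanged.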
### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for
this extremely good reason:
### Commits
- [ ] My commits all reference Jira issues in their subject lines, and I
have squashed multiple commits if they address the same issue. In addition, my
commits follow the guidelines from "[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
1. Subject is limited to 50 characters (not including Jira issue reference)
1. Subject does not end with a period
1. Subject uses the imperative mood ("add", not "adding")
1. Body wraps at 72 characters
1. Body explains "what" and "why", not "how"
### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes
how to use it.
- All the public functions and the classes in the PR contain docstrings
that explain what they do
- If you implement backwards incompatible changes, please leave a note in
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so
we can assign it to an appropriate release
### Code Quality
- [x] Passes `flake8`
> Encounter JSON Decode Error when using docker operator
> ------------------------------------------------------
>
> Key: AIRFLOW-4363
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4363
> Project: Apache Airflow
> Issue Type: Bug
> Environment: - Mac OS 10.14
> - python 3.6.8
> - airflow 1.10.2
> Reporter: Ben Chen
> Assignee: Ben Chen
> Priority: Blocker
>
> *[Description]*
> When using the docker_operator on macOS 10.14.4, I ran into a
> json.JSONDecodeError. After investigating, I found that some log messages
> are not properly separated: a single message can contain \n, when it should
> have been split into two or more separate messages.
> *[Update]*
> Confirmed that the issue comes from the implementation in Airflow; it cannot
> be solved just by passing the `decode` parameter to the docker.pull method in
> the Docker API.
> *[Solution]*
> For now, I use a try/except around the original implementation, and in the
> except branch I split the message into a list and then parse each item. I am
> looking for a simpler solution to this non-critical but still blocking issue.
> *[Logs]*
> {docker_operator.py:188} INFO - Starting docker container from image hello-world
> {docker_operator.py:202} INFO - Pulling docker image hello-world
> {docker_operator.py:207} INFO - Pulling from library/hello-world
> {docker_operator.py:207} INFO - Pulling fs layer
> {docker_operator.py:207} INFO - Downloading
> {docker_operator.py:207} INFO - Downloading
> {docker_operator.py:207} INFO - Download complete
> {docker_operator.py:207} INFO - Extracting
> {docker_operator.py:207} INFO - Extracting
> {docker_operator.py:207} INFO - Pull complete
> {docker_operator.py:207} INFO - Digest: sha256:92695bc579f31df7a63da6922075d0666e565ceccad16b59c3374d2cf4e8e50e
> {docker_operator.py:207} INFO - Pulling from library/hello-world
> {docker_operator.py:207} INFO - Digest: sha256:1a67c1115b199aa9d964d5da5646917cbac2d5450c71a1deed7b1bfb79c2c82d
> {models.py:1788} ERROR - Extra data: line 2 column 1 (char 70)
> Traceback (most recent call last):
> line 1657, in _run_raw_task, result = task_copy.execute(context=context)
> line 205, in execute output = json.loads(line)
> line 354, in loads, return _default_decoder.decode(s)
> line 342, in decode, raise JSONDecodeError("Extra data", s, end)
> json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 70)
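
The `Extra data` message in the traceback is what `json.loads` reports when a string contains more than one JSON document. A minimal reproduction follows; the payload is made up for illustration and is not the actual Docker output, and `try_parse` is a hypothetical helper.

```python
import json

# Hypothetical payload: two JSON documents in one string, separated by "\n".
payload = (
    '{"status": "Pulling from library/hello-world"}\n'
    '{"status": "Digest: example"}'
)


def try_parse(text):
    """Return (parsed, None) on success or (None, error) on JSONDecodeError."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, exc


parsed, error = try_parse(payload)
# The decoder stops after the first document and reports where the rest begins.
print(error.msg, error.lineno, error.colno)  # Extra data 2 1
```

The reported position (line 2, column 1) is the start of the second document, which matches the shape of the error in the log above.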