zhongjiajie commented on a change in pull request #4829: [AIRFLOW-3993] Change SalesforceHook and add tests
URL: https://github.com/apache/airflow/pull/4829#discussion_r268390414
##########
File path: airflow/contrib/hooks/salesforce_hook.py
##########
@@ -174,93 +165,83 @@ def _to_timestamp(cls, col):
# if the column cannot be converted,
# just return the original column untouched
try:
- col = pd.to_datetime(col)
+ column = pd.to_datetime(column)
except ValueError:
log = LoggingMixin().log
- log.warning(
- "Could not convert field to timestamps: %s", col.name
- )
- return col
+ log.warning("Could not convert field to timestamps: %s", column.name)
+ return column
# now convert the newly created datetimes into timestamps
# we have to be careful here
# because NaT cannot be converted to a timestamp
# so we have to return NaN
converted = []
- for i in col:
+ for value in column:
try:
- converted.append(i.timestamp())
- except ValueError:
- converted.append(pd.np.NaN)
- except AttributeError:
+ converted.append(value.timestamp())
+ except (ValueError, AttributeError):
converted.append(pd.np.NaN)
- # return a new series that maintains the same index as the original
- return pd.Series(converted, index=col.index)
-
- def write_object_to_file(
- self,
- query_results,
- filename,
- fmt="csv",
- coerce_to_timestamp=False,
- record_time_added=False
- ):
+ return pd.Series(converted, index=column.index)
+
+ def write_object_to_file(self,
+ query_results,
+ filename,
+ fmt="csv",
+ coerce_to_timestamp=False,
+ record_time_added=False):
"""
Write query results to file.
Acceptable formats are:
- csv:
- comma-separated-values file. This is the default format.
+ comma-separated-values file. This is the default format.
- json:
- JSON array. Each element in the array is a different row.
+ JSON array. Each element in the array is a different row.
- ndjson:
- JSON array but each element is new-line delimited
- instead of comma delimited like in `json`
+ JSON array but each element is new-line delimited instead of comma delimited like in `json`
This requires a significant amount of cleanup.
Pandas doesn't handle output to CSV and json in a uniform way.
This is especially painful for datetime types.
- Pandas wants to write them as strings in CSV,
- but as millisecond Unix timestamps.
-
- By default, this function will try and leave all values as
- they are represented in Salesforce.
- You use the `coerce_to_timestamp` flag to force all datetimes
- to become Unix timestamps (UTC).
- This is can be greatly beneficial as it will make all of your
- datetime fields look the same,
+ Pandas wants to write them as strings in CSV, but as millisecond Unix timestamps.
+
+ By default, this function will try and leave all values as they are represented in Salesforce.
+ You use the `coerce_to_timestamp` flag to force all datetimes to become Unix timestamps (UTC).
+ This can be greatly beneficial as it will make all of your datetime fields look the same,
and makes it easier to work with in other database environments
- :param query_results: the results from a SQL query
- :param filename: the name of the file where the data
- should be dumped to
- :param fmt: the format you want the output in.
- *Default:* csv.
- :param coerce_to_timestamp: True if you want all datetime fields to be
- converted into Unix timestamps.
- False if you want them to be left in the
- same format as they were in Salesforce.
- Leaving the value as False will result
- in datetimes being strings.
+ :param query_results: the results from a SQL query
+ :type query_results: list of dict
+ :param filename: the name of the file where the data should be dumped to
+ :type filename: str
+ :param fmt: the format you want the output in.
+ *Default:* csv.
Review comment:
and below.
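
For reference, the coercion logic in the hunk above can be sketched as a standalone function (a minimal sketch, not the hook itself; it assumes pandas is importable and uses `np.nan` directly rather than the `pd.np` alias that appears in the diff):

```python
import numpy as np
import pandas as pd


def to_timestamp(column):
    """Sketch of the hook's datetime-to-Unix-timestamp coercion."""
    # If the column cannot be parsed as datetimes, return it untouched.
    try:
        column = pd.to_datetime(column)
    except ValueError:
        return column
    converted = []
    for value in column:
        try:
            # NaT raises ValueError from .timestamp(); values that are not
            # datetimes at all raise AttributeError -- both become NaN.
            converted.append(value.timestamp())
        except (ValueError, AttributeError):
            converted.append(np.nan)
    # Return a new series that keeps the same index as the original.
    return pd.Series(converted, index=column.index)


print(to_timestamp(pd.Series(["1970-01-01", None])))
```

Catching `ValueError` and `AttributeError` in one clause, as the reviewed change does, collapses the two identical branches from the old code; the behavior (NaT and non-datetime values both map to NaN) is unchanged.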
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services