mtsadler-branch commented on issue #38681:
URL: https://github.com/apache/airflow/issues/38681#issuecomment-2034816070
To reproduce the error, I have the following PyPI packages installed:
```
apache-airflow==2.8.4
apache-airflow-providers-snowflake==5.3.1
pandas==1.3.5
pyarrow==10.0.1
snowflake-connector-python==3.5.0
```
And ran the following code:
```Python3
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook
import pandas as pd
from snowflake.connector.pandas_tools import pd_writer
hook = SnowflakeHook(
    # Add SF connection details here...
)
# Create table
temp_sf = {
    "database": "test",
    "schema": "scratch",
    "table": "temp_table",
}
table_name = f"{temp_sf['database']}.{temp_sf['schema']}.{temp_sf['table']}"
create_table_query = f"""
CREATE OR REPLACE TABLE {table_name}
(COL1 INT, COL2 TIMESTAMP_NTZ(9)) AS
SELECT 1, '2021-01-01T01:00:00.000000000'::timestamp_ntz
"""
results = hook.get_pandas_df(create_table_query)
# Append new data
new_data = pd.DataFrame({
    "COL1": [4, 5, 6],
    "COL2": [
        '2021-01-04T04:00:00.000000000',
        '2021-01-05T05:00:00.000000000',
        '2021-01-06T06:00:00.000000000',
    ],
})
engine = hook.get_sqlalchemy_engine()
with engine.connect() as conn:
    # Regardless of whether pyarrow is installed, this will append data to
    # the Snowflake table.
    # However, if pyarrow isn't installed, then COL2 will have invalid
    # timestamps.
    new_data.to_sql(
        name=table_name,
        con=conn,
        if_exists="append",
        index=False,
        method=pd_writer,
    )
# If pyarrow is installed, this will return the correct data.
# If pyarrow isn't installed, this will error:
# Timestamp '(seconds_since_epoch=1712074200000000000)' is not recognized
results = hook.get_pandas_df(f"SELECT * FROM {table_name}")
print(results)
```
**Results:**
You'll see that
`DataFrame.to_sql(con=SnowflakeHook.get_sqlalchemy_engine().connect())` works
as expected with `pyarrow==10.0.1`.
But when `pyarrow` isn't installed, the values written to `COL2` are not
valid timestamps, so any later query that reads `COL2` fails with the
"Timestamp ... is not recognized" error.
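
As a side note, the number in that error message looks like nanoseconds since the epoch being read as seconds. This is only my reading of the message, not something I have confirmed in the connector. A minimal sketch using only the standard library, with the value copied from the error above:

```python
from datetime import datetime, timezone

# Value copied from the error message in the repro above.
raw_value = 1712074200000000000

# Interpreted as nanoseconds since the epoch, it decodes to a plausible date.
seconds, nanos = divmod(raw_value, 10**9)
as_nanoseconds = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(as_nanoseconds.isoformat())

# Interpreted as seconds since the epoch, it is far outside any valid range.
try:
    datetime.fromtimestamp(raw_value, tz=timezone.utc)
    seconds_reading_is_valid = True
except (OverflowError, ValueError, OSError):
    seconds_reading_is_valid = False
print(seconds_reading_is_valid)
```

If that reading is right, the write path without `pyarrow` may be storing the timestamp in a different unit than the read path expects, but I have not verified where that would happen.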
**Open Questions:**
Is there a different pattern for inserting/appending data to an existing
table that doesn't require `pyarrow`?
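
In the meantime, one stopgap is to fail fast when `pyarrow` is missing, so the bad timestamps never reach the table. This is a sketch, not a fix; `require_module` is a hypothetical helper name I made up for illustration, and it relies only on the standard library's `importlib.util.find_spec`:

```python
import importlib.util


def require_module(name: str) -> None:
    """Raise ImportError if the top-level module `name` cannot be found.

    Hypothetical helper: it only checks that the module is importable,
    not which version is installed.
    """
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            f"{name!r} is not installed; refusing to continue to avoid "
            "writing invalid data"
        )
```

Calling `require_module("pyarrow")` right before `new_data.to_sql(...)` would stop the append with a clear error instead of silently writing invalid `COL2` values.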