aab200 opened a new issue, #1283:
URL: https://github.com/apache/arrow-adbc/issues/1283
Hello,
I was trying the ADBC Python Snowflake driver and I see something odd. Here is
a simple function:
```
import datetime

# Query via ADBC and time how long the fetch takes
def query_via_adbc_arrow(conn_adbc, query, return_type='arrow'):
    print("Enters query_via_adbc_arrow ")
    start_time = datetime.datetime.now()
    cursor = conn_adbc.cursor()
    cursor.execute(query)
    # Return pandas or arrow
    if return_type == 'arrow':
        df = cursor.fetch_arrow_table()
        num_rows = df.num_rows
    else:
        df = cursor.fetch_df()
        #df = cursor.fetch_arrow_table().to_pandas()
        num_rows = df.shape[0]
    cursor.close()
    end_time = datetime.datetime.now()
    execution_time = (end_time - start_time).total_seconds()
    print(f" Got {num_rows} rows in {return_type} . The execution time is {execution_time} secs ")
    print("Exits query_via_adbc_arrow ")
    # The caller assigns the result, so hand it back
    return df
```
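To make the numbers from a function like this easier to compare across runs, the call could be wrapped in a small helper that records elapsed time with a monotonic clock, the process's peak memory, and whether the call raised. This is only a sketch: `measure` is a hypothetical helper (not part of ADBC), and it assumes a Unix system, since the standard-library `resource` module is not available on Windows.

```python
import resource
import time


def measure(fn, *args, **kwargs):
    """Call fn and report its result, any error, elapsed time and peak memory.

    Hypothetical helper for illustration; it is not part of ADBC.
    """
    start = time.perf_counter()  # monotonic clock, unaffected by wall-clock changes
    result, error = None, None
    try:
        result = fn(*args, **kwargs)
    except Exception as exc:  # keep the error so repeated runs can be compared
        error = exc
    elapsed = time.perf_counter() - start
    # High-water mark of resident memory for this process so far.
    # Units are platform dependent: kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return {"result": result, "error": error, "seconds": elapsed, "peak_rss": peak_rss}
```

It would be called as `measure(query_via_adbc_arrow, conn_adbc, query, return_type='pandas')`. Because `ru_maxrss` is a high-water mark for the whole process, it never decreases between calls made in the same run.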
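Sometimes it works and sometimes it does not (the regular Snowflake Python
connector works every time).
When we run the ADBC version, we get intermittent errors. In the three
consecutive runs below, the Arrow fetch succeeds every time, but the pandas
fetch (`fetch_df`) fails twice with "cannot allocate memory" and succeeds on
the third run: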
(310) [aborissov@bhsys-data-dev-euw1c-lnx10 python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.62057 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Traceback (most recent call last):
  File "/<>/python_project2/snow_python_adbc.py", line 133, in <module>
    df = query_via_adbc_arrow(conn_adbc, "select * from private.main_td_limit_orders", return_type='pandas')
  File "/bh<>/snow_python_adbc.py", line 91, in query_via_adbc_arrow
    df = cursor.fetch_df() # Does not work well with large data running out of memory
  File "/<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1050, in fetch_df
    return self._results.fetch_df()
  File "<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1139, in fetch_df
    return self._reader.read_pandas()
  File "adbc_driver_manager/_reader.pyx", line 108, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_pandas
  File "adbc_driver_manager/_reader.pyx", line 40, in adbc_driver_manager._reader._AdbcErrorHelper.check_error
adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory
(310) [<>python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.535479 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Traceback (most recent call last):
  File "/<>/python_project2/python_project2/snow_python_adbc.py", line 133, in <module>
    df = query_via_adbc_arrow(conn_adbc, "select * from private.main_td_limit_orders", return_type='pandas')
  File "/<>/python_project2/python_project2/snow_python_adbc.py", line 91, in query_via_adbc_arrow
    df = cursor.fetch_df() # Does not work well with large data running out of memory
  File "/<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1050, in fetch_df
    return self._results.fetch_df()
  File "<>/venvs/310/lib/python3.10/site-packages/adbc_driver_manager/dbapi.py", line 1139, in fetch_df
    return self._reader.read_pandas()
  File "adbc_driver_manager/_reader.pyx", line 108, in adbc_driver_manager._reader.AdbcRecordBatchReader.read_pandas
  File "adbc_driver_manager/_reader.pyx", line 40, in adbc_driver_manager._reader._AdbcErrorHelper.check_error
adbc_driver_manager.OperationalError: UNKNOWN: [Snowflake] arrow/ipc: unknown error while reading: cannot allocate memory
(310) [<> python_project2]$ python3 snow_python_adbc.py
Enters query_via_adbc_arrow
Got 10003143 rows in arrow . The execution time is 1.560988 secs
Exits query_via_adbc_arrow
Enters query_via_adbc_arrow
Got 10003143 rows in pandas . The execution time is 2.706668 secs
Exits query_via_adbc_arrow
(310) [<> python_project2]$
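One possible way to narrow down whether the failure depends on materializing the whole result at once would be to consume the result batch by batch instead of calling `fetch_df()`. The sketch below assumes the ADBC DBAPI cursor's `fetch_record_batch()`, which returns a `pyarrow.RecordBatchReader` that can be iterated to yield batches exposing `num_rows`. The helper `count_rows_streaming` is hypothetical and takes any iterable of such batches. Whether this avoids the "cannot allocate memory" error has not been verified; it is meant as a diagnostic, not a fix.

```python
import time
from typing import Iterable, Tuple


def count_rows_streaming(batches: Iterable) -> Tuple[int, int, float]:
    """Consume batches one at a time and return (rows, batch_count, seconds).

    Hypothetical helper: `batches` is any iterable whose items expose
    `num_rows`, for example a pyarrow.RecordBatchReader.
    """
    start = time.perf_counter()
    rows = 0
    batch_count = 0
    for batch in batches:
        # Only the current batch needs to be referenced here, so earlier
        # batches can be released instead of accumulating into one table.
        rows += batch.num_rows
        batch_count += 1
    return rows, batch_count, time.perf_counter() - start
```

With an open cursor this would be used as `cursor.execute(query)` followed by `count_rows_streaming(cursor.fetch_record_batch())`. If the streaming read also fails intermittently, the problem is probably not in the pandas conversion step.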
Has anyone experienced this? Thanks.