westonpace commented on code in PR #14516:
URL: https://github.com/apache/arrow/pull/14516#discussion_r1005808004
##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -491,6 +491,23 @@ def test_scanner(dataset, dataset_reader):
assert sorted_table['__last_in_fragment'].to_pylist() == [True] * 10
[email protected]
+def test_scanner_memory_pool(dataset):
+ # honor default pool - https://issues.apache.org/jira/browse/ARROW-18164
+ old_pool = pa.default_memory_pool()
+ # pool = pa.proxy_memory_pool(old_pool)
+ pool = pa.system_memory_pool()
Review Comment:
My guess is that the scanner erroneously returns before all of its work has
finished. Changing the thread pool or the memory pool too soon after a scan
can lead to this kind of error. The new scanner was created specifically to
avoid this problem, but it isn't the default yet (still working through some
follow-up PRs to make sure we have the same functionality).
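
To make the suspected failure mode concrete, here is a minimal stdlib-only sketch. It is not Arrow's implementation and uses no pyarrow APIs: `set_default_pool`, `scan`, and `run` are hypothetical stand-ins, and the "pool" is just a string. It only shows how work that outlives the call that started it can observe a default swapped by the caller in the meantime.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a process-wide default memory pool.
_default_pool = "old_pool"


def set_default_pool(name):
    global _default_pool
    _default_pool = name


def scan(executor, release, wait_for_all):
    """Toy scan that leaves one trailing task reading the default pool."""

    def trailing_work():
        release.wait()        # cleanup that runs after results were handed back
        return _default_pool  # the pool this straggling task ends up seeing

    future = executor.submit(trailing_work)
    if wait_for_all:
        # Assumed fixed behaviour: finish everything before returning.
        release.set()
        future.result()
    return future


def run(wait_for_all):
    set_default_pool("scan_pool")
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = scan(executor, release, wait_for_all)
        # The caller restores the previous default as soon as scan() returns.
        set_default_pool("old_pool")
        release.set()
        return future.result()


if __name__ == "__main__":
    print("returns early:", run(wait_for_all=False))
    print("waits for all:", run(wait_for_all=True))
```

When `scan` returns early, the trailing task sees the restored default rather than the pool that was active when the scan started. When `scan` waits for all of its work, the task still sees the scan-time pool.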
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.