ryansun96 commented on issue #1354:
URL:
https://github.com/apache/datafusion-python/issues/1354#issuecomment-3808653833
Hi @timsaucer, thanks for your reply! Happy to share more context:
We are rewriting a Python system in Rust, using a phased approach. In
simplified terms, our current system looks like this:
```python
import pandas as pd


class BaseCls:
    def method1(self, df: pd.DataFrame):
        pass  # Some pandas logic

    def method2(self):
        pass
```
To make the change as transparent as possible, in Phase 1 we plan to
implement certain compute- and memory-intensive operations in Rust, using
DataFusion, exposed via PyO3 bindings. In other words, what we are trying to do
is:
```python
class BaseCls:
    def method1(self, df: pd.DataFrame):
        # Now a native method implemented in Rust,
        # using DataFusion for various data frame operations
        ...

    def method2(self):
        pass  # Same as before, implemented in Python
```
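As a rough illustration of what "transparent" means for callers, here is a minimal sketch of a Phase 1 wrapper that keeps the public class unchanged and only swaps what runs behind `method1`. Everything beyond `BaseCls`, `method1`, and `method2` is an assumption made for the example: the extension module name `basecls_native`, the helper `load_backend`, the fallback `_method1_pandas`, and the idea of keeping a pandas fallback at all are hypothetical, not part of our actual system.

```python
import importlib


def load_backend(module_name: str = "basecls_native"):
    """Return the compiled extension module if it is importable, else None.

    `basecls_native` is a hypothetical name for the PyO3-built module.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class BaseCls:
    def __init__(self, backend=None):
        # Allow a backend to be injected (useful in tests); otherwise try
        # to load the hypothetical native extension.
        self._native = backend if backend is not None else load_backend()

    def method1(self, df):
        # Callers keep the same signature; only the implementation differs.
        if self._native is not None:
            return self._native.method1(df)
        return self._method1_pandas(df)

    def _method1_pandas(self, df):
        # Placeholder for the existing pandas logic (assumed fallback).
        return df

    def method2(self):
        pass  # Same as before, implemented in Python
```

With this shape, existing call sites such as `BaseCls().method1(df)` would not need to change when the native module is installed; whether a fallback path is worth keeping is a separate design choice.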
Thus, we would like to access and manipulate the DataFrame from Rust, rather
than through the exposed Python bindings. We will implement Rust UDFs, object
store registries, etc., without using those items from Python at all.
What is your recommended approach for this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]