datapythonista commented on PR #750: URL: https://github.com/apache/datafusion-python/pull/750#issuecomment-2223182914
I've been doing some research, and I think most of the usability improvements I suggested above would be better done in the Rust/PyO3 code, not in Python. For example, treating plain Python objects as literals by default (e.g. `col("age") + 1` being equivalent to `col("age") + lit(1)`) should be quite easy by implementing a trait with a single function in PyO3. `expr.cast(float)` would be pretty much the same.

In PyO3 there is already a decent amount of Python logic, for example the implementation of dunder methods like `__add__` to support operators like `col("age") + lit(1)`.

I agree with your point that it doesn't make a significant difference to have the docs in Python or in PyO3, since both are wrappers of the actual implementation in DataFusion itself. So, in my opinion, neither the docs nor the typing validation change much depending on whether they are implemented in Python or in Rust/PyO3.

As for the extended logic to make the API more Pythonic, I personally think keeping it in Rust is an advantage: there is already a decent amount there, and it doesn't seem much more complicated to implement it in Rust (at least for the cases discussed until now). I also think it will be hard to know what belongs on which side (Python or Rust) if we start implementing some of it in Python, unless you have a clear criterion for that (I personally don't).

So, the main points to me seem to be:

- Having Python wrappers helps users who read the code get an intuition of what's available (personally I think this is not important if there are good API docs, but maybe it does make a difference for other users).
- Not having the Python wrappers means a lot less code to maintain, code that would otherwise need to be kept in sync with the PyO3 API.
- I assume certain features will be faster or easier to implement in Python than in Rust/PyO3, particularly complex ones. I don't think we have identified any yet, right? I assume implementing the named struct in PyO3 wouldn't be too complex, no?
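To make the two behaviours discussed above concrete (plain Python values coerced to literals, and `cast` accepting a Python type), here is a minimal pure-Python sketch of the intended semantics. It is not the datafusion-python implementation and says nothing about which side (Python or Rust/PyO3) the logic should live on: `Expr` here is a toy stand-in class, `coerce` is a hypothetical helper, and the `_PY_TYPES` mapping (for example `float` to `"Float64"`) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical mapping from Python types to type names (illustration only).
_PY_TYPES = {int: "Int64", float: "Float64", str: "Utf8", bool: "Boolean"}


@dataclass(frozen=True)
class Expr:
    """Toy expression node standing in for the real PyO3-backed Expr."""

    text: str

    @staticmethod
    def coerce(value: Any) -> "Expr":
        # Pass expressions through unchanged; wrap anything else as a literal.
        return value if isinstance(value, Expr) else lit(value)

    def __add__(self, other: Any) -> "Expr":
        return Expr(f"({self.text} + {Expr.coerce(other).text})")

    def __radd__(self, other: Any) -> "Expr":
        # Called for `1 + col("age")`, where the left operand is not an Expr.
        return Expr(f"({Expr.coerce(other).text} + {self.text})")

    def cast(self, to: Any) -> "Expr":
        # Accept either a Python type or an explicit type name.
        name = _PY_TYPES.get(to, to)
        return Expr(f"cast({self.text} as {name})")


def col(name: str) -> Expr:
    return Expr(f"col({name})")


def lit(value: Any) -> Expr:
    return Expr(f"lit({value!r})")


# With coercion in place, the short and the explicit spellings are equal.
same_add = col("age") + 1 == col("age") + lit(1)
same_cast = col("age").cast(float) == col("age").cast("Float64")
```

In this sketch the whole coercion lives in one function, which mirrors the point that a single conversion hook is enough for every operator to accept plain Python values.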