gene-bordegaray commented on PR #21105: URL: https://github.com/apache/datafusion/pull/21105#issuecomment-4150005635
> I got sick so fell off looking at this. I think this looks great for a first pass and we should push to wikipedia to see what the reviewers say. One note that I don't know if I have time for is that this seems to slightly over emphasize the extensibility perspective.
>
> On a quick read through I would assume this was only for building the infrastructure and could easily miss the SQL/dataframe API bits. At rerun I use datafusion (specifically datafusion-python) quite heavily but don't really know the details about our table provider (since other people build that bit). I suspect our customers will also hit this page since we generate examples for the DataFrame API in python (and are generating more SQL examples). https://rerun.io/docs/howto/query-and-transform/dataframe_operations
>
> Mostly just food for thought that there might be two distinct audiences interested in this page. People who build on datafusion and those who build data products using datafusion top level APIs. (I still think landing the page first makes sense then I or someone else can potentially try to add a section for more SQL/DataFrame API details)

Yeah, I definitely think it leans toward the infrastructure side of things as it stands (this is what I used DF for, so guilty of that 😅). I agree with getting something up first; then someone with more experience using DF for the DataFrame / SQL aspects can step in and add what they see fit.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
