alamb opened a new issue, #15915:
URL: https://github.com/apache/datafusion/issues/15915

   ### Is your feature request related to a problem or challenge?
   
   - Part of https://github.com/apache/datafusion/issues/15914
   
   @shehabgamin added the `datafusion-spark` crate in 
https://github.com/apache/datafusion/pull/15168
   
   The idea is that, using the functions in this crate, you can get a 
`SessionContext` that executes SQL with Spark semantics. However, there is no 
user-facing documentation that shows how to do this.
   
   ### Describe the solution you'd like
   
   Add an example somewhere showing how to configure and use the Spark 
functions in a `SessionContext`. I can help with this.
   
   
   ### Describe alternatives you've considered
   
   I personally suggest adding a new page to the website: 
https://datafusion.apache.org/
   
   Specifically, I suggest:
   1. Add a new page in the "Library User Guide" called "Spark Compatible Functions"
   2. Add a preamble explaining what the `datafusion-spark` crate is (it contains a list of Spark-compatible functions)
   3. Add examples
   
   For example, we should show how to run SQL using a "Spark compatible" `SessionContext`:
   ```rust
   // `ctx` is `mut` because `register_all` takes a mutable function registry
   let mut ctx = SessionContext::new();
   datafusion_spark::register_all(&mut ctx)?;
   
   // TODO run an example SQL query here that uses a function from 
   // the datafusion spark crate
   ctx.sql("select ... ").await?;
   
   // also add an example for DataFrame API
   ```
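   
   A fuller version of the SQL case might look like the sketch below. It assumes that `register_all` accepts `&mut dyn FunctionRegistry`, that `expm1` (Spark's `exp(x) - 1`) is among the functions the crate registers, and that the binary depends on `datafusion`, `datafusion-spark` and `tokio`. The helper name `run_spark_query` is made up for illustration.
   
   ```rust
   use datafusion::arrow::record_batch::RecordBatch;
   use datafusion::arrow::util::pretty::print_batches;
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   /// Hypothetical helper: runs one SQL query against a context that has
   /// the Spark functions registered and returns the collected batches.
   async fn run_spark_query(sql: &str) -> Result<Vec<RecordBatch>> {
       // Assumption: `register_all` takes `&mut dyn FunctionRegistry`,
       // which `SessionContext` implements, so the context must be `mut`.
       let mut ctx = SessionContext::new();
       datafusion_spark::register_all(&mut ctx)?;
       ctx.sql(sql).await?.collect().await
   }
   
   #[tokio::main]
   async fn main() -> Result<()> {
       // Assumption: `expm1` is provided by the datafusion-spark crate.
       let batches = run_spark_query("SELECT expm1(1.0) AS result").await?;
       print_batches(&batches)?;
       Ok(())
   }
   ```
   
   Registering the functions on a plain `SessionContext` keeps the setup to a single extra call, so the same pattern should carry over to the DataFrame API example.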
   
   
   To run the example code as part of CI, you will have to add an entry 
such as this one:
   
https://github.com/apache/datafusion/blob/81b4c074886e604277cc303986113ca12f9ac77d/datafusion/core/src/lib.rs#L928-L932
   
   
   to the [datafusion-spark lib.rs](https://github.com/apache/datafusion/blob/6bda4796a3c8142b87d0fad072bdbe25f9d12934/datafusion/spark/src/lib.rs#L16-L15)
 file. It can't go in `datafusion/core/src/lib.rs` because the core crate 
doesn't depend on `datafusion-spark`.
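   
   Assuming the linked lines are one of the `doc_comment::doctest!` registrations in the core crate, the new entry could look roughly like this fragment. The markdown path and the test name are placeholders, since the page does not exist yet, and `doc_comment` would have to be available to the `datafusion-spark` crate.
   
   ```rust
   // Hypothetical entry for datafusion/spark/src/lib.rs. The markdown path
   // and the test name are placeholders for wherever the new page ends up.
   #[cfg(doctest)]
   doc_comment::doctest!(
       "../../../docs/source/library-user-guide/spark-functions.md",
       library_user_guide_spark_functions
   );
   ```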
   
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

