+1

Understanding the close relationship between SQL and DataFrames in Spark was a 
key learning moment for me, but I agree that using the terms interchangeably 
can be confusing.


> On Mar 27, 2022, at 9:27 PM, Hyukjin Kwon <gurwls...@gmail.com> wrote:
> 
> *for some reason, the image looks broken (to me). I am attaching again to 
> make sure.
> 
> <Screen Shot 2022-03-25 at 12.18.14 PM.png>
> 
> On Mon, 28 Mar 2022 at 10:22, Hyukjin Kwon <gurwls...@gmail.com> wrote:
> Hi all,
> 
> I have been investigating improvements to the UI for the pandas API on 
> Spark.
> I chatted with a couple of people and decided to send an email here to 
> discuss further.
> 
> Currently, both the SQL and DataFrame APIs are shown in the “SQL” tab, as below:
> 
> 
> 
> which makes sense to developers because the DataFrame API shares the same 
> SQL core, but
> I do believe this makes less sense to end users. Please consider two more 
> points:
> 
> - Spark ML users will run the DataFrame-based MLlib API, but they will have 
>   to check the "SQL" tab.
> - The pandas API on Spark arguably has no conceptual link to SQL itself. It 
>   makes even less sense to users of the pandas API.
> 
> So I would like to propose renaming:
> - "SQL" to "SQL/DataFrame"
> - "Query" to "Execution"
> 
> There's a PR open at https://github.com/apache/spark/pull/35973. Please let 
> me know your thoughts on this.
> 
> Thanks.
