Hi,

Recently I had to optimize a few Apache Spark SQL queries. Some of the Datasets were reused, so I cached them. However, after caching I no longer see the SQL visualization for the cached Dataset in the Spark UI; I see only an InMemoryRelation node. The explain output at the bottom of the page still shows the full plan.
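
A minimal sketch of the scenario, assuming a local SparkSession. The data, column names, and object name are hypothetical and stand in for the real queries. After the actions run, the SQL tab in the Spark UI can be compared with the text printed by `explain`:

```scala
import org.apache.spark.sql.SparkSession

object CachedPlanRepro {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only for reproducing the UI behaviour
    val spark = SparkSession.builder()
      .appName("cached-plan-repro")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical aggregation standing in for the real, reused Dataset
    val base = spark.range(0, 1000000)
      .withColumn("bucket", $"id" % 10)
      .groupBy("bucket")
      .count()

    // cache() is lazy: it only marks the Dataset for caching
    val cached = base.cache()

    // First action after cache(): this is where the cache gets materialized
    cached.count()

    // Later action that reuses the cached data
    cached.filter($"count" > 0).count()

    // Prints the logical and physical plans as text
    cached.explain(true)

    spark.stop()
  }
}
```

The two `count()` calls are the actions whose entries in the SQL tab are of interest here.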
Is this expected behaviour? In such cases we have far fewer options for debugging performance in Spark. My suggestion is to show the full diagram on the first action after cache, or to show a separate SQL query for the cache. The second option is probably not possible, though, because cache does not trigger computation, so there would be no metrics to show. The workaround is to temporarily disable caching, but that takes a lot of time, especially on large datasets.

Pozdrawiam / Best regards,
Tomek