Yep, it is created in driver memory. Watch for OOM if the size becomes too large; you can give the driver more memory, for example:

spark-submit --driver-memory 8G ...
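
On the global_temp lookup in Tim's message quoted below: a minimal sketch of a helper that builds the qualified view name, so the same string is used for both cacheTable and the SQL query. GlobalViews and its methods are hypothetical names of my own, not part of Spark. The two-argument variant assumes the database name may have been changed through the spark.sql.globalTempDatabase setting, whose default is global_temp.

```java
// Sketch only: GlobalViews is a hypothetical helper, not part of Spark.
final class GlobalViews {
    // Default database in which Spark registers global temp views.
    static final String DEFAULT_DB = "global_temp";

    private GlobalViews() {}

    // Qualifies a view name with the default global temp database.
    static String qualify(String viewName) {
        return qualify(DEFAULT_DB, viewName);
    }

    // Variant for setups where spark.sql.globalTempDatabase was changed.
    static String qualify(String database, String viewName) {
        if (database == null || database.isEmpty()) {
            throw new IllegalArgumentException("database must not be empty");
        }
        if (viewName == null || viewName.isEmpty()) {
            throw new IllegalArgumentException("viewName must not be empty");
        }
        return database + "." + viewName;
    }
}

// Intended use with the calls from the quoted message (needs a Spark session):
//   String view = GlobalViews.qualify("occurrence_svampe");
//   spark.catalog().cacheTable(view);
//   spark.sql("SELECT * FROM " + view);
```

The helper only concatenates strings, so it does nothing Spark-specific; the point is just to avoid the unqualified name that caused the "table not found" error.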

HTH

Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Sun, 16 Feb 2025 at 09:16, Tim Robertson <timrobertson...@gmail.com> wrote:

> Answering my own question. Global temp views get created in the
> global_temp database, so can be accessed thusly.
>
> Thanks
>
> Dataset<Row> s = spark.read().parquet("/tmp/svampeatlas/*");
> s.createOrReplaceGlobalTempView("occurrence_svampe");
> spark.catalog().cacheTable("global_temp.occurrence_svampe");
>
>
> On Sun, Feb 16, 2025 at 10:05 AM Tim Robertson <timrobertson...@gmail.com>
> wrote:
>
>> Hi folks
>>
>> Is it possible to cache a table for shared use across sessions with spark
>> connect?
>> I'd like to load a read only table once that many sessions will then
>> query to improve performance.
>>
>> This is an example of the kind of thing that I have been trying, but have
>> not succeeded with.
>>
>> SparkSession spark =
>>     SparkSession.builder().remote("sc://localhost").getOrCreate();
>> Dataset<Row> s = spark.read().parquet("/tmp/svampeatlas/*");
>>
>> // this works if it is not "global"
>> s.createOrReplaceGlobalTempView("occurrence_svampe");
>> spark.catalog().cacheTable("occurrence_svampe");
>>
>> // this fails with a table not found when a global view is used
>> spark
>>     .sql("SELECT * FROM occurrence_svampe")
>>     .write()
>>     .parquet("/tmp/export");
>>
>> Thank you
>> Tim
>>
>