I want to find out how to speed up SparkSQL using Ignite. In particular, I am wondering whether it can be a replacement for Presto; my files are in Parquet format in S3.
Question: Reading the pages linked from https://ignite.apache.org/use-cases/spark/sql-queries.html, is it true that (in my case) I need to pre-load data into Ignite first (by loading my S3 files into Spark as a DataFrame and then writing that DataFrame to an Ignite DataFrame), and only then can I run SQL against the Ignite DataFrame with the benefit of indexing? In other words, for any data I want to query using SparkSQL, will I need to pre-load it into Ignite explicitly like that?

It may be difficult to anticipate which date ranges of data users will want to query and to pre-load them in advance, so I am hoping for a seamless way to boost my SparkSQL queries.

Any links/references will be appreciated :)

Anthony Mak
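
P.S. To make the pre-loading workflow I am asking about concrete, here is a rough sketch of how I understand it, not a verified setup. It assumes the ignite-spark integration jars are on the Spark classpath and uses the "ignite" data source with its "config", "table" and "primaryKeyFields" options. The function names, the S3 path, the config file path, the table name and the key column are all placeholders I made up for illustration.

```python
# Sketch only: pre-load Parquet files from S3 into an Ignite SQL table via
# Spark, then read the table back through the Ignite data source.
# `spark` is an existing SparkSession created elsewhere; every path, table
# name and column name used with these helpers is hypothetical.

def ignite_write_options(config_path, table, primary_keys):
    """Build the option map for writing a DataFrame to Ignite."""
    return {
        "config": config_path,                      # Ignite Spring XML config file
        "table": table,                             # target SQL table in Ignite
        "primaryKeyFields": ",".join(primary_keys), # key columns for table creation
    }


def preload_parquet_to_ignite(spark, s3_path, config_path, table,
                              primary_keys, mode="overwrite"):
    """Step 1 and 2: read Parquet from S3, then write it into Ignite."""
    df = spark.read.parquet(s3_path)
    (df.write
       .format("ignite")
       .options(**ignite_write_options(config_path, table, primary_keys))
       .mode(mode)
       .save())


def read_ignite_table(spark, config_path, table):
    """Step 3: expose the Ignite table as a DataFrame for SparkSQL."""
    return (spark.read
                 .format("ignite")
                 .option("config", config_path)
                 .option("table", table)
                 .load())
```

Usage would then be something like `preload_parquet_to_ignite(spark, "s3a://my-bucket/events/", "/opt/ignite/config.xml", "events", ["event_id"])`, followed by `read_ignite_table(spark, "/opt/ignite/config.xml", "events").createOrReplaceTempView("events")` and a normal `spark.sql(...)` query. This is exactly the explicit pre-load step I would like to avoid if there is a more transparent option.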
