Hi Vibhor,

We worked on a project to create Lucene indexes using Spark, but the project has not been maintained for some time now. If there is interest, we can resurrect it.

https://github.com/vsumanth10/trapezium/blob/master/dal/src/test/scala/com/verizon/bda/trapezium/dal/lucene/LuceneIndexerSuite.scala
https://www.databricks.com/session/fusing-apache-spark-and-lucene-for-near-realtime-predictive-model-building

After the Lucene indexes were created, we uploaded them to Solr for the search UI. We did not ingest them into Elasticsearch, though. Our scale was 100M+ rows and 100K+ columns, and Spark + Lucene worked fine.

Thank you.
Deb

On Wed, Nov 9, 2022, 10:13 AM Vibhor Gupta <vibhor.gu...@walmart.com.invalid> wrote:

> Hi Spark Community,
>
> Is there a way to create elastic indexes offline and then import them to
> an elastic cluster ?
> We are trying to load an elastic index with around 10B documents (~1.5 to
> 2 TB data) using spark daily.
>
> I know elastic provides a snapshot restore functionality through
> GCS/S3/Azure, but is there a way to generate this snapshot offline using
> spark ?
>
> Thanks,
> Vibhor Gupta
>