Hi Vibhor,

We worked on a project to create Lucene indexes using Spark, but the project
has not been maintained for some time now. If there is interest, we can
resurrect it.

https://github.com/vsumanth10/trapezium/blob/master/dal/src/test/scala/com/verizon/bda/trapezium/dal/lucene/LuceneIndexerSuite.scala
https://www.databricks.com/session/fusing-apache-spark-and-lucene-for-near-realtime-predictive-model-building

After the Lucene indexes were created, we uploaded them to Solr for the search
UI. We did not ingest them into Elasticsearch, though.

Our scale was 100M+ rows and 100K+ columns, and Spark + Lucene worked fine.

Thank you.
Deb


On Wed, Nov 9, 2022, 10:13 AM Vibhor Gupta <vibhor.gu...@walmart.com.invalid>
wrote:

> Hi Spark Community,
>
> Is there a way to create Elasticsearch indexes offline and then import them
> into an Elasticsearch cluster?
> We are trying to load an Elasticsearch index with around 10B documents (~1.5
> to 2 TB of data) daily using Spark.
>
> I know Elasticsearch provides snapshot and restore functionality through
> GCS/S3/Azure, but is there a way to generate this snapshot offline using
> Spark?
>
> Thanks,
> Vibhor Gupta
>
