https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala

You would need to implement CheckpointFileManager yourself. The interface
is tightly coupled to the Hadoop FileSystem API (the parameter and return
types of its methods are mostly Hadoop classes). That doesn't mean it's
impossible to implement CheckpointFileManager against something that is
not a filesystem, but it would be non-trivial to override all of the
methods and make everything work seamlessly.

The required consistency guarantees are documented in the javadoc of
CheckpointFileManager. Please read through it and evaluate whether your
target storage can fulfill the requirements.
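
To make the main requirement concrete: as far as I read the javadoc, the
key operation is atomic file creation, i.e. a checkpoint file must become
visible either completely or not at all. Below is a minimal sketch of that
contract in Python, using the local filesystem as a stand-in for the target
storage. The function name create_atomic and its signature are hypothetical
and only loosely mirror the Scala interface; this is an illustration of the
property to check for, not Spark's implementation.

```python
import os
import tempfile


def create_atomic(path: str, data: bytes, overwrite: bool = True) -> None:
    """Write data so that readers see either the whole file or no file.

    Hypothetical helper: the local filesystem stands in for the target storage.
    """
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Stage the bytes in a hidden temp file in the same directory, so the
    # final rename/link never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        if overwrite:
            # Atomically replaces any existing file at the final path.
            os.replace(tmp_path, path)
        else:
            # Hard-linking fails with FileExistsError if the path exists,
            # so an existing checkpoint file is never clobbered.
            os.link(tmp_path, path)
            os.unlink(tmp_path)
    except BaseException:
        # Never leave a partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A store that cannot offer an equivalent of the final rename step (for
example, one where readers may observe a half-written value) would need
some extra mechanism to provide the same guarantee.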

Thanks,
Jungtaek Lim (HeartSaVioR)

On Mon, Sep 28, 2020 at 3:04 AM Amit Joshi <mailtojoshia...@gmail.com>
wrote:

> Hi,
>
> As far as I know, it depends on whether you are using Spark Streaming or
> Structured Streaming.
> In Spark Streaming you can write your own code to checkpoint.
> But in the case of Structured Streaming it should be a file location.
> But the main question is why you want to checkpoint in
> NoSQL, as it's eventually consistent.
>
>
> Regards
> Amit
>
> On Sunday, September 27, 2020, Debabrata Ghosh <mailford...@gmail.com>
> wrote:
>
>> Hi,
>>     I have a question about Spark checkpoints: can I store the
>> checkpoints in NoSQL or Kafka instead of a filesystem?
>>
>> Regards,
>>
>> Debu
>>
>
