Thanks, Steve, for answering in detail. I had the same impression as Chandan
from that line: it contradicted what I knew, since the rename operation
in HDFS is itself atomic, and I didn't imagine it was meant to address object
stores.
I learned a lot about object stores from your answer. Thanks again.
Jungtaek
Thanks a lot, Steve and Jungtaek, for your answers.
Steve,
You explained it really well and in depth.
I understand that the old implementation was not correct for
object stores like S3, and that the new implementation will address that. And for
better performance we should choose a Direct Write base
On 11 Aug 2018, at 17:33, chandan prakash
<chandanbaran...@gmail.com> wrote:
Hi All,
I was going through this pull request about the new CheckpointFileManager
abstraction in Structured Streaming coming in 2.4:
https://issues.apache.org/jira/browse/SPARK-23966
https://github.com/apache/spar
Removing user@, since cross-posting to multiple mailing lists is considered
bad practice.
My knowledge is based on the codebase after SPARK-23966, so I'm reading
back through SPARK-23966 and will try to explain what I can see in the patch.
Anyone, please correct me if I'm missing something here.
You may want to not
Can anyone clear up the doubts raised in the questions asked here?
Regards,
Chandan
On Sat, Aug 11, 2018 at 10:03 PM chandan prakash
wrote:
> Hi All,
> I was going through this pull request about the new CheckpointFileManager
> abstraction in Structured Streaming coming in 2.4:
> https://issues.apache.or
Hi All,
I was going through this pull request about the new CheckpointFileManager
abstraction in Structured Streaming coming in 2.4:
https://issues.apache.org/jira/browse/SPARK-23966
https://github.com/apache/spark/pull/21048
I went through the code in detail and found it will introduce a very nice