From: Joseph Torres <joseph.tor...@databricks.com>
Sent: Tuesday, May 1, 2018 1:58:54 PM
To: Ryan Blue
Cc: Thakrar, Jayesh; dev@spark.apache.org
Subject: Re: Datasource API V2 and checkpointing
I agree that Spark should fully handle state serialization and re
Reader keeps a log of all the files it's seen, so its
>>>>>> offsets can be simply indices into the log rather than huge strings
>>>>>> containing all the paths.
>>>>>>
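The file-log idea quoted above could look roughly like the sketch below. This is only an illustration, not Spark code: the class `FileLogSource` and its methods are hypothetical names, and the paths are made up. It shows why an offset can be a small integer index into the log instead of a string listing every path.

```python
# Hypothetical sketch, not Spark code: a source that logs batches of
# discovered files and exposes small integer offsets into that log.
class FileLogSource:
    def __init__(self):
        self._log = []  # entry i holds the file paths added in batch i

    def add_batch(self, paths):
        """Record newly discovered files; return the offset of that batch."""
        self._log.append(list(paths))
        return len(self._log) - 1

    def latest_offset(self):
        """Offset of the newest log entry, or -1 if nothing has been seen."""
        return len(self._log) - 1

    def files_between(self, start, end):
        """Paths logged after offset `start`, up to and including `end`."""
        return [p for batch in self._log[start + 1:end + 1] for p in batch]


source = FileLogSource()
first = source.add_batch(["/data/a.json", "/data/b.json"])
second = source.add_batch(["/data/c.json"])
# A checkpoint only needs to store the integer `second`, not every path.
new_files = source.files_between(first, second)
```

The trade-off in this sketch is that the log itself has to be durable, since an offset is meaningless without the entries it points at.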
>>>>> SPARK-23323 is orthogonal. That commit coordinator is responsible for
>>>>> ensuring that, within a single Spark job, two different tasks can't commit
>>>>> the same partition.
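As a rough illustration of the guarantee described in the quoted text, the sketch below authorizes only the first task attempt that asks to commit a given partition. It is an assumption-laden toy, not Spark's commit coordinator: `CommitCoordinator` and `can_commit` are hypothetical names, and real task attempts would be separate processes rather than calls on one object.

```python
import threading

# Hypothetical sketch, not Spark's implementation: the first task attempt
# to ask for a partition is authorized; every other attempt is refused.
class CommitCoordinator:
    def __init__(self):
        self._lock = threading.Lock()
        self._committer = {}  # partition id -> attempt id allowed to commit

    def can_commit(self, partition, attempt):
        with self._lock:
            # setdefault stores `attempt` only if no winner is recorded yet.
            winner = self._committer.setdefault(partition, attempt)
            return winner == attempt


coordinator = CommitCoordinator()
first_attempt = coordinator.can_commit(partition=0, attempt=0)
speculative = coordinator.can_commit(partition=0, attempt=1)
other_partition = coordinator.can_commit(partition=1, attempt=1)
```

In this sketch a second, speculative attempt on partition 0 is refused, while the same attempt number on a different partition is still allowed.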
>>>> On Fri, Apr 27, 2018 at 8:53 AM, Thakrar, Jayesh <
>>>> jthak...@conversantmedia.com> wrote:
>>>>
>>>> Wondering if this issue is related to SPARK-23323?
>>>>
>>>>
>>>>
>>>> Any pointers will be greatly appreciated….
>>>>
>>>>
>>> Thanks,
>>>
>>> Jayesh
>>>
>>>
>>>
>>> *From: *"Thakrar, Jayesh" <jthak...@conversantmedia.com>
>> *Date: *Monday, April 23, 2018 at 9:49 PM
>> *To: *"dev@spark.apache.org" <dev@spark.apache.org>
>> *Subject: *Datasource API V2 and checkpointing
Thanks Joseph!
From: Joseph Torres <joseph.tor...@databricks.com>
Date: Friday, April 27, 2018 at 11:23 AM
To: "Thakrar, Jayesh" <jthak...@conversantmedia.com>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: Re: Datasource API V2 and checkpointing
Wondering if this issue is related to SPARK-23323?
Any pointers will be greatly appreciated….
Thanks,
Jayesh
From: "Thakrar, Jayesh" <jthak...@conversantmedia.com>
Date: Monday, April 23, 2018 at 9:49 PM
To: "dev@spark.apache.org" <dev@spark.apache.org>
I was wondering: when checkpointing is enabled, who does the actual work,
the streaming datasource or the execution engine/driver?
I have written a small/trivial datasource that just generates strings.
After enabling checkpointing, I do see a folder being created under the
checkpoint folder, but