Is it correct to say that the nodes in the DAG are RDDs and the edges are
computations?
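
To make the question concrete, here is a minimal sketch in plain Python, not
Spark code. ToyRDD, its methods, and lineage() are hypothetical names invented
for illustration. In this toy model each node keeps references to its parents
(the edges) together with the function that computes it from them, so holding
the final node is enough to walk the whole graph. In Spark itself, the real
counterparts to look at would be RDD.dependencies and RDD.toDebugString;
how closely this toy mirrors Spark's internals is an assumption.

```python
from typing import Callable, List, Optional


class ToyRDD:
    """Hypothetical stand-in for an RDD: a node that remembers its parents."""

    def __init__(self, name: str, parents: Optional[List["ToyRDD"]] = None,
                 compute: Optional[Callable[..., list]] = None,
                 data: Optional[list] = None):
        self.name = name
        self.parents = parents or []   # edges: references to upstream nodes
        self.compute = compute         # how to build this node from its parents
        self.data = data               # only set on source nodes

    def map(self, f: Callable, name: str) -> "ToyRDD":
        return ToyRDD(name, [self], lambda rows: [f(x) for x in rows])

    def filter(self, pred: Callable, name: str) -> "ToyRDD":
        return ToyRDD(name, [self], lambda rows: [x for x in rows if pred(x)])

    def union(self, other: "ToyRDD", name: str) -> "ToyRDD":
        return ToyRDD(name, [self, other], lambda a, b: a + b)

    def collect(self) -> list:
        # Nothing runs until collect(); it recursively evaluates the parents.
        if self.data is not None:
            return list(self.data)
        inputs = [p.collect() for p in self.parents]
        return self.compute(*inputs)


def lineage(rdd: ToyRDD, depth: int = 0) -> List[str]:
    """Walk the graph from the final node back to the sources."""
    lines = ["  " * depth + rdd.name]
    for parent in rdd.parents:
        lines.extend(lineage(parent, depth + 1))
    return lines


source = ToyRDD("source", data=[1, 2, 3, 4])
doubled = source.map(lambda x: x * 2, "doubled")
evens = source.filter(lambda x: x % 2 == 0, "evens")
result = doubled.union(evens, "result")

print("\n".join(lineage(result)))
print(result.collect())
```

Here "source" appears twice in the printed lineage because two branches share
it, which is what makes the structure a DAG rather than a tree.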

On Thu, Apr 16, 2020 at 6:21 PM Reynold Xin <r...@databricks.com> wrote:

> The RDD is the DAG.
>
>
> On Thu, Apr 16, 2020 at 3:16 PM, Mania Abdi <abdi...@husky.neu.edu> wrote:
>
>> Hello everyone,
>>
>> I am implementing a caching mechanism for analytic workloads running on
>> top of Spark and I need to retrieve the Spark DAG right after it is
>> generated and the DAG scheduler. I would appreciate it if you could give me
>> some hints or refer me to some documents about where the DAG is
>> generated and where inputs are assigned to it. I found the DAG Scheduler class
>> <https://github.com/apache/spark/blob/55dea9be62019d64d5d76619e1551956c8bb64d0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala>
>> but I am not sure if it is a good starting point.
>>
>> Regards
>> Mania
>>
>
>
