Re: Lineage between Datasets

2017-04-12 Thread Chang Chen
age". You can get that by
> calling explain(true) and look at the analyzed plan.
>
>
> On Wed, Apr 12, 2017 at 3:03 AM Chang Chen <baibaic...@gmail.com> wrote:
>
>> Hi All
>>
>> I believe that there is no lineage between datasets. Consider this case:
>
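
The explain(true) suggestion quoted in this message can be tried with a small sketch like the one below. It is an illustration only, under stated assumptions: the Person case class and its fields are hypothetical (the thread does not show its definition), an in-memory dataset stands in for the elided Parquet path, and the session runs in local mode.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical schema: the thread never shows how Person is defined.
case class Person(name: String, age: Long)

object LineageExplain {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used here only for demonstration.
    val spark = SparkSession.builder()
      .appName("lineage-explain")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for spark.read.parquet("...").as[Person].
    val people = Seq(Person("a", 25L), Person("b", 41L)).toDS()
    val ageGreatThan30 = people.filter("age > 30")

    // Prints the parsed, analyzed, and optimized logical plans
    // followed by the physical plan.
    ageGreatThan30.explain(true)

    // The analyzed plan is also reachable programmatically.
    println(ageGreatThan30.queryExecution.analyzed)

    spark.stop()
  }
}
```

In the printed analyzed plan, the filter node sits on top of the plan of the dataset it was derived from, which is one way to inspect how the second Dataset relates to the first.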

Re: Lineage between Datasets

2017-04-12 Thread Reynold Xin
i All
>
> I believe that there is no lineage between datasets. Consider this case:
>
> val people = spark.read.parquet("...").as[Person]
>
> val ageGreatThan30 = people.filter("age > 30")
>
> Since the second DS can push down the condition, they are obvious

Lineage between Datasets

2017-04-12 Thread Chang Chen
Hi All

I believe that there is no lineage between datasets. Consider this case:

val people = spark.read.parquet("...").as[Person]

val ageGreatThan30 = people.filter("age > 30")

Since the second DS can push down the condition, they are obviously different logical plans