Actually, I realized keeping the info would not be enough as I need to find
back the checkpoint files to delete them :/
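(For reference, a rough sketch of how one could list the checkpoint files in
order to delete them, assuming the directory was set earlier with
sc.setCheckpointDir; going through the Hadoop FileSystem API makes it work on
HDFS as well as locally:)

import org.apache.hadoop.fs.Path

// assumes spark.sparkContext.setCheckpointDir(...) was called earlier
val checkpointDir = new Path(spark.sparkContext.getCheckpointDir.get)
val fs = checkpointDir.getFileSystem(spark.sparkContext.hadoopConfiguration)
// each checkpointed RDD gets its own subdirectory; print them here,
// or remove one with fs.delete(status.getPath, true)
fs.listStatus(checkpointDir).foreach(status => println(status.getPath))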
2017-10-25 19:07 GMT+02:00 Bernard Jesop :
> As far as I understand, Dataset.rdd is not the same as the internal RDD.
> It is just another RDD representation of the same Dataset.
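(A small sketch to illustrate the point above, assuming Spark 2.1 with a
checkpoint directory already set: the RDD returned by Dataset.rdd is not the
RDD that checkpoint() acts on, so its isCheckpointed flag never flips.)

val df = spark.range(4).toDF("id")
val df2 = df.checkpoint()        // checkpoints a copy of the internal RDD

df.rdd.isCheckpointed            // false: df.rdd was never checkpointed
df2.rdd.isCheckpointed           // false too: df2.rdd is yet another RDD
                                 // built on top of the checkpointed one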
So how can I know whether a Dataset has been checkpointed?
Should I manually keep track of that info?
2017-10-25 15:51 GMT+02:00 Bernard Jesop :
> Hello everyone,
>
> I have a question about checkpointing on dataset.
>
> It seems in 2.1.0 that there is a Dataset.checkpoint(), however unlike RDD
> there is no Dataset.isCheckpointed(). [...]
Hello everyone,
I have a question about checkpointing on dataset.
It seems in 2.1.0 that there is a Dataset.checkpoint(), however unlike RDD
there is no Dataset.isCheckpointed().
I wonder if Dataset.checkpoint is just syntactic sugar for
Dataset.rdd.checkpoint.
When I do:
Dataset.checkpoint; Dataset.rdd.isCheckpointed
the result is false.
>
> scala> df.show()
> +------+---+
> |    _1| _2|
> +------+---+
> | Scala| 35|
> |Python| 30|
> |     R| 15|
> |  Java| 20|
> +------+---+
>
>
> scala> df.rdd.isCheckpointed
> res4: Boolean = false
>
> scala> df.rdd.count()
> res5: Long = 4
Even after the count, the RDD is still not reported as "checkpointed".
Do you have any idea why? (knowing that the checkpoint file is created)
Best regards,
Bernard JESOP
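(A self-contained version of the session above, assuming a local Spark 2.1
shell, for anyone who wants to reproduce it. The checkpoint files do appear
on disk; only the isCheckpointed flag on Dataset.rdd stays false, because
checkpoint() returns a new Dataset and leaves df itself untouched:)

spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints") // example path
import spark.implicits._

val df = Seq(("Scala", 35), ("Python", 30), ("R", 15), ("Java", 20)).toDF
val dfc = df.checkpoint()   // eager by default: writes the checkpoint files now

df.rdd.isCheckpointed       // false, even though the files exist
dfc.rdd.isCheckpointed      // also false; to benefit from the checkpoint,
                            // keep using the returned Dataset dfc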
It seems to be because of this issue:
https://issues.apache.org/jira/browse/SPARK-10925
I added a checkpoint, as suggested, to break the lineage and it worked.
Best regards,
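(For the archives, a sketch of the shape of that fix, assuming the df / dfAgg
names from the messages below; checkpoint() materializes dfAgg and cuts the
lineage, so the join no longer sees the same attribute on both sides:)

spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints") // example path
val dfAgg = df.groupBy("S_ID")
  .agg(org.apache.spark.sql.functions.count("userName").as("usersCount"))
  .checkpoint()   // breaks the lineage, per the suggestion in SPARK-10925
val dfRes = df.join(dfAgg, Seq("S_ID"), "left_outer")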
2017-07-04 17:26 GMT+02:00 Bernard Jesop :
> Thanks Didac,
>
> My bad, actually this code is incomplete; it should also include the aggregation.
> [...] by S_ID.
>
> I guess that you are looking for something more like the following example:
> dfAgg = df.groupBy("S_ID")
>   .agg(org.apache.spark.sql.functions.count("userName").as("usersCount"),
>        org.apache.spark.sql.functions.collect_set("city").as("List..."))
Hello, I don't understand my error message.
Basically, all I am doing is:
- dfAgg = df.groupBy("S_ID")
- dfRes = df.join(dfAgg, Seq("S_ID"), "left_outer")
However I get this AnalysisException:
Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved
attribute(s) S_ID#1903L missing from [...]
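(To make the failure reproducible, a minimal sketch with made-up data and the
assumed columns S_ID / userName / city; joining a DataFrame with an aggregate
derived from itself is the pattern that trips SPARK-10925 on affected
versions:)

import org.apache.spark.sql.functions.collect_set
import spark.implicits._

val df = Seq((1L, "alice", "Paris"), (1L, "bob", "Lyon"), (2L, "carol", "Nice"))
  .toDF("S_ID", "userName", "city")

val dfAgg = df.groupBy("S_ID").agg(collect_set("city").as("cities"))
// on affected versions this join can fail with
// "resolved attribute(s) S_ID#... missing from ..."
val dfRes = df.join(dfAgg, Seq("S_ID"), "left_outer")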