Well, there is one foreach for Java and another foreach for Scala -- that much I understand. But by supporting two language-specific APIs -- Scala and Java -- the Dataset API lost support for such simple calls without type annotations, so you have to be explicit about which variant you mean (and since I'm using Scala, I want the Scala API, right?). It appears that every Dataset operator that takes a single-argument function is affected :(
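For the record, what seems to happen is that, with two alternatives and no expected type, the compiler types the bare println as a call, println(), so the argument becomes Unit -- hence "cannot be applied to (Unit)". The only workaround I've found is to annotate the lambda's parameter so that only the Scala variant applies -- a minimal sketch (assuming spark-shell on 2.0.0-SNAPSHOT, with a toy Dataset[Int] standing in for the ones below):

val ds = Seq(1, 2, 3).toDS  // Dataset[Int]; spark.implicits._ is already imported in spark-shell

// ds.foreachPartition(println)  // does not compile: overloaded method cannot be applied to (Unit)
ds.foreachPartition((it: Iterator[Int]) => it.foreach(println))  // explicit Iterator[Int] selects the Scala variant
ds.foreach((n: Int) => println(n))  // the same annotation trick for foreach

Your ds.foreachPartition(_.foreach(println)) compiles too, but a standalone function value like println passed without an annotation still trips over the overloads.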
My question was whether there is any work underway to fix it (if a fix is even possible -- I don't know whether it is).

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, Jul 5, 2016 at 4:21 PM, Sean Owen <[email protected]> wrote:
> Right, should have noticed that in your second mail. But foreach
> already does what you want, right? It would be identical here.
>
> These two methods do conceptually different things on different
> arguments. I don't think I'd expect them to accept the same functions.
>
> On Tue, Jul 5, 2016 at 3:18 PM, Jacek Laskowski <[email protected]> wrote:
>> ds is a Dataset and the problem is that println (or any other
>> single-argument function) would not work here (and perhaps neither
>> would other methods with two variants -- Java's and Scala's).
>>
>> Regards,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Tue, Jul 5, 2016 at 3:53 PM, Sean Owen <[email protected]> wrote:
>>> A DStream is a sequence of RDDs, not of elements. I don't think I'd
>>> expect to express an operation on a DStream as if it were elements.
>>>
>>> On Tue, Jul 5, 2016 at 2:47 PM, Jacek Laskowski <[email protected]> wrote:
>>>> Sort of. Your example works, but could you do a mere
>>>> ds.foreachPartition(println)? Why not? Why should I even see the Java
>>>> version?
>>>>
>>>> scala> val ds = spark.range(10)
>>>> ds: org.apache.spark.sql.Dataset[Long] = [id: bigint]
>>>>
>>>> scala> ds.foreachPartition(println)
>>>> <console>:26: error: overloaded method value foreachPartition with alternatives:
>>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Long])Unit <and>
>>>>   (f: Iterator[Long] => Unit)Unit
>>>>  cannot be applied to (Unit)
>>>>        ds.foreachPartition(println)
>>>>           ^
>>>>
>>>> Regards,
>>>> Jacek Laskowski
>>>> ----
>>>> https://medium.com/@jaceklaskowski/
>>>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>
>>>>
>>>> On Tue, Jul 5, 2016 at 3:32 PM, Sean Owen <[email protected]> wrote:
>>>>> Do you not mean ds.foreachPartition(_.foreach(println)) or similar?
>>>>>
>>>>> On Tue, Jul 5, 2016 at 2:22 PM, Jacek Laskowski <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> It's with the master built today. Why can't I call
>>>>>> ds.foreachPartition(println)? Is using a type annotation the only way
>>>>>> forward? I'd be so sad if that's the case.
>>>>>>
>>>>>> scala> ds.foreachPartition(println)
>>>>>> <console>:28: error: overloaded method value foreachPartition with alternatives:
>>>>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Record])Unit <and>
>>>>>>   (f: Iterator[Record] => Unit)Unit
>>>>>>  cannot be applied to (Unit)
>>>>>>        ds.foreachPartition(println)
>>>>>>           ^
>>>>>>
>>>>>> scala> sc.version
>>>>>> res9: String = 2.0.0-SNAPSHOT
>>>>>>
>>>>>> Regards,
>>>>>> Jacek Laskowski
>>>>>> ----
>>>>>> https://medium.com/@jaceklaskowski/
>>>>>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>>>>>> Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
