Thanks Michał. Great:

d.filter(col("id") === lit(m)).show
BTW, where are all these methods like lit etc. documented? Also I guess any action call like apply(0) or getInt(0) refers to the "current" parameter?

Regards

On 26 February 2016 at 09:42, Michał Zieliński <zielinski.mich...@gmail.com> wrote:

> You need to collect the value.
>
> val m: Int = d.agg(max($"id")).collect.apply(0).getInt(0)
> d.filter(col("id") === lit(m))
>
> On 26 February 2016 at 09:41, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Can this be done using DFs?
>>
>> scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>
>> scala> val d = HiveContext.table("test.dummy")
>> d: org.apache.spark.sql.DataFrame = [id: int, clustered: int, scattered: int, randomised: int, random_string: string, small_vc: string, padding: string]
>>
>> scala> var m = d.agg(max($"id"))
>> m: org.apache.spark.sql.DataFrame = [max(id): int]
>>
>> How can I join these two? In other words, I want to get all rows with id = m here.
>>
>> d.filter($"id" = m) ?
>>
>> Thanks
>>
>> On 25/02/2016 22:58, Mohammad Tariq wrote:
>>
>> AFAIK, this isn't supported yet. A ticket
>> <https://issues.apache.org/jira/browse/SPARK-4226> is in progress though.
>>
>> Tariq, Mohammad
>> about.me/mti
>>
>> On Fri, Feb 26, 2016 at 4:16 AM, Mich Talebzadeh <mich.talebza...@cloudtechnologypartners.co.uk> wrote:
>>
>>> Hi,
>>>
>>> I guess the following confirms that Spark does not support sub-queries:
>>>
>>> val d = HiveContext.table("test.dummy")
>>>
>>> d.registerTempTable("tmp")
>>>
>>> HiveContext.sql("select * from tmp where id IN (select max(id) from tmp)")
>>>
>>> It crashes.
>>>
>>> The SQL works OK in Hive itself on the underlying table:
>>>
>>> select * from dummy where id IN (select max(id) from dummy);
>>>
>>> Thanks
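For the archives: lit, col, and max all live in org.apache.spark.sql.functions, which is where they are documented in the Spark scaladoc. The full pattern from the thread can be sketched as below. This is untested against a live cluster and assumes an existing context bound to a table named test.dummy (in Spark 1.x a HiveContext such as the one in the thread; the names sqlContext and maxDf are illustrative):

```scala
import org.apache.spark.sql.functions.{col, lit, max}

val d = sqlContext.table("test.dummy")

// agg(...) returns a one-row DataFrame. collect brings it back as an
// Array[Row]; apply(0) takes that single Row, getInt(0) its first column.
// So apply(0)/getInt(0) index into the result, not a "current" parameter.
val m: Int = d.agg(max(col("id"))).collect.apply(0).getInt(0)

// lit() wraps the plain Scala Int back into a Column so it can be compared.
d.filter(col("id") === lit(m)).show

// Alternative that avoids collecting to the driver: join against the aggregate.
val maxDf = d.agg(max(col("id")).as("max_id"))
d.join(maxDf, d("id") === maxDf("max_id")).show
```

The join variant keeps everything on the cluster, which matters if the comparison value ever becomes more than a single scalar.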