From the online docs: https://spark.apache.org/docs/1.6.2/api/java/org/apache/spark/sql/Row.html#apply(int)

    java.lang.Object apply(int i)
    Returns the value at position i. If the value is null, null is returned.
    (The docs then list a mapping between Spark SQL types and return types.)

So it's returning the content at position 0: here, the first (and only)
element of the length-1 Array of Rows that collect returned, i.e. the Row
holding the Date.
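A rough sketch of the whole chain, reusing the ll_18740868 DataFrame and
HASHTAG variable from the thread below (getDate(0) assumes the aggregate
column really is a date type):

    import org.apache.spark.sql.functions.{col, max}

    val rows = ll_18740868
      .filter(col("transactiondescription").contains(HASHTAG))
      .agg(max("transactiondate"))
      .collect()             // Array[Row], here Array([2015-12-15])

    val row = rows.apply(0)  // Array.apply(0): the first (and only) Row
    row.apply(0)             // Row.apply(0): the value at position 0, as Any
    row.getDate(0)           // typed accessor: java.sql.Date = 2015-12-15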
Nicholas Szandor Hakobian
Data Scientist
Rally Health
nicholas.hakob...@rallyhealth.com
M: 510-295-7113


On Sat, Jul 30, 2016 at 11:41 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Thanks, gents.
>
> I am trying to understand this better.
>
> As I understand it, a DataFrame is basically the equivalent of a table in
> relational terms.
>
> So:
>
> scala> var maxdate =
> ll_18740868.filter(col("transactiondescription").contains(HASHTAG)).agg(max("transactiondate"))
> maxdate: org.apache.spark.sql.DataFrame = [max(transactiondate): date]
>
> So I find the max(transactiondate) for the filter I have applied. In SQL
> terms: select max(transactiondate) from ll_18740868 where
> transactiondescription like "%HASHTAG%"
>
> Now I want to store it in a single variable and work with it:
>
> scala> var maxdate =
> ll_18740868.filter(col("transactiondescription").contains(HASHTAG)).agg(max("transactiondate")).collect
> maxdate: Array[org.apache.spark.sql.Row] = Array([2015-12-15])
>
> Now I have the value stored in a Row, and I get it as follows. It is the
> first column of the Row (actually the only column), in date format:
>
> scala> var maxdate =
> ll_18740868.filter(col("transactiondescription").contains(HASHTAG)).agg(max("transactiondate")).collect.apply(0).getDate(0)
> maxdate: java.sql.Date = 2015-12-15
>
> What is the role of apply(0) here?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 31 July 2016 at 03:28, Xinh Huynh <xinh.hu...@gmail.com> wrote:
>
>> Hi Mich,
>>
>> I think you were missing a step. Your result was:
>>
>> maxdate: org.apache.spark.sql.Row = [2015-12-15]
>>
>> Since maxdate is of type Row, you would want to extract the first column
>> of the Row with:
>>
>> val maxdateStr = maxdate.getString(0)
>>
>> assuming the column type is String. The API doc is here:
>> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Row
>>
>> Then you can do the query:
>>
>> col("transactiondate") === maxdateStr
>>
>> Xinh
>>
>> On Sat, Jul 30, 2016 at 5:20 PM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> select *
>>> from (select *,
>>>              rank() over (order by transactiondate desc) r
>>>       from ll_18740868
>>>       where transactiondescription = 'XYZ'
>>>      ) ranked
>>> where r = 1
>>>
>>> Hi Mich,
>>>
>>> If using SQL is fine, you can try the code above. You need to register
>>> ll_18740868 as a temp table.
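>>>
>>> A rough end-to-end sketch of that, assuming a HiveContext named
>>> sqlContext (window functions such as rank() need one in Spark 1.6), and
>>> with the filter spelled as a like to match the contains() used elsewhere
>>> in the thread:
>>>
>>>     // register the DataFrame so it is visible to SQL
>>>     ll_18740868.registerTempTable("ll_18740868")
>>>
>>>     val latest = sqlContext.sql("""
>>>       select * from (
>>>         select *, rank() over (order by transactiondate desc) r
>>>         from ll_18740868
>>>         where transactiondescription like '%XYZ%'
>>>       ) ranked
>>>       where r = 1""")
>>>     latest.show()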
>>>
>>> On Sun, Jul 31, 2016 at 6:49 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to find out when it was the last time I paid a company
>>>> with my debit card.
>>>>
>>>> This is the way I do it:
>>>>
>>>> 1) Find the date when I paid last
>>>> 2) Find the rest of the details from the row(s)
>>>>
>>>> So:
>>>>
>>>> var HASHTAG = "XYZ"
>>>>
>>>> scala> var maxdate =
>>>> ll_18740868.filter(col("transactiondescription").contains(HASHTAG)).agg(max("transactiondate")).collect.apply(0)
>>>> maxdate: org.apache.spark.sql.Row = [2015-12-15]
>>>>
>>>> OK, so it was 2015-12-15.
>>>>
>>>> Now I want to get the rest of the columns. This one works when I
>>>> hard-code the maxdate:
>>>>
>>>> scala> ll_18740868.filter(col("transactiondescription").contains(HASHTAG)
>>>> && col("transactiondate") === "2015-12-15").select("transactiondate",
>>>> "transactiondescription", "debitamount").show
>>>> +---------------+----------------------+-----------+
>>>> |transactiondate|transactiondescription|debitamount|
>>>> +---------------+----------------------+-----------+
>>>> |     2015-12-15|      XYZ LTD CD 4636 |      10.95|
>>>> +---------------+----------------------+-----------+
>>>>
>>>> Now, if I want to use the var maxdate in place of "2015-12-15", how
>>>> would I do that?
>>>>
>>>> I tried lit(maxdate) etc., but they all give me an error:
>>>>
>>>> java.lang.RuntimeException: Unsupported literal type class
>>>> org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
>>>> [2015-12-15]
>>>>
>>>> Thanks
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
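Putting the replies together: lit(maxdate) fails above because maxdate is a
whole Row (a GenericRowWithSchema), not a date, so there is no literal type
for it. Extract the scalar first and compare against that. A minimal sketch,
assuming the same DataFrame and that transactiondate is a date column:

    import org.apache.spark.sql.functions.{col, max}

    // pull the scalar out of the 1x1 aggregate result first
    val maxdate: java.sql.Date = ll_18740868
      .filter(col("transactiondescription").contains(HASHTAG))
      .agg(max("transactiondate"))
      .collect()
      .apply(0)     // the single Row
      .getDate(0)   // the single column, as java.sql.Date

    // java.sql.Date is a supported literal type, so === can wrap it itself
    ll_18740868
      .filter(col("transactiondescription").contains(HASHTAG) &&
        col("transactiondate") === maxdate)
      .select("transactiondate", "transactiondescription", "debitamount")
      .show()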