> How do we use a registered Hive UDF with a DataFrame? I know how to use it
> in SQL and it works fine:
>
> hiveContext.sql("select MyUDF('test') from myTable");
>
> My hiveContext.sql() query involves a group by on multiple columns, so for
> scaling purposes I am trying to convert this query into DataFrame APIs:
>
> dataframe.select("col1","col2","coln").groupBy("col1","col2","coln").count();
>
> Can we do the following: dataframe.select(MyUDF("col1"))? Please guide.
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-use-registered-Hive-UDF-in-Spark-DataFrame-tp24907.html
>