Re: Spark Dataset doesn't have api for changing columns

2016-01-23 Thread Milad khajavi
How can I request this API?
See this closed issue: https://issues.apache.org/jira/browse/SPARK-12863

On Tue, Jan 19, 2016 at 10:12 PM, Michael Armbrust wrote:

> In Spark 2.0 we are planning to combine DataFrame and Dataset so that all
> the methods will be available on either class.


-- 
Milād Khājavi
http://blog.khajavi.ir
Having the source means you can do it yourself.
I tried to change the world, but I couldn’t find the source code.


Spark Dataset doesn't have api for changing columns

2016-01-19 Thread Milad khajavi
Hi Spark users,

When I want to map the result of count on groupBy, I need to convert the
result to a DataFrame, then change the column names and map the result to a new
case class. Why doesn't the Spark Dataset API provide this functionality directly?

case class LogRow(id: String, location: String, time: Long)
case class KeyValue(key: (String, String), value: Long)

// In Spark 1.6, toDS() and as[...] need import sqlContext.implicits._
// (already in scope in spark-shell).
val log = LogRow("1", "a", 1) :: LogRow("1", "a", 2) :: LogRow("1", "b", 3) ::
  LogRow("1", "a", 4) :: LogRow("1", "b", 5) :: LogRow("1", "b", 6) ::
  LogRow("1", "c", 7) :: LogRow("2", "a", 1) :: LogRow("2", "b", 2) ::
  LogRow("2", "b", 3) :: LogRow("2", "a", 4) :: LogRow("2", "a", 5) ::
  LogRow("2", "a", 6) :: LogRow("2", "c", 7) :: Nil
log.toDS().groupBy(l => {
  (l.id, l.location)
}).count().toDF().toDF("key", "value").as[KeyValue].show

+-----+-----+
|  key|value|
+-----+-----+
|[1,a]|    3|
|[1,b]|    3|
|[1,c]|    1|
|[2,a]|    4|
|[2,b]|    2|
|[2,c]|    1|
+-----+-----+



Re: Spark Dataset doesn't have api for changing columns

2016-01-19 Thread Michael Armbrust
In Spark 2.0 we are planning to combine DataFrame and Dataset so that all
the methods will be available on either class.
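
For illustration only, here is a minimal sketch of what the pipeline from the
original question might look like on a merged API. It assumes Spark 2.x, where
the function-based grouping is called groupByKey and its count() returns a typed
Dataset of (key, count) pairs. The object name, the local SparkSession settings,
and the shortened sample data are placeholders and do not come from this thread.

```scala
import org.apache.spark.sql.SparkSession

case class LogRow(id: String, location: String, time: Long)
case class KeyValue(key: (String, String), value: Long)

object GroupCountSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (an assumption, not from the thread).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("group-count-sketch")
      .getOrCreate()
    import spark.implicits._

    // Shortened version of the sample data from the original message.
    val log = Seq(
      LogRow("1", "a", 1L), LogRow("1", "a", 2L),
      LogRow("1", "b", 3L), LogRow("2", "a", 1L)
    ).toDS()

    // groupByKey keeps the tuple key typed, and count() yields
    // Dataset[((String, String), Long)], so a plain map builds the
    // case class without going through column names at all.
    val counts = log
      .groupByKey(l => (l.id, l.location))
      .count()
      .map { case (key, n) => KeyValue(key, n) }

    counts.show()
    spark.stop()
  }
}
```

Mapping the typed pairs directly avoids depending on the generated column
names, which is the step that required the toDF("key", "value") round trip in
the question.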
