Re: Dataset - withColumn and withColumnRenamed that accept Column type

2018-07-17 Thread Tathagata Das
Yes. Yes you can.
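
A minimal sketch of what this could look like, assuming the spark-sql-kafka-0-10 connector is on the classpath. The broker addresses, topic names, and the helper name kafkaStream are made-up placeholders, not anything from the original question:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object TwoKafkaClusters {
  // Hypothetical helper: builds one streaming DataFrame for one Kafka cluster.
  def kafkaStream(spark: SparkSession, bootstrapServers: String, topic: String): DataFrame =
    spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      .option("subscribe", topic)
      .load()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("two-kafka-clusters").getOrCreate()

    // Placeholder addresses: each source points at a different Kafka cluster.
    val dfA = kafkaStream(spark, "cluster-a-broker:9092", "topicA")
    val dfB = kafkaStream(spark, "cluster-b-broker:9092", "topicB")

    // Each DataFrame gets its own streaming query within the same application.
    dfA.selectExpr("CAST(value AS STRING) AS value").writeStream.format("console").start()
    dfB.selectExpr("CAST(value AS STRING) AS value").writeStream.format("console").start()

    spark.streams.awaitAnyTermination()
  }
}
```

Each source carries its own kafka.bootstrap.servers setting, so the two DataFrames are independent and can also be unioned or joined downstream if needed.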

On Tue, Jul 17, 2018 at 11:42 AM, Sathi Chowdhury  wrote:

> Hi,
> My question is about the ability to integrate Spark Streaming with multiple
> Kafka clusters. Is that a supported use case? For example, two topics are
> owned by different groups, and each group has its own Kafka infrastructure.
> Can I have two DataFrames, each the result of spark.readStream listening to
> a different Kafka cluster, in the same Spark streaming job?
> Has anyone solved this use case before?
>
>
> Thanks.
> Sathi
>


Re: Dataset - withColumn and withColumnRenamed that accept Column type

2018-07-17 Thread sathich
This may work (listCustomCols, df_pre and funcUDF are placeholders for your own values):

import org.apache.spark.sql.Column

val df_post = listCustomCols
  .foldLeft(df_pre) { (tempDF, listValue) =>
    tempDF.withColumn(
      listValue.name,
      new Column(listValue.name.toString + funcUDF(listValue.name))
    )
  }

and outsource the renaming to a UDF.

Or you can rename the column in one of the datasets before the join itself.
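
A small sketch of the rename-before-join option, assuming two toy DataFrames that both have "id" and "name" columns. The object name, method name, and the new column name "right_name" are invented for illustration:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RenameBeforeJoin {
  // Rename the clashing column on the right side first, then join on "id".
  def joinRenamed(left: DataFrame, right: DataFrame): DataFrame = {
    val rightRenamed = right.withColumnRenamed("name", "right_name")
    left.join(rightRenamed, Seq("id"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[1]").appName("rename-before-join").getOrCreate()
    import spark.implicits._

    val left = Seq((1, "a"), (2, "b")).toDF("id", "name")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "name")

    joinRenamed(left, right).printSchema()
    spark.stop()
  }
}
```

Joining with Seq("id") also keeps a single "id" column in the output, so no duplicate names remain after the join.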






Re: Dataset - withColumn and withColumnRenamed that accept Column type

2018-07-17 Thread Sathi Chowdhury
Hi,

My question is about the ability to integrate Spark Streaming with multiple
Kafka clusters. Is that a supported use case? For example, two topics are
owned by different groups, and each group has its own Kafka infrastructure.
Can I have two DataFrames, each the result of spark.readStream listening to
a different Kafka cluster, in the same Spark streaming job?
Has anyone solved this use case before?

Thanks.
Sathi

Dataset - withColumn and withColumnRenamed that accept Column type

2018-07-13 Thread Nirav Patel
Is there a version of withColumn or withColumnRenamed that accepts a Column
instead of a String? That way I could specify the fully qualified name (FQN)
when there are duplicate column names.

I can drop a column based on a Column-type argument, so why can't I rename
one based on the same type of argument?

The use case is that I have a DataFrame with duplicate columns at the end of a join.
Most of the time I drop the duplicate, but now I need to rename one of those
columns. I cannot do it because there is no API for that. I could rename it
before the join, but that is not preferred.
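
For illustration, one way to disambiguate after the join with the existing API is to alias both sides and rename inside a select. This is only a sketch under assumed toy data: the aliases "l" and "r", the column names, and "right_name" are all invented here:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object RenameAfterJoin {
  // Alias each side so duplicate names can be addressed as "l.name" / "r.name",
  // then rename the right-hand duplicate while selecting.
  def joinAndRename(left: DataFrame, right: DataFrame): DataFrame =
    left.as("l")
      .join(right.as("r"), col("l.id") === col("r.id"))
      .select(col("l.id"), col("l.name"), col("r.name").as("right_name"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[1]").appName("rename-after-join").getOrCreate()
    import spark.implicits._

    val left = Seq((1, "a"), (2, "b")).toDF("id", "name")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "name")

    joinAndRename(left, right).show()
    spark.stop()
  }
}
```

The drawback is that every column to keep has to be listed in the select, which is part of why a Column-based rename would be convenient.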


def withColumn(colName: String, col: Column): DataFrame
Returns a new Dataset by adding a column or replacing the existing column
that has the same name.

def withColumnRenamed(existingName: String, newName: String): DataFrame
Returns a new Dataset with a column renamed.



I think there should also be this one:

def withColumnRenamed(existingName: *Column*, newName: *Column*): DataFrame
Returns a new Dataset with a column renamed.
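
Until something like that exists, a rough approximation can be built from withColumn and drop(col: Column). The sketch below is an assumption-laden workaround, not a Spark API: the implicit class and the method name renameColumn are hypothetical, the new name is taken as a String rather than a Column, the renamed column moves to the end of the schema, and it has not been checked against self-joins, where column references can be ambiguous:

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}

object ColumnRenameSyntax {
  // Hypothetical extension method, not part of Spark.
  implicit class RenameByColumn(val df: DataFrame) extends AnyVal {
    // Add a copy of the referenced column under the new name,
    // then drop the original by its Column reference.
    def renameColumn(existing: Column, newName: String): DataFrame =
      df.withColumn(newName, existing).drop(existing)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[1]").appName("rename-by-column").getOrCreate()
    import spark.implicits._

    val left = Seq((1, "a"), (2, "b")).toDF("id", "name")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "name")

    // The join output has two "name" columns; right("name") picks one of them.
    val joined = left.join(right, left("id") === right("id"))
    joined.renameColumn(right("name"), "right_name").printSchema()
    spark.stop()
  }
}
```

This relies on the same Column-based resolution that already makes drop(col) work on joined DataFrames, which is why it needs the parent DataFrame's column reference such as right("name") rather than a plain string.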
