check if empty efficiently

2019-06-26 Thread SNEHASISH DUTTA
Hi, which is more efficient? This has been defined since 2.4.0: *def isEmpty: Boolean = withAction("isEmpty", limit(1).groupBy().count().queryExecution) { plan => plan.executeCollect().head.getLong(0) == 0 }* or *df.head(1).isEmpty*? I am checking whether a DF is empty and it is taking forever.
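[Editor's note, not part of the original message: a minimal sketch contrasting the two emptiness checks, assuming Spark 2.4+ and a tiny in-memory DataFrame built only for illustration.]

```scala
import org.apache.spark.sql.SparkSession

object EmptyCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("empty-check").getOrCreate()
    import spark.implicits._

    val df = Seq.empty[(Int, String)].toDF("id", "name")

    // Built-in since 2.4.0: limit(1).groupBy().count() collapses to one row on the driver.
    val empty1 = df.isEmpty

    // Older idiom: fetch at most one row and test the local array.
    val empty2 = df.head(1).isEmpty

    println(s"isEmpty=$empty1 head(1).isEmpty=$empty2")
    spark.stop()
  }
}
```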

Generic Dataset[T] Query

2019-05-09 Thread SNEHASISH DUTTA
Hi, I am trying to write a generic method which will return custom-type datasets as well as spark.sql.Row. def read[T](params: Map[String, Any])(implicit encoder: Encoder[T]): Dataset[T] is my method signature, which works fine for custom types, but when I try to obtain a Dataset[Row]
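[Editor's note, not part of the original message: a sketch of one way to make such a generic reader also produce Dataset[Row]. The signature is simplified to a path argument, the case class, file name and schema are illustrative, and RowEncoder is the Spark 2.x-era helper for building an explicit Encoder[Row].]

```scala
import org.apache.spark.sql.{Dataset, Encoder, Row, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types.{StringType, StructField, StructType}

case class Person(name: String)

object GenericRead {
  def read[T](path: String)(implicit spark: SparkSession, encoder: Encoder[T]): Dataset[T] =
    spark.read.json(path).as[T]

  def main(args: Array[String]): Unit = {
    implicit val spark: SparkSession = SparkSession.builder().appName("generic-read").getOrCreate()
    import spark.implicits._

    // Custom type: the implicit Encoder[Person] comes from spark.implicits._
    val people: Dataset[Person] = read[Person]("people.json")

    // Dataset[Row]: there is no implicit Encoder[Row], so supply one explicitly.
    val schema = StructType(Seq(StructField("name", StringType)))
    implicit val rowEncoder: Encoder[Row] = RowEncoder(schema)
    val rows: Dataset[Row] = read[Row]("people.json")

    spark.stop()
  }
}
```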

Re: Handle Null Columns in Spark Structured Streaming Kafka

2019-04-30 Thread SNEHASISH DUTTA
spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/DataFrameNaFunctions.html — quoting Shixiong (Ryan) Zhu (Mon, Apr 29, 2019): "Hey Snehasish, do you have a reproducer for this"

Handle Null Columns in Spark Structured Streaming Kafka

2019-04-24 Thread SNEHASISH DUTTA
Hi, while writing to Kafka using Spark Structured Streaming, if all the values in a certain column are null, the column gets dropped. Is there any way to override this, other than using the na.fill functions? Regards, Snehasish
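[Editor's note, not part of the original message: a sketch of the na.fill workaround discussed in the thread. The column names, topics and bootstrap servers are illustrative; the point is that to_json omits null fields, so an all-null column vanishes from the Kafka payload unless the nulls are filled first.]

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{struct, to_json}

object NullColumnsToKafka {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("null-cols-kafka").getOrCreate()
    import spark.implicits._

    val source = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "input")
      .load()
      .selectExpr("CAST(value AS STRING) AS raw")

    // Suppose parsing yields columns a and b, where b is entirely null for a batch.
    val parsed = source.selectExpr("raw AS a", "CAST(NULL AS STRING) AS b")

    // Fill nulls so the field still appears in the JSON written to Kafka.
    val filled = parsed.na.fill("", Seq("b"))

    val query = filled
      .select(to_json(struct($"a", $"b")).alias("value"))
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("topic", "output")
      .option("checkpointLocation", "/tmp/chk")
      .start()

    query.awaitTermination()
  }
}
```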

Shuffling Data After Union and Write

2018-04-13 Thread SNEHASISH DUTTA
Hi, I am currently facing an issue while performing a union on three data frames, say df1, df2, df3. Once the operation is performed and I try to save the data, the data is getting shuffled, so the ordering of data in df1, df2, df3 is not maintained. When I save the data as a text/csv file the
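[Editor's note, not part of the original message: a minimal sketch of one common way to keep the order deterministic, assuming df1/df2/df3 share a schema. Union itself gives no ordering guarantee, so each frame is tagged with a source index and the result is sorted before writing.]

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

object OrderedUnion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ordered-union").getOrCreate()
    import spark.implicits._

    val df1 = Seq(("a", 1)).toDF("key", "value")
    val df2 = Seq(("b", 2)).toDF("key", "value")
    val df3 = Seq(("c", 3)).toDF("key", "value")

    val combined = df1.withColumn("src", lit(1))
      .union(df2.withColumn("src", lit(2)))
      .union(df3.withColumn("src", lit(3)))

    combined
      .coalesce(1)                   // single output partition, single file
      .sortWithinPartitions("src")   // restore df1, df2, df3 order
      .drop("src")
      .write.mode("overwrite").csv("output")

    spark.stop()
  }
}
```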

Access Table with Spark Dataframe

2018-03-20 Thread SNEHASISH DUTTA
Hi, I am using Spark 2.2; a table fetched from a database contains a dot (.) in one of the column names. Whenever I try to select that particular column I get a query analysis exception. I have tried creating a temporary table using createOrReplaceTempView() and fetching the column's
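[Editor's note, not part of the original message: a sketch of the usual backtick escaping for dotted column names; the column name price.usd and the view name are illustrative. Without backticks, Spark parses the dot as a struct field access, which is what triggers the analysis exception.]

```scala
import org.apache.spark.sql.SparkSession

object DottedColumn {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("dotted-column").getOrCreate()
    import spark.implicits._

    val df = Seq((1, 9.99)).toDF("id", "price.usd")

    // DataFrame API: backticks inside the column string.
    df.select("`price.usd`").show()

    // SQL over a temp view: same backtick quoting.
    df.createOrReplaceTempView("items")
    spark.sql("SELECT `price.usd` FROM items").show()

    spark.stop()
  }
}
```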

CSV use case

2018-02-21 Thread SNEHASISH DUTTA
Hi, I am using the Spark 2.2 CSV reader. I have data in the following format: 123|123|"abc"||""|"xyz", where || is null and "" is one blank character as per the requirement. I was using option sep as pipe and option quote as "". I parsed the data and, using regex, was able to fulfill all the mentioned
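[Editor's note, not part of the original message: a sketch of one way to read such a file while telling unquoted empty fields (null) apart from quoted empty strings (blank). The file name is illustrative, and the emptyValue option only appeared in later 2.x releases, so on 2.2 a post-read regex/when() cleanup, as the thread describes, may still be needed.]

```scala
import org.apache.spark.sql.SparkSession

object PipeCsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("pipe-csv").getOrCreate()

    val df = spark.read
      .option("sep", "|")        // pipe-separated
      .option("quote", "\"")     // quoted fields such as "abc"
      .option("nullValue", "")   // unquoted empty field (||) becomes null
      .option("emptyValue", " ") // quoted "" becomes a single blank character (2.4+)
      .csv("data.psv")

    df.show(truncate = false)
    spark.stop()
  }
}
```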

Re: Serialize a DataFrame with Vector values into text/csv file

2018-02-20 Thread SNEHASISH DUTTA
Hi Mina, even text won't work; you may try this: df.coalesce(1).write.option("header","true").mode("overwrite").save("output", format="text"). Else convert to an RDD and use saveAsTextFile. Regards, Snehasish

Re: Serialize a DataFrame with Vector values into text/csv file

2018-02-20 Thread SNEHASISH DUTTA
df.coalesce(1).write.option("header","true").mode("overwrite").csv("output") throws java.lang.UnsupportedOperationException: CSV data source does not support struct<...> data type. Regards, Mina

Re: Serialize a DataFrame with Vector values into text/csv file

2018-02-20 Thread SNEHASISH DUTTA
Hi Mina, this might help: df.coalesce(1).write.option("header","true").mode("overwrite").csv("output") Regards, Snehasish — quoting Mina Aslani (Wed, Feb 21, 2018): "Hi, I would like to serialize a dataframe with vector values into a text/csv in pyspark."
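[Editor's note, not part of the original thread: a sketch of one way around the "CSV data source does not support struct<...>" error quoted above, by stringifying the Vector column before writing. The column names and the formatting of the vector are illustrative.]

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object VectorToCsv {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("vector-to-csv").getOrCreate()
    import spark.implicits._

    val df = Seq((1, Vectors.dense(0.1, 0.2))).toDF("id", "features")

    // CSV cannot serialize the Vector (struct) type, so render it as a string first.
    val vecToString = udf((v: Vector) => v.toArray.mkString("[", ",", "]"))

    df.withColumn("features", vecToString(col("features")))
      .coalesce(1)
      .write.option("header", "true").mode("overwrite").csv("output")

    spark.stop()
  }
}
```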

Re: Can spark handle this scenario?

2018-02-17 Thread SNEHASISH DUTTA
Hi Lian, this could be the solution: case class Symbol(symbol: String, sector: String) case class Tick(symbol: String, sector: String, open: Double, close: Double) // symbolDS is Dataset[Symbol], pullSymbolFromYahoo returns Dataset[Tick] symbolDs.map { k =>
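[Editor's note, not part of the original message: a hedged completion of the truncated snippet above. pullSymbolFromYahoo is stubbed here and returns a plain Tick rather than a Dataset[Tick], since a Dataset cannot be constructed inside map on the executors; the values and the sample symbol are illustrative.]

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

case class Symbol(symbol: String, sector: String)
case class Tick(symbol: String, sector: String, open: Double, close: Double)

object SymbolTicks {
  // Stub standing in for the real Yahoo lookup mentioned in the thread.
  def pullSymbolFromYahoo(symbol: String, sector: String): Tick =
    Tick(symbol, sector, open = 0.0, close = 0.0)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("symbol-ticks").getOrCreate()
    import spark.implicits._

    val symbolDs: Dataset[Symbol] = Seq(Symbol("AAPL", "tech")).toDS()

    // Map each Symbol to a Tick; the Encoder[Tick] comes from spark.implicits._
    val ticks: Dataset[Tick] = symbolDs.map(k => pullSymbolFromYahoo(k.symbol, k.sector))

    ticks.show()
    spark.stop()
  }
}
```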