RE: Spark 1.6.0: substring on df.select

2016-05-12 Thread Bharathi Raja
Thanks Raghav. 

I have 5+ million records. I feel creating multiple columns is not an optimal way.

Please suggest any other alternate solution.
Can't we do any string operation in df.select?

Regards,
Raja
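
For reference, Spark 1.6 does ship string functions that work inside select, so no extra columns are needed. A minimal sketch, assuming the column is named col1 as in the original mail:

```scala
// Pure-Scala equivalent of Pig's SUBSTRING with LAST_INDEX_OF("/"):
// take everything after the last '/'.
def lastSegment(path: String): String =
  path.substring(path.lastIndexOf('/') + 1)

// In Spark 1.6 the same extraction works directly inside select, with no
// intermediate columns (sketch, not compiled against a cluster here):
//
//   import org.apache.spark.sql.functions.substring_index
//   df.select(substring_index(df("col1"), "/", -1).as("method"))
```

`substring_index(col, "/", -1)` returns the substring after the last delimiter, which matches the Pig idiom directly.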

From: Raghavendra Pandey
Sent: 11 May 2016 09:04 PM
To: Bharathi Raja
Cc: User
Subject: Re: Spark 1.6.0: substring on df.select

You can create a column with the count of "/". Then take the max of it and create 
that many columns for every row, with null fillers. 
Raghav 
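
That padding idea can be sketched in plain Scala (helper name is illustrative; in Spark you would compute the max with an aggregate and then generate the columns):

```scala
// Split each value on '/' and pad every row's parts to the maximum count
// with nulls, so each row yields the same number of columns.
def padToMax(values: Seq[String]): Seq[Seq[String]] = {
  val split = values.map(_.split("/", -1).toSeq)
  val max   = split.map(_.length).max
  split.map(parts => parts ++ Seq.fill(max - parts.length)(null))
}
```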
On 11 May 2016 20:37, "Bharathi Raja"  wrote:
Hi,
 
I have a dataframe column col1 with values something like 
“/client/service/version/method”. The number of “/” is not constant. 
Could you please help me to extract all methods from the column col1?
 
In Pig I used SUBSTRING with LAST_INDEX_OF(“/”).
 
Thanks in advance.
Regards,
Raja



Spark 1.6.0: substring on df.select

2016-05-11 Thread Bharathi Raja
Hi,

I have a dataframe column col1 with values something like 
“/client/service/version/method”. The number of “/” is not constant. 
Could you please help me to extract all methods from the column col1?

In Pig I used SUBSTRING with LAST_INDEX_OF(“/”).

Thanks in advance.
Regards,
Raja


RE: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe

2015-12-24 Thread Bharathi Raja
Thanks Eran, I'll check the solution.

Regards,
Raja

-Original Message-
From: "Eran Witkon" 
Sent: ‎12/‎24/‎2015 4:07 PM
To: "Bharathi Raja" ; "Gokula Krishnan D" 

Cc: "user@spark.apache.org" 
Subject: Re: How to Parse & flatten JSON object in a text file using 
Spark&Scala into Dataframe

raja! I found the answer to your question! 
Look at 
http://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes
this is what you (and I) were looking for.
General idea: you read the line as text, where ProjectDetails is just a string 
field, then you build the JSON string representation of the whole line, and you 
end up with a nested JSON schema which Spark SQL can read.


Eran
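
The general idea above can be sketched like this, assuming the line format from Raja's mail, "(id, name, {json})"; the output field names here are illustrative:

```scala
// Sketch of the idea: turn one raw line of the form
//   (123456, Employee1, {"ProjectDetails": [...]})
// into a single self-contained JSON document that sqlContext.read.json
// could then parse with a nested schema.
def toJsonLine(line: String): String = {
  val body = line.trim.stripPrefix("(").stripSuffix(")")
  // Split on the first two commas only: id, name, then the JSON remainder.
  val Array(id, name, json) = body.split(",", 3).map(_.trim)
  s"""{"employeeID": $id, "Name": "$name", ${json.stripPrefix("{").stripSuffix("}")}}"""
}
```

The rewritten lines could then be read back with `sqlContext.read.json(rdd)` to get the nested schema.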


On Thu, Dec 24, 2015 at 10:26 AM Eran Witkon  wrote:

I don't have the exact answer for you, but I would look at the 
explode method on DataFrame.  


On Thu, Dec 24, 2015 at 7:34 AM Bharathi Raja  wrote:

Thanks Gokul, but the file I have has the same format as I mentioned. 
The first two columns are not in JSON format.

Thanks,
Raja


From: Gokula Krishnan D
Sent: ‎12/‎24/‎2015 2:44 AM
To: Eran Witkon
Cc: raja kbv; user@spark.apache.org

Subject: Re: How to Parse & flatten JSON object in a text file using Spark 
&Scala into Dataframe


You can try this, but I slightly modified the input structure since the first two 
columns were not in JSON format. 






Thanks & Regards, 
Gokula Krishnan (Gokul)


On Wed, Dec 23, 2015 at 9:46 AM, Eran Witkon  wrote:

Did you get a solution for this?


On Tue, 22 Dec 2015 at 20:24 raja kbv  wrote:

Hi,


I am new to Spark.


I have a text file with below structure.


 
(employeeID: Int, Name: String, ProjectDetails: JsonObject{[{ProjectName, 
Description, Duration, Role}]})
Eg:
(123456, Employee1, {“ProjectDetails”: [
  {“ProjectName”: “Web Development”, “Description”: “Online Sales website”, “Duration”: “6 Months”, “Role”: “Developer”},
  {“ProjectName”: “Spark Development”, “Description”: “Online Sales Analysis”, “Duration”: “6 Months”, “Role”: “Data Engineer”},
  {“ProjectName”: “Scala Training”, “Description”: “Training”, “Duration”: “1 Month”}
]})
 
 
Could someone help me to parse & flatten the record into the below dataframe 
using Scala?
 
employeeID, Name, ProjectName, Description, Duration, Role
123456, Employee1, Web Development, Online Sales website, 6 Months, Developer
123456, Employee1, Spark Development, Online Sales Analysis, 6 Months, Data Engineer
123456, Employee1, Scala Training, Training, 1 Month, null
 


Thank you in advance.


Regards,
Raja
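
Once the record is available as nested data, the flattening itself is the classic explode pattern. A sketch with a plain-Scala stand-in for the data (the equivalent DataFrame calls are shown in comments; case class and column names are illustrative):

```scala
// In Spark the same flattening would be roughly:
//   df.select($"employeeID", $"Name", explode($"ProjectDetails").as("p"))
//     .select($"employeeID", $"Name", $"p.ProjectName", $"p.Description",
//             $"p.Duration", $"p.Role")
case class Project(name: String, description: String, duration: String, role: Option[String])
case class Employee(id: Int, name: String, projects: Seq[Project])

// One output row per (employee, project); a missing Role becomes null.
def flatten(e: Employee): Seq[(Int, String, String, String, String, String)] =
  e.projects.map(p => (e.id, e.name, p.name, p.description, p.duration, p.role.orNull))
```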

How to ignore case in dataframe groupby?

2015-12-24 Thread Bharathi Raja
Hi,
Values in a dataframe column named countrycode are in different cases, e.g. (US, 
us). groupBy & count gives two rows, but the requirement is to ignore case for 
this operation.
1) Is there a way to ignore case in groupBy? Or
2) Is there a way to update the dataframe column countrycode to uppercase?

Thanks in advance.

Regards,
Raja
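
Both options are straightforward with Spark's upper function. A sketch with a pure-Scala stand-in for the grouping, and the DataFrame calls in comments:

```scala
// DataFrame equivalents in Spark 1.6 (either answers the question):
//   1) df.groupBy(upper($"countrycode")).count()
//   2) df.withColumn("countrycode", upper($"countrycode"))
// Pure-Scala stand-in showing the same case-insensitive grouping:
def countIgnoreCase(codes: Seq[String]): Map[String, Int] =
  codes.groupBy(_.toUpperCase).map { case (code, rows) => code -> rows.size }
```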

RE: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe

2015-12-23 Thread Bharathi Raja
Thanks Gokul, but the file I have has the same format as I mentioned. 
The first two columns are not in JSON format.

Thanks,
Raja

-Original Message-
From: "Gokula Krishnan D" 
Sent: ‎12/‎24/‎2015 2:44 AM
To: "Eran Witkon" 
Cc: "raja kbv" ; "user@spark.apache.org" 

Subject: Re: How to Parse & flatten JSON object in a text file using Spark 
&Scala into Dataframe

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

RE: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe

2015-12-23 Thread Bharathi Raja
Hi Eran, I didn't get the solution yet. 

Thanks,
Raja

-Original Message-
From: "Eran Witkon" 
Sent: ‎12/‎23/‎2015 8:17 PM
To: "raja kbv" ; "user@spark.apache.org" 

Subject: Re: How to Parse & flatten JSON object in a text file using Spark 
&Scala into Dataframe

Did you get a solution for this?
