Re: DataFrame defined within conditional IF ELSE statement

2016-09-18 Thread Silvio Fiorito
Oh, sorry, it was supposed to be sys.error, not sys.err.
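A note on why sys.error slots cleanly into a match branch: its return type is Nothing, a subtype of every Scala type, so the match expression's result type is still inferred from the other branches. A minimal plain-Scala sketch (no Spark; the object and method names are illustrative, not from the thread):

```scala
object SysErrorDemo {
  // sys.error(msg) throws a RuntimeException and has type Nothing,
  // so this branch does not widen the match's inferred result type.
  def pick(option: Int): String = option match {
    case 1 => "csv"
    case 2 => "orc"
    case _ => sys.error("no valid option provided")
  }
}
```

Calling SysErrorDemo.pick with an unknown option throws a RuntimeException, while pick(1) returns "csv".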




Re: DataFrame defined within conditional IF ELSE statement

2016-09-18 Thread Mich Talebzadeh
Thanks, Silvio.

This is what I ended up with:

val df = option match {
  case 1 => {
    println("option = 1")
    val df = spark.read.option("header", false).csv("hdfs://rhes564:9000/data/prices/prices.*")
    val df2 = df.map(p => columns(p(0).toString.toInt, p(1).toString, p(2).toString, p(3).toString))
    df2
  }
  case 2 => {
    println("option = 2")
    val df2 = spark.table("test.marketData").select('TIMECREATED, 'SECURITY, 'PRICE)
    df2
  }
  case 3 => {
    println("option = 3")
    val df2 = spark.table("test.marketDataParquet").select('TIMECREATED, 'SECURITY, 'PRICE)
    df2
  }
  case _ => {
    println("No valid option provided")
    sys.exit(1)
  }
}

For one reason or another, the following threw an error:

case _ => sys.err("no valid option provided")




Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.





Re: DataFrame defined within conditional IF ELSE statement

2016-09-18 Thread Silvio Fiorito
Hi Mich,

That’s because df2 is only in scope inside the if branches.

Try this:

val df = option match {
  case 1 => {
    println("option = 1")
    val df = spark.read.option("header", false).csv("hdfs://rhes564:9000/data/prices/prices.*")
    val df2 = df.map(p => columns(p(0).toString.toInt, p(1).toString, p(2).toString, p(3).toString))
    df2
  }
  case 2 => spark.table("test.marketData").select('TIMECREATED, 'SECURITY, 'PRICE)
  case 3 => spark.table("test.marketDataParquet").select('TIMECREATED, 'SECURITY, 'PRICE)
  case _ => sys.err("no valid option provided")
}

df.printSchema()
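The scoping rule above can be reproduced without Spark. In this minimal sketch (names are illustrative), a val declared inside a branch is visible only there, but using if/else as an expression lets each branch's last line become the value bound outside the branches:

```scala
object ScopeDemo {
  def label(option: Int): String =
    if (option == 1) {
      val inner = "csv" // visible only inside this branch
      inner             // last expression = the branch's value
    } else {
      "table"
    }
  // Referencing `inner` out here would fail to compile with
  // "error: not found: value inner" -- the same error as with df2.
}
```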


Thanks,
Silvio





Re: DataFrame defined within conditional IF ELSE statement

2016-09-18 Thread Mich Talebzadeh
Any opinion on this, please?

Dr Mich Talebzadeh





DataFrame defined within conditional IF ELSE statement

2016-09-17 Thread Mich Talebzadeh
In Spark 2 this gives me an error in a conditional if/else statement.

I recall seeing the same in standard SQL.

I am doing a test where different sources (text file, ORC or Parquet) are read depending on the value of var option.

I wrote this:

import org.apache.spark.sql.functions._
import java.util.Calendar
import org.joda.time._
var option = 1
val today = new DateTime()
val minutes = -15
val minutesago = today.plusMinutes(minutes).toString.substring(11, 19)
val date = java.time.LocalDate.now.toString
val hour = java.time.LocalTime.now.toString
case class columns(INDEX: Int, TIMECREATED: String, SECURITY: String, PRICE: String)













if (option == 1) {
  println("option = 1")
  val df = spark.read.option("header", false).csv("hdfs://rhes564:9000/data/prices/prices.*")
  val df2 = df.map(p => columns(p(0).toString.toInt, p(1).toString, p(2).toString, p(3).toString))
  df2.printSchema
} else if (option == 2) {
  val df2 = spark.table("test.marketData").select('TIMECREATED, 'SECURITY, 'PRICE)
} else if (option == 3) {
  val df2 = spark.table("test.marketDataParquet").select('TIMECREATED, 'SECURITY, 'PRICE)
} else {
  println("no valid option provided")
  sys.exit(0)
}
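The row-to-case-class mapping used for option 1 can be illustrated in plain Scala, without a Spark session. The sample line and the parse helper below are hypothetical, but the shape matches the df.map(...) call: field 0 parsed to Int, the remaining fields kept as Strings:

```scala
object ColumnsDemo {
  case class columns(INDEX: Int, TIMECREATED: String, SECURITY: String, PRICE: String)

  // Split one CSV line and build the case class, much as df.map does per Row.
  def parse(line: String): columns = {
    val p = line.split(",", -1)
    columns(p(0).trim.toInt, p(1), p(2), p(3))
  }
}
```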

With option 1 selected it goes through and shows this

option = 1
root
 |-- INDEX: integer (nullable = true)
 |-- TIMECREATED: string (nullable = true)
 |-- SECURITY: string (nullable = true)
 |-- PRICE: string (nullable = true)

But when I try to do df2.printSchema outside of the if/else block, it comes back
with an error:

scala> df2.printSchema
<console>:31: error: not found: value df2
       df2.printSchema
       ^

I can define a stub df2 before the if/else statement. Is that the best way of
dealing with it?

Thanks

Dr Mich Talebzadeh


