Thanks Michael. As I gathered, for now it is a feature.

On Monday, 25 April 2016, 18:36, Michael Armbrust <mich...@databricks.com> wrote:
When you define a class inside of a method, it implicitly has a pointer to the outer scope of the method. Spark doesn't have access to this scope, so this makes it hard (impossible?) for us to construct new instances of that class. So, define the classes that you plan to use with Spark at the top level.

On Mon, Apr 25, 2016 at 9:36 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi,

I notice that, building with sbt, if I define my case class outside of the main method like below, it works:

    case class Accounts(
      TransactionDate: String,
      TransactionType: String,
      Description: String,
      Value: Double,
      Balance: Double,
      AccountName: String,
      AccountNumber: String)

    object Import_nw_10124772 {
      def main(args: Array[String]) {
        val conf = new SparkConf().
          setAppName("Import_nw_10124772").
          setMaster("local[12]").
          set("spark.driver.allowMultipleContexts", "true").
          set("spark.hadoop.validateOutputSpecs", "false")
        val sc = new SparkContext(conf)
        // Create sqlContext based on HiveContext
        val sqlContext = new HiveContext(sc)
        import sqlContext.implicits._
        val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
        println("\nStarted at")
        sqlContext.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss') ").collect.foreach(println)
        //
        // Get a DF first based on the Databricks CSV library; ignore the column heading because of a column called "Type"
        //
        val df = sqlContext.read.format("com.databricks.spark.csv").option("inferSchema", "true").option("header", "true").load("hdfs://rhes564:9000/data/stg/accounts/nw/10124772")
        //df.printSchema
        //
        val a = df.filter(col("Date") > "").map(p => Accounts(p(0).toString, p(1).toString, p(2).toString, p(3).toString.toDouble, p(4).toString.toDouble, p(5).toString, p(6).toString))

However, if I put that case class inside the main method, it throws a "No TypeTag available for Accounts" error. Apparently, when the case class is defined inside the method in which it is used, it is not fully defined at that point. Is this a bug within Spark?
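The underlying mechanism can be seen without Spark at all. Spark's `sqlContext.implicits._` conversions need a `scala.reflect.runtime.universe.TypeTag` for the case class in order to derive a schema, and the compiler can only materialize a `TypeTag` for a class with a stable, addressable type, which a method-local class does not have. A minimal sketch (the `TypeTagDemo` object and the two-field `Accounts` are hypothetical, trimmed-down stand-ins for the class in the original post):

```scala
import scala.reflect.runtime.universe._

// Top-level case class: it has a stable type, so the compiler can
// materialize a TypeTag for it, which is what Spark's implicits require.
case class Accounts(TransactionDate: String, Value: Double)

object TypeTagDemo {
  def main(args: Array[String]): Unit = {
    // Succeeds because Accounts is defined at the top level.
    val tag = typeTag[Accounts]
    println(tag.tpe.typeSymbol.name)

    // If Accounts were declared here, inside main, the typeTag call
    // above would fail to compile with exactly the reported error:
    // "No TypeTag available for Accounts".
  }
}
```

Moving the case class to the top level (or into a companion/package object) is therefore the fix rather than a workaround.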
Thanks

Dr Mich Talebzadeh

LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com