To store to a CSV file, you can use the spark-csv library:
<https://github.com/databricks/spark-csv>.
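
If you want to avoid the extra dependency, a rough sketch (untested; the
output path below is only a placeholder) is to turn each row of your query
result into one delimited line and write it out with saveAsTextFile:

val result = sqlContext.sql("SELECT * FROM TABLE_A")  // stands in for your filtered/ordered query
result.map(row => row.mkString("|"))                  // one pipe-delimited line per row
      .saveAsTextFile("/Myhome/SPARK/output/table_a_out")  // placeholder output directory

spark-csv takes care of quoting, escaping and headers for you, so prefer it
when fields can contain the delimiter; its README shows the save call that
matches your Spark version.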

On Mon, Mar 23, 2015 at 5:35 PM, BASAK, ANANDA <ab9...@att.com> wrote:

>  Thanks. This worked well as per your suggestions. I had to run the following:
>
> val TABLE_A =
> sc.textFile("/Myhome/SPARK/files/table_a_file.txt").map(_.split("|")).map(p
> => ROW_A(p(0).trim.toLong, p(1), p(2).trim.toInt, p(3), BigDecimal(p(4)),
> BigDecimal(p(5)), BigDecimal(p(6))))
>
>
>
> Now I am stuck at another step. I have run a SQL query where I am
> selecting all the fields, with a WHERE clause filtering TSTAMP by a date
> range and an ORDER BY TSTAMP clause. That is running fine.
>
>
>
> Then I am trying to store the output in a CSV file. I am using the
> saveAsTextFile("filename") function, but it is giving an error. Can you
> please help me with the proper syntax to store the output in a CSV file?
>
>
>
>
>
> Thanks & Regards
>
> -----------------------
>
> Ananda Basak
>
> Ph: 425-213-7092
>
>
>
> *From:* BASAK, ANANDA
> *Sent:* Tuesday, March 17, 2015 3:08 PM
> *To:* Yin Huai
> *Cc:* user@spark.apache.org
> *Subject:* RE: Date and decimal datatype not working
>
>
>
> Ok, thanks for the suggestions. Let me try them all and will confirm.
>
>
>
> Regards
>
> Ananda
>
>
>
> *From:* Yin Huai [mailto:yh...@databricks.com]
> *Sent:* Tuesday, March 17, 2015 3:04 PM
> *To:* BASAK, ANANDA
> *Cc:* user@spark.apache.org
> *Subject:* Re: Date and decimal datatype not working
>
>
>
> p(0) is a String, so you need to explicitly convert it to a Long, e.g.
> p(0).trim.toLong. You also need to do it for p(2). For the BigDecimal
> values, you need to create BigDecimal objects from your String values.
>
>
>
> On Tue, Mar 17, 2015 at 5:55 PM, BASAK, ANANDA <ab9...@att.com> wrote:
>
>   Hi All,
>
> I am very new to the Spark world; I just started some test coding last
> week. I am using spark-1.2.1-bin-hadoop2.4 and coding in Scala.
>
> I am having issues while using Date and decimal data types. Following is
> the code that I am running on the Scala prompt. I am trying to define a
> table and point it to my flat file containing raw data (pipe-delimited
> format). Once that is done, I will run some SQL queries and write the
> output data to another pipe-delimited flat file.
>
>
>
> *******************************************************
>
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>
> import sqlContext.createSchemaRDD
>
>
>
>
>
> // Define row and table
>
> case class ROW_A(
>   TSTAMP:     Long,
>   USIDAN:     String,
>   SECNT:      Int,
>   SECT:       String,
>   BLOCK_NUM:  BigDecimal,
>   BLOCK_DEN:  BigDecimal,
>   BLOCK_PCT:  BigDecimal)
>
>
>
> val TABLE_A =
> sc.textFile("/Myhome/SPARK/files/table_a_file.txt").map(_.split("|")).map(p
> => ROW_A(p(0), p(1), p(2), p(3), p(4), p(5), p(6)))
>
>
>
> TABLE_A.registerTempTable("TABLE_A")
>
>
>
> ***************************************************
>
>
>
> The second-to-last command is giving an error like the following:
>
> <console>:17: error: type mismatch;
>
> found   : String
>
> required: Long
>
>
>
> It looks like the contents of my flat file are always treated as String
> and not as Date or decimal. How can I make Spark take them as Date or
> decimal types?
>
>
>
> Regards
>
> Ananda
>
>
>
