You can read the price columns as strings, write a map that fixes the bad rows (for example, turning "-" into a null), and then convert the result back to your desired DataFrame.

On 28 Sep 2016 06:49, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> I have historical prices for various stocks.
>
> Each csv file has 10 years trade one row per each day.
>
> These are the columns defined in the class
>
> case class columns(Stock: String, Ticker: String, TradeDate: String,
>   Open: Float, High: Float, Low: Float, Close: Float, Volume: Integer)
>
> The issue is with Open, High, Low, Close columns that all are defined as
> Float.
>
> Most rows are OK like below but the red one with "-" defined as Float
> causes issues
>
> Date       Open   High   Low    Close  Volume
> 27-Sep-16  80.91  80.93  79.87  80.85  1873158
> 23-Dec-11  -      -      -      40.56  0
>
> Because the prices are defined as Float, these rows cause the application
> to crash
>
> scala> val rs = df2.filter(changeToDate("TradeDate") >= monthsago)
>   .select((changeToDate("TradeDate").as("TradeDate")),
>     (('Close+'Open)/2).as("AverageDailyPrice"),
>     'Low.as("Day's Low"), 'High.as("Day's High"))
>   .orderBy("TradeDate").collect
> 16/09/27 21:48:53 ERROR Executor: Exception in task 0.0 in stage 61.0 (TID 260)
> java.lang.NumberFormatException: For input string: "-"
>
> One way is to define the prices as Strings but that is not meaningful.
> Alternatively do the clean up before putting csv in HDFS but that becomes
> tedious and error prone.
>
> Any ideas will be appreciated.
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
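
To make the "read as string, map, convert back" suggestion at the top of this thread concrete, here is a minimal sketch in plain Scala. It is not the code anyone in the thread actually ran: `Trade`, `TradeParser`, `toFloat`, `toInt` and `parseLine` are hypothetical names, and the sketch assumes the csv is comma-separated with fields in the same order as the case class in the question. The point is that the price fields become `Option[Float]`, so a "-" turns into `None` instead of throwing `NumberFormatException`.

```scala
import scala.util.Try

// Hypothetical row type (not from the thread): prices are Option[Float] so that
// a "-" placeholder becomes None rather than a NumberFormatException.
case class Trade(
  stock: String, ticker: String, tradeDate: String,
  open: Option[Float], high: Option[Float], low: Option[Float], close: Option[Float],
  volume: Option[Int])

object TradeParser {
  // None for "-", empty, or otherwise unparseable fields.
  def toFloat(s: String): Option[Float] = Try(s.trim.toFloat).toOption
  def toInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  // Assumes exactly eight comma-separated fields in case-class order;
  // rows with any other field count are dropped (None).
  def parseLine(line: String): Option[Trade] =
    line.split(",", -1).map(_.trim) match {
      case Array(stock, ticker, date, o, h, l, c, v) =>
        Some(Trade(stock, ticker, date, toFloat(o), toFloat(h), toFloat(l), toFloat(c), toInt(v)))
      case _ => None
    }
}
```

In spark-shell, something along the lines of `sc.textFile(path).flatMap(TradeParser.parseLine).toDF()` should then give a DataFrame with nullable price columns, assuming the header line is filtered out first (it would otherwise parse as a row with empty prices). With nulls in place, an expression such as `('Close+'Open)/2` evaluates to null for the affected days rather than crashing the job.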