[ https://issues.apache.org/jira/browse/SPARK-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000616#comment-15000616 ]

Bartlomiej Alberski commented on SPARK-11553:
---------------------------------------------

OK, I think I know what the problem is. It can be reproduced with Scala 2.11.6 
and the DataFrame API.

If you are using the DataFrame API from Scala and you try to read an Int, Long, 
Boolean, etc. (any value that extends AnyVal) from a field that is null, you 
receive the "zero value" for that type (0 for Long and Int, false for Boolean, 
etc.), while the API documentation suggests that a NullPointerException will be 
raised.

Example modified to illustrate the problem (taken from 
http://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes):
{code}
val sc: SparkContext // An existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlContext.read.json("examples/src/main/resources/people.json")

// Displays the content of the DataFrame to stdout
df.show()

// "age" is missing (null) for one of the records, yet getLong silently
// returns 0 for it instead of throwing a NullPointerException.
val res = df.map(x => x.getLong(x.fieldIndex("age"))).collect()
println(res.mkString(","))
{code}
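With the people.json file bundled with Spark, where one of the three records has 
no age field, this should print something along the lines of 0,30,19: the 
missing age silently shows up as 0 instead of raising a NullPointerException.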

The problem comes from the implementation of the getInt|getFloat|getBoolean|... methods: 
{code}
def getInt(i: Int): Int = getAs[Int](i)
def getAs[T](i: Int): T = get(i).asInstanceOf[T]
{code}

null.asInstanceOf[Long] returns 0, because Long extends AnyVal and therefore 
cannot hold null: the cast silently falls back to the type's default value.

Example invocations from the Scala REPL:
{code}
scala> null.asInstanceOf[Int]
res0: Int = 0

scala> null.asInstanceOf[Long]
res1: Long = 0

scala> null.asInstanceOf[Short]
res2: Short = 0

scala> null.asInstanceOf[Boolean]
res3: Boolean = false

scala> null.asInstanceOf[Double]
res4: Double = 0.0

scala> null.asInstanceOf[Float]
res5: Float = 0.0
{code}

I will be more than happy to prepare a PR solving this issue.
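
For reference, here is a minimal standalone sketch of the direction such a fix 
could take. This is my own illustration rather than the actual Spark Row code or 
a concrete patch; SimpleRow, getLongLenient and getLongStrict are made-up names, 
and the isNullAt-style check is only an assumption about how the fix might look. 
The idea is simply to check for null before the asInstanceOf cast, so that 
primitive getters fail loudly instead of unboxing null to the zero value.
{code}
// Standalone illustration (not the Spark source): a row backed by Array[Any]
// with primitive getters, showing current vs. proposed behaviour.
class SimpleRow(values: Array[Any]) {
  def isNullAt(i: Int): Boolean = values(i) == null

  // Current behaviour: a null cell is silently unboxed to the zero value.
  def getLongLenient(i: Int): Long = values(i).asInstanceOf[Long]

  // Proposed behaviour: fail loudly on null, as the API documentation suggests.
  def getLongStrict(i: Int): Long = {
    if (isNullAt(i)) {
      throw new NullPointerException(s"Value at index $i is null")
    }
    values(i).asInstanceOf[Long]
  }
}

object NullUnboxingDemo extends App {
  val row = new SimpleRow(Array[Any](42L, null))

  println(row.getLongLenient(1)) // prints 0 -- null silently became the zero value
  println(row.getLongStrict(1))  // throws NullPointerException
}
{code}
Whether such a check belongs in getAs or in each primitive getter (getInt, 
getLong, ...) is something that would have to be decided in the PR.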

> row.getInt(i) if row[i]=null returns 0
> --------------------------------------
>
>                 Key: SPARK-11553
>                 URL: https://issues.apache.org/jira/browse/SPARK-11553
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Tofigh
>            Priority: Minor
>
> row.getInt|getFloat|getDouble in Spark return 0 if row[index] is null, even 
> though according to the documentation they should throw a NullPointerException.


