Paulo Magalhaes created SPARK-12479:
---------------------------------------

             Summary: sparkR collect on GroupedData throws R error "missing 
value where TRUE/FALSE needed"
                 Key: SPARK-12479
                 URL: https://issues.apache.org/jira/browse/SPARK-12479
             Project: Spark
          Issue Type: Bug
          Components: R, SparkR
    Affects Versions: 1.5.1
            Reporter: Paulo Magalhaes


sparkR collect on GroupedData throws "missing value where TRUE/FALSE needed"

Spark Version: 1.5.1
R Version: 3.2.2

I tracked down the root cause of this exception to a specific key for which 
the hashCode could not be calculated.

The following code recreates the problem when run in sparkR:

hashCode <- getFromNamespace("hashCode","SparkR")
hashCode("bc53d3605e8a5b7de1e8e271c2317645")
Error in if (value > .Machine$integer.max) { :
  missing value where TRUE/FALSE needed

I went one step further and realised that the problem happens because the 
bitwise shift below returns NA, so the comparison in the if statement has no 
TRUE/FALSE value to test.

bitwShiftL(-1073741824,1)

where bitwShiftL is an R function.
I believe bitwShiftL is working as it is supposed to. Therefore, my PR will 
fix the problem in the SparkR package.
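
For illustration only, here is a minimal sketch of why the shifted value cannot be held in an R integer, together with one possible way to wrap it into the signed 32-bit range using double arithmetic. The helper name wrap_shift_left is hypothetical: it is not part of SparkR and is not necessarily what the PR does.

```r
# Mathematically, shifting -1073741824 left by one bit gives -2^31.
-1073741824 * 2                       # computed as a double: -2147483648

# R integers cover -2147483647 .. 2147483647; the value -2^31 is not
# representable because that slot is used for NA_integer_.
.Machine$integer.max                  # 2147483647
suppressWarnings(as.integer(-2^31))   # NA

# Hypothetical helper (an assumption for this sketch, not SparkR code):
# shift in double arithmetic, then fold the result into [-2^31, 2^31 - 1].
# The result stays a double so that -2^31 can be represented.
wrap_shift_left <- function(value, n) {
  shifted <- as.numeric(value) * 2^n      # exact for 32-bit inputs and n <= 21
  (shifted + 2^31) %% 2^32 - 2^31         # wrap like a signed 32-bit integer
}

result <- wrap_shift_left(-1073741824, 1)
result > .Machine$integer.max             # FALSE, so an `if` on it no longer fails
```

Because the wrapped value is a double rather than an integer, a comparison such as the one in the error message above gets a proper TRUE/FALSE instead of NA.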


