Github user wangmiao1981 commented on the issue:

    https://github.com/apache/spark/pull/15421
  
    @shivaram We found that the negative index error happens only in some 
versions of R: for example, R 3.3.0 (2016-05-03) on my Mac and R 3.3.2 in the 
previous Windows test. For the failed cases, I added debug messages and tried 
to read the byte array. For the field `NA`, the failed case doesn't serialize 
the length as an integer (i.e., the byte array includes only `NA`, not the 
length `3`). Therefore, when readString reads the byte array expecting the 
length followed by the string, it actually reads the bytes of `NA` as the 
integer length, which comes out negative. 
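
    To illustrate the mechanism, here is a minimal Python sketch, not the 
actual SparkR code. It assumes a string is framed as a 4-byte big-endian 
signed length followed by the UTF-8 bytes and a NUL terminator (which would 
explain the length `3` for `NA`). The helper names and the sample bytes are 
hypothetical; I have not captured the real bytes from the failing case.

```python
import io
import struct


def write_string(out, s, with_length=True):
    """Write s as [4-byte big-endian length][UTF-8 bytes + NUL].

    with_length=False mimics the failing case, where the length
    prefix is missing from the byte array.
    """
    payload = s.encode("utf-8") + b"\x00"
    if with_length:
        out.write(struct.pack(">i", len(payload)))
    out.write(payload)


def read_length(stream):
    """Read the 4-byte signed length prefix that a reader expects."""
    return struct.unpack(">i", stream.read(4))[0]


# Good case: the prefix is present, so the reader sees the real length.
good = io.BytesIO()
write_string(good, "NA")
good.seek(0)
good_len = read_length(good)

# Bad case: no prefix, so the string bytes themselves are decoded as the
# "length". The trailing zero byte is a hypothetical next byte in the stream.
bad = io.BytesIO()
write_string(bad, "NA", with_length=False)
bad.write(b"\x00")
bad.seek(0)
bad_len = read_length(bad)

# Whenever the first misread byte has its high bit set, the decoded
# signed integer is negative (hypothetical bytes, for illustration only).
negative_len = struct.unpack(">i", b"\xff\xff\xff\xfd")[0]

print(good_len, bad_len, negative_len)
```

    In this sketch the misread value is simply whatever the four bytes at 
the read position decode to, so it can be a huge positive number or a 
negative one depending on the data. Either way it is not a valid length.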
    
    When creating a DataFrame, SparkR serializes the data.frame as a `jobj`. 
I checked both the good and the bad cases, but I didn't find any differences 
between the two. I haven't found a way to debug the `jobj` serialization 
logic, as it just writes the binary data in batch. Maybe I can try again by 
dumping the binary stream. So far, I haven't found the reason why the length 
is not serialized in the failed case. 

