[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15529252#comment-15529252
]
Ding Fei commented on SPARK-17633:
--
I think the count problem could be considered a bug.
HadoopRDD
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513182#comment-15513182
]
Sean Owen commented on SPARK-17633:
---
The issue is more at the HDFS API level, which Spark uses to read
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513168#comment-15513168
]
Anshul commented on SPARK-17633:
What could be the reason for this? As Spark's transformations
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513125#comment-15513125
]
Sean Owen commented on SPARK-17633:
---
Yeah, I can reproduce that. It is weird behavior, but it's due to
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513116#comment-15513116
]
Anshul commented on SPARK-17633:
data.csv
1,"a"
2,"b"
val x = sc.textFile("data.csv")
x.count is 2
If I
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513110#comment-15513110
]
Anshul commented on SPARK-17633:
The RDD is not cached in this scenario.
> texFile() and wholeTextFiles()
[
https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513013#comment-15513013
]
Sean Owen commented on SPARK-17633:
---
It's not clear what you're reporting. textFile and wholeTextFiles
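
For reference, a minimal sketch of how the two calls discussed in this thread differ in what they count. It assumes a local Spark installation and the two-line data.csv from the report; the object name CountComparison and the local[*] master are illustrative choices, not part of the original report.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver object, used only for illustration.
object CountComparison {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; the thread does not say which master was used.
    val conf = new SparkConf().setAppName("count-comparison").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // textFile returns an RDD[String] with one record per line of input.
      val lines = sc.textFile("data.csv")
      // wholeTextFiles returns an RDD[(String, String)] with one
      // (path, content) pair per file.
      val files = sc.wholeTextFiles("data.csv")

      println(s"textFile count: ${lines.count()}")
      println(s"wholeTextFiles count: ${files.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Because textFile counts lines and wholeTextFiles counts files, the two numbers are not expected to match in general. Neither RDD is cached here, so each count() re-reads the input.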