On a technicality, the docs only say first() returns the first item in this
RDD, not the first line of the source text file. AFAIK there is no way to
remove header lines other than filtering them out; see
http://stackoverflow.com/a/24734612/877069.
As long as first() always returns the same value for a given RDD, I think
it's fine, no?
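
For illustration, the filtering approach could look roughly like the Python sketch below. The helper name drop_header and the path "data.csv" are made up for this example. It assumes first() really does return the header line, which is exactly the point under discussion, so treat it as a sketch under that assumption rather than a guaranteed recipe.

```python
def drop_header(lines):
    """Return `lines` without its header, using only first() and filter().

    `lines` is expected to behave like a Spark RDD of strings (for example
    the result of sc.textFile(path)); only .first() and .filter() are called.
    """
    header = lines.first()  # assumption: the first item is the header line
    # Caveat: this also drops any data line that is identical to the header.
    return lines.filter(lambda line: line != header)


# Hypothetical usage in a PySpark shell, where `sc` is the SparkContext
# and "data.csv" is a placeholder path:
#   rows = drop_header(sc.textFile("data.csv"))
#   rows.first()   # first item once the header has been filtered out
```

Note that this compares every line against the header, so a data row that happens to match the header text exactly would be dropped as well.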
Nick
On Sun Feb 22 2015 at 9:09:01 PM Michael Malak
michaelma...@yahoo.com.invalid wrote:
Since RDDs are generally unordered, isn't textFile().first() not guaranteed
to return the first row of the file (for example, when looking for a header
row)? If so, doesn't that make the example in
http://spark.apache.org/docs/1.2.1/quick-start.html#basics misleading?