Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/16137#discussion_r91170321
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -953,24 +977,24 @@ class SparkContext(config: SparkConf) extends Logging {
 }
/**
- * Get an RDD for a Hadoop-readable dataset from a Hadoop JobConf given its InputFormat and other
- * necessary info (e.g. file name for a filesystem-based dataset, table name for HyperTable),
- * using the older MapReduce API (`org.apache.hadoop.mapred`).
+ * Get an RDD for a Hadoop-readable dataset from a Hadoop `JobConf` given its `InputFormat`
+ * and other necessary info (e.g. file name for a filesystem-based dataset, table name
+ * for HyperTable), using the older MapReduce API (`org.apache.hadoop.mapred`).
 *
- * @param conf JobConf for setting up the dataset. Note: This will be put into a Broadcast.
+ * @note because Hadoop's `RecordReader` class re-uses the same Writable object for each
--- End diff --
Do you need to move this? You don't need to indent it.
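For context on why the `@note` in the diff matters: Hadoop's `RecordReader` fills the same mutable Writable object on every call, so retaining references to the returned records (e.g. via caching) silently aliases them. The plain-Scala sketch below simulates that reuse pattern without any Hadoop dependency; `Record` and `readAll` are hypothetical names for illustration, not Spark or Hadoop API.

```scala
// A tiny mutable record standing in for a Hadoop Writable.
class Record(var value: Int)

// Simulates a RecordReader: every call fills the SAME shared object,
// just as Hadoop re-uses one Writable instance per reader.
def readAll(data: Seq[Int]): Iterator[Record] = {
  val shared = new Record(0)
  data.iterator.map { v => shared.value = v; shared }
}

// Retaining the references directly: after the iterator is exhausted,
// every element aliases the shared object and holds the last value.
val aliased = readAll(Seq(1, 2, 3)).toList.map(_.value)

// Copying each record before retaining it preserves the values,
// which is what the Scaladoc note tells callers to do.
val copied = readAll(Seq(1, 2, 3)).map(r => new Record(r.value)).toList.map(_.value)

println(aliased) // List(3, 3, 3)
println(copied)  // List(1, 2, 3)
```

This is the same reason Spark's docs advise calling a copy (e.g. a map that clones each record) before caching an RDD produced from Hadoop input formats.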
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]