spark git commit: docs: fix typo
Repository: spark Updated Branches: refs/heads/master 01452ea9c -> 1d7db65e9 docs: fix typo no => no[t] ## What changes were proposed in this pull request? Fixing a typo. ## How was this patch tested? Visual check of the docs. Please review http://spark.apache.org/contributing.html before opening a pull request. Author: Tom Saleeba Closes #21496 from tomsaleeba/patch-1. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1d7db65e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1d7db65e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1d7db65e Branch: refs/heads/master Commit: 1d7db65e968de1c601e7f8b1ec9bc783ef2dbd01 Parents: 01452ea Author: Tom Saleeba Authored: Tue Jun 12 09:22:52 2018 -0500 Committer: Sean Owen Committed: Tue Jun 12 09:22:52 2018 -0500 -- sql/core/src/main/scala/org/apache/spark/sql/Column.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/1d7db65e/sql/core/src/main/scala/org/apache/spark/sql/Column.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Column.scala b/sql/core/src/main/scala/org/apache/spark/sql/Column.scala index 2dbb53e..4eee3de 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/Column.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/Column.scala @@ -104,7 +104,7 @@ class TypedColumn[-T, U]( * * {{{ * df("columnName")// On a specific `df` DataFrame. - * col("columnName") // A generic column no yet associated with a DataFrame. + * col("columnName") // A generic column not yet associated with a DataFrame. * col("columnName.field") // Extracting a struct field * col("`a.column.with.dots`") // Escape `.` in column names. * $"columnName" // Scala short hand for a named column. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
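The doc comment fixed above enumerates the ways a column can be referenced: bound to a specific `df`, or free via `col(...)`. As a rough plain-Python illustration of that distinction (not the Spark API — `Column`, `col`, and `DataFrame` here are hypothetical stand-ins), a generic column carries no frame until it is associated with one:

```python
# Illustrative sketch only: hypothetical stand-ins for the forms listed in
# the Column.scala doc comment, not Spark's actual classes.

class Column:
    def __init__(self, name, frame=None):
        self.name = name
        self.frame = frame            # None until associated with a DataFrame

    def bound(self):
        return self.frame is not None

def col(name):
    # col("columnName"): a generic column not yet associated with a DataFrame.
    return Column(name)

class DataFrame:
    def __call__(self, name):
        # df("columnName"): a column on this specific DataFrame.
        return Column(name, frame=self)

df = DataFrame()
free = col("columnName")
bound = df("columnName")
print(free.bound(), bound.bound())    # False True
```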
spark git commit: [DOCS] Fix typo in docs
Repository: spark Updated Branches: refs/heads/master f27e02476 -> 7c61c2a1c [DOCS] Fix typo in docs ## What changes were proposed in this pull request? Fix typo in docs ## How was this patch tested? Author: uncleGen Closes #16658 from uncleGen/typo-issue. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7c61c2a1 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7c61c2a1 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7c61c2a1 Branch: refs/heads/master Commit: 7c61c2a1c40629311b84dff8d91b257efb345d07 Parents: f27e024 Author: uncleGen Authored: Tue Jan 24 11:32:11 2017 + Committer: Sean Owen Committed: Tue Jan 24 11:32:11 2017 + -- docs/configuration.md | 2 +- docs/index.md | 2 +- docs/programming-guide.md | 6 +++--- docs/streaming-kafka-0-10-integration.md | 2 +- docs/submitting-applications.md | 2 +- 5 files changed, 7 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/7c61c2a1/docs/configuration.md -- diff --git a/docs/configuration.md b/docs/configuration.md index a6b1f15..b7f10e6 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -435,7 +435,7 @@ Apart from these, the following properties are also available, and may be useful spark.jars.packages -Comma-separated list of maven coordinates of jars to include on the driver and executor +Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. The coordinates should be groupId:artifactId:version.
If spark.jars.ivySettings is given artifacts will be resolved according to the configuration in the file, otherwise artifacts will be searched for in the local maven repo, then maven central and finally any additional remote http://git-wip-us.apache.org/repos/asf/spark/blob/7c61c2a1/docs/index.md -- diff --git a/docs/index.md b/docs/index.md index 57b9fa8..023e06a 100644 --- a/docs/index.md +++ b/docs/index.md @@ -15,7 +15,7 @@ It also supports a rich set of higher-level tools including [Spark SQL](sql-prog Get Spark from the [downloads page](http://spark.apache.org/downloads.html) of the project website. This documentation is for Spark version {{site.SPARK_VERSION}}. Spark uses Hadoop's client libraries for HDFS and YARN. Downloads are pre-packaged for a handful of popular Hadoop versions. Users can also download a "Hadoop free" binary and run Spark with any Hadoop version [by augmenting Spark's classpath](hadoop-provided.html). -Scala and Java users can include Spark in their projects using its maven cooridnates and in the future Python users can also install Spark from PyPI. +Scala and Java users can include Spark in their projects using its Maven coordinates and in the future Python users can also install Spark from PyPI. If you'd like to build Spark from http://git-wip-us.apache.org/repos/asf/spark/blob/7c61c2a1/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index a4017b5..db8b048 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -185,7 +185,7 @@ In the Spark shell, a special interpreter-aware SparkContext is already created variable called `sc`. Making your own SparkContext will not work. You can set which master the context connects to using the `--master` argument, and you can add JARs to the classpath by passing a comma-separated list to the `--jars` argument. You can also add dependencies -(e.g. 
Spark Packages) to your shell session by supplying a comma-separated list of maven coordinates +(e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype) can be passed to the `--repositories` argument. For example, to run `bin/spark-shell` on exactly four cores, use: @@ -200,7 +200,7 @@ Or, to also add `code.jar` to its classpath, use: $ ./bin/spark-shell --master local[4] --jars code.jar {% endhighlight %} -To include a dependency using maven coordinates: +To include a dependency using Maven coordinates: {% highlight bash %} $ ./bin/spark-shell --master local[4] --packages "org.example:example:0.1" @@ -217,7 +217,7 @@ In the PySpark shell, a special interpreter-aware SparkContext is already create variable called `sc`. Making your own SparkContext will not work. You can set which master the context
spark git commit: [DOCS] Fix typo for Python section on unifying Kafka streams
Repository: spark Updated Branches: refs/heads/branch-1.6 7b4d7abfc -> 006d73a74 [DOCS] Fix typo for Python section on unifying Kafka streams 1) kafkaStreams is a list. The list should be unpacked when passing it into the streaming context union method, which accepts a variable number of streams. 2) print() should be pprint() for pyspark. This contribution is my original work, and I license the work to the project under the project's open source license. Author: chriskang90 Closes #9545 from c-kang/streaming_python_typo. (cherry picked from commit 874cd66d4b6d156d0ef112a3d0f3bc5683c6a0ec) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/006d73a7 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/006d73a7 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/006d73a7 Branch: refs/heads/branch-1.6 Commit: 006d73a741f92840f747a80c372f2d3f49fe7a1f Parents: 7b4d7ab Author: chriskang90 Authored: Mon Nov 9 19:39:22 2015 +0100 Committer: Sean Owen Committed: Mon Nov 9 19:39:33 2015 +0100 -- docs/streaming-programming-guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/006d73a7/docs/streaming-programming-guide.md -- diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md index c751dbb..e9a27f4 100644 --- a/docs/streaming-programming-guide.md +++ b/docs/streaming-programming-guide.md @@ -1948,8 +1948,8 @@ unifiedStream.print(); {% highlight python %} numStreams = 5 kafkaStreams = [KafkaUtils.createStream(...) for _ in range (numStreams)] -unifiedStream = streamingContext.union(kafkaStreams) -unifiedStream.print() +unifiedStream = streamingContext.union(*kafkaStreams) +unifiedStream.pprint() {% endhighlight %}
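The commit message explains the fix: `union` takes a variable number of streams, so the list of Kafka streams must be unpacked with `*`. A minimal plain-Python sketch of that argument-unpacking point (no Spark here — `union` below is an illustrative stand-in for `StreamingContext.union`, and the inner lists stand in for DStreams):

```python
# Sketch of the bug this commit fixes: a varargs union() must be given
# the list of streams unpacked, not as one list-valued argument.

def union(*streams):
    merged = []
    for s in streams:
        merged.extend(s)
    return merged

kafka_streams = [[1, 2], [3], [4, 5]]   # stand-ins for numStreams DStreams

# Wrong: union() sees a single argument (a list of lists), so the
# "merged" result still contains the unmerged lists.
wrong = union(kafka_streams)            # -> [[1, 2], [3], [4, 5]]

# Right (as in the corrected docs): unpack the list into separate arguments.
right = union(*kafka_streams)           # -> [1, 2, 3, 4, 5]
```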
spark git commit: [DOCS] Fix typo for Python section on unifying Kafka streams
Repository: spark Updated Branches: refs/heads/master cd174882a -> 874cd66d4 [DOCS] Fix typo for Python section on unifying Kafka streams 1) kafkaStreams is a list. The list should be unpacked when passing it into the streaming context union method, which accepts a variable number of streams. 2) print() should be pprint() for pyspark. This contribution is my original work, and I license the work to the project under the project's open source license. Author: chriskang90 Closes #9545 from c-kang/streaming_python_typo. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/874cd66d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/874cd66d Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/874cd66d Branch: refs/heads/master Commit: 874cd66d4b6d156d0ef112a3d0f3bc5683c6a0ec Parents: cd17488 Author: chriskang90 Authored: Mon Nov 9 19:39:22 2015 +0100 Committer: Sean Owen Committed: Mon Nov 9 19:39:22 2015 +0100 -- docs/streaming-programming-guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/874cd66d/docs/streaming-programming-guide.md -- diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md index c751dbb..e9a27f4 100644 --- a/docs/streaming-programming-guide.md +++ b/docs/streaming-programming-guide.md @@ -1948,8 +1948,8 @@ unifiedStream.print(); {% highlight python %} numStreams = 5 kafkaStreams = [KafkaUtils.createStream(...) for _ in range (numStreams)] -unifiedStream = streamingContext.union(kafkaStreams) -unifiedStream.print() +unifiedStream = streamingContext.union(*kafkaStreams) +unifiedStream.pprint() {% endhighlight %}
spark git commit: [DOCS] Fix typo for Python section on unifying Kafka streams
Repository: spark Updated Branches: refs/heads/branch-1.5 6b314fe9e -> a33fd737c [DOCS] Fix typo for Python section on unifying Kafka streams 1) kafkaStreams is a list. The list should be unpacked when passing it into the streaming context union method, which accepts a variable number of streams. 2) print() should be pprint() for pyspark. This contribution is my original work, and I license the work to the project under the project's open source license. Author: chriskang90 Closes #9545 from c-kang/streaming_python_typo. (cherry picked from commit 874cd66d4b6d156d0ef112a3d0f3bc5683c6a0ec) Signed-off-by: Sean Owen Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a33fd737 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a33fd737 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a33fd737 Branch: refs/heads/branch-1.5 Commit: a33fd737cb5db7200bb8ebb080f729f85fcc7c47 Parents: 6b314fe Author: chriskang90 Authored: Mon Nov 9 19:39:22 2015 +0100 Committer: Sean Owen Committed: Mon Nov 9 19:39:45 2015 +0100 -- docs/streaming-programming-guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a33fd737/docs/streaming-programming-guide.md -- diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md index c751dbb..e9a27f4 100644 --- a/docs/streaming-programming-guide.md +++ b/docs/streaming-programming-guide.md @@ -1948,8 +1948,8 @@ unifiedStream.print(); {% highlight python %} numStreams = 5 kafkaStreams = [KafkaUtils.createStream(...) for _ in range (numStreams)] -unifiedStream = streamingContext.union(kafkaStreams) -unifiedStream.print() +unifiedStream = streamingContext.union(*kafkaStreams) +unifiedStream.pprint() {% endhighlight %}
spark git commit: [DOCS] Fix typo in documentation for Java UDF registration
Repository: spark Updated Branches: refs/heads/master bd11b01eb -> 35410614d [DOCS] Fix typo in documentation for Java UDF registration This contribution is my original work and I license the work to the project under the project's open source license Author: Matt Wise mw...@quixey.com Closes #6447 from wisematthew/fix-typo-in-java-udf-registration-doc and squashes the following commits: e7ef5f7 [Matt Wise] Fix typo in documentation for Java UDF registration Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/35410614 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/35410614 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/35410614 Branch: refs/heads/master Commit: 35410614deb7feea1c9d5cca00a6fa7970404f21 Parents: bd11b01 Author: Matt Wise mw...@quixey.com Authored: Wed May 27 22:39:19 2015 -0700 Committer: Reynold Xin r...@databricks.com Committed: Wed May 27 22:39:19 2015 -0700 -- docs/sql-programming-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/35410614/docs/sql-programming-guide.md -- diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md index 5b41c0e..ab646f6 100644 --- a/docs/sql-programming-guide.md +++ b/docs/sql-programming-guide.md @@ -1939,7 +1939,7 @@ sqlContext.udf.register("strLen", (s: String) => s.length()) <div data-lang="java" markdown="1"> {% highlight java %} -sqlContext.udf().register("strLen", (String s) -> { s.length(); }); +sqlContext.udf().register("strLen", (String s) -> s.length(), DataTypes.IntegerType); {% endhighlight %} </div>
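The corrected Java snippet differs from the buggy one in two ways: the lambda returns the expression (a block body `{ s.length(); }` returns nothing), and the registration declares a return type (`DataTypes.IntegerType`). A hedged plain-Python sketch of why both matter — `register` and `udfs` below are a hypothetical toy registry, not Spark's `UDFRegistration` API:

```python
# Toy UDF registry: registration requires an explicit return type,
# and a UDF whose body produces no value fails the type check.

udfs = {}

def register(name, fn, return_type):
    def checked(*args):
        out = fn(*args)
        if not isinstance(out, return_type):
            raise TypeError(
                f"{name}: expected {return_type.__name__}, "
                f"got {type(out).__name__}"
            )
        return out
    udfs[name] = checked

# Analogue of the buggy Java block body { s.length(); }: returns nothing.
register("strLenBroken", lambda s: None, int)

# Analogue of the fix: return the expression and declare the result type.
register("strLen", lambda s: len(s), int)

print(udfs["strLen"]("spark"))   # 5
```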
spark git commit: [DOCS] Fix typo in documentation for Java UDF registration
Repository: spark Updated Branches: refs/heads/branch-1.4 7c342bdd9 -> 63be026da [DOCS] Fix typo in documentation for Java UDF registration This contribution is my original work and I license the work to the project under the project's open source license Author: Matt Wise mw...@quixey.com Closes #6447 from wisematthew/fix-typo-in-java-udf-registration-doc and squashes the following commits: e7ef5f7 [Matt Wise] Fix typo in documentation for Java UDF registration (cherry picked from commit 35410614deb7feea1c9d5cca00a6fa7970404f21) Signed-off-by: Reynold Xin r...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/63be026d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/63be026d Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/63be026d Branch: refs/heads/branch-1.4 Commit: 63be026da3ebf6b77f37f2e950e3b8f516bdfcaa Parents: 7c342bd Author: Matt Wise mw...@quixey.com Authored: Wed May 27 22:39:19 2015 -0700 Committer: Reynold Xin r...@databricks.com Committed: Wed May 27 22:39:24 2015 -0700 -- docs/sql-programming-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/63be026d/docs/sql-programming-guide.md -- diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md index 5b41c0e..ab646f6 100644 --- a/docs/sql-programming-guide.md +++ b/docs/sql-programming-guide.md @@ -1939,7 +1939,7 @@ sqlContext.udf.register("strLen", (s: String) => s.length()) <div data-lang="java" markdown="1"> {% highlight java %} -sqlContext.udf().register("strLen", (String s) -> { s.length(); }); +sqlContext.udf().register("strLen", (String s) -> s.length(), DataTypes.IntegerType); {% endhighlight %} </div>
spark git commit: [DOCS] Fix typo in API for custom InputFormats based on the “new” MapReduce API
Repository: spark Updated Branches: refs/heads/branch-1.3 76e3e6527 -> c5a5c6f61 [DOCS] Fix typo in API for custom InputFormats based on the “new” MapReduce API This looks like a simple typo ```SparkContext.newHadoopRDD``` instead of ```SparkContext.newAPIHadoopRDD``` as in actual http://spark.apache.org/docs/1.2.1/api/scala/index.html#org.apache.spark.SparkContext Author: Alexander abezzu...@nflabs.com Closes #4718 from bzz/hadoop-InputFormats-doc-fix and squashes the following commits: 680a4c4 [Alexander] Fix typo in docs on custom Hadoop InputFormats (cherry picked from commit a7f90390251ff62a0e10edf4c2eb876538597791) Signed-off-by: Sean Owen so...@cloudera.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c5a5c6f6 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c5a5c6f6 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c5a5c6f6 Branch: refs/heads/branch-1.3 Commit: c5a5c6f618b89d712c13a236388fa67c136691ee Parents: 76e3e65 Author: Alexander abezzu...@nflabs.com Authored: Sun Feb 22 08:53:05 2015 + Committer: Sean Owen so...@cloudera.com Committed: Sun Feb 22 08:53:14 2015 + -- docs/programming-guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/c5a5c6f6/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index 4e4af76..7b07018 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -335,7 +335,7 @@ Apart from text files, Spark's Scala API also supports several other data format * For [SequenceFiles](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/SequenceFileInputFormat.html), use SparkContext's `sequenceFile[K, V]` method where `K` and `V` are the types of key and values in the file.
These should be subclasses of Hadoop's [Writable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Writable.html) interface, like [IntWritable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/IntWritable.html) and [Text](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html). In addition, Spark allows you to specify native types for a few common Writables; for example, `sequenceFile[Int, String]` will automatically read IntWritables and Texts. -* For other Hadoop InputFormats, you can use the `SparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `SparkContext.newHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). +* For other Hadoop InputFormats, you can use the `SparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `SparkContext.newAPIHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). * `RDD.saveAsObjectFile` and `SparkContext.objectFile` support saving an RDD in a simple format consisting of serialized Java objects. While this is not as efficient as specialized formats like Avro, it offers an easy way to save any RDD. @@ -367,7 +367,7 @@ Apart from text files, Spark's Java API also supports several other data formats * For [SequenceFiles](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/SequenceFileInputFormat.html), use SparkContext's `sequenceFile[K, V]` method where `K` and `V` are the types of key and values in the file. 
These should be subclasses of Hadoop's [Writable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Writable.html) interface, like [IntWritable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/IntWritable.html) and [Text](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html). -* For other Hadoop InputFormats, you can use the `JavaSparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `JavaSparkContext.newHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). +* For other Hadoop InputFormats, you can use the `JavaSparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `JavaSparkContext.newAPIHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`).
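The rule the corrected docs state is a package-based split: InputFormats from the old API (`org.apache.hadoop.mapred`) go through `hadoopRDD`, while InputFormats from the new API (`org.apache.hadoop.mapreduce`) go through `newAPIHadoopRDD`. An illustrative plain-Python sketch of that dispatch — `pick_reader` is a hypothetical helper, not part of Spark; in real code the caller simply chooses the right method:

```python
# Sketch: map a Hadoop InputFormat class name to the SparkContext method
# the docs say to use for it.

def pick_reader(input_format_class: str) -> str:
    # New MapReduce API lives under org.apache.hadoop.mapreduce.
    if input_format_class.startswith("org.apache.hadoop.mapreduce."):
        return "newAPIHadoopRDD"
    # Old API lives under org.apache.hadoop.mapred.
    if input_format_class.startswith("org.apache.hadoop.mapred."):
        return "hadoopRDD"
    raise ValueError(f"not a Hadoop InputFormat package: {input_format_class}")

print(pick_reader("org.apache.hadoop.mapreduce.lib.input.TextInputFormat"))
print(pick_reader("org.apache.hadoop.mapred.TextInputFormat"))
```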
spark git commit: [DOCS] Fix typo in API for custom InputFormats based on the “new” MapReduce API
Repository: spark Updated Branches: refs/heads/master 46462ff25 -> a7f903902 [DOCS] Fix typo in API for custom InputFormats based on the “new” MapReduce API This looks like a simple typo ```SparkContext.newHadoopRDD``` instead of ```SparkContext.newAPIHadoopRDD``` as in actual http://spark.apache.org/docs/1.2.1/api/scala/index.html#org.apache.spark.SparkContext Author: Alexander abezzu...@nflabs.com Closes #4718 from bzz/hadoop-InputFormats-doc-fix and squashes the following commits: 680a4c4 [Alexander] Fix typo in docs on custom Hadoop InputFormats Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a7f90390 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a7f90390 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a7f90390 Branch: refs/heads/master Commit: a7f90390251ff62a0e10edf4c2eb876538597791 Parents: 46462ff Author: Alexander abezzu...@nflabs.com Authored: Sun Feb 22 08:53:05 2015 + Committer: Sean Owen so...@cloudera.com Committed: Sun Feb 22 08:53:05 2015 + -- docs/programming-guide.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a7f90390/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index 4e4af76..7b07018 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -335,7 +335,7 @@ Apart from text files, Spark's Scala API also supports several other data format * For [SequenceFiles](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/SequenceFileInputFormat.html), use SparkContext's `sequenceFile[K, V]` method where `K` and `V` are the types of key and values in the file.
These should be subclasses of Hadoop's [Writable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Writable.html) interface, like [IntWritable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/IntWritable.html) and [Text](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html). In addition, Spark allows you to specify native types for a few common Writables; for example, `sequenceFile[Int, String]` will automatically read IntWritables and Texts. -* For other Hadoop InputFormats, you can use the `SparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `SparkContext.newHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). +* For other Hadoop InputFormats, you can use the `SparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `SparkContext.newAPIHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). * `RDD.saveAsObjectFile` and `SparkContext.objectFile` support saving an RDD in a simple format consisting of serialized Java objects. While this is not as efficient as specialized formats like Avro, it offers an easy way to save any RDD. @@ -367,7 +367,7 @@ Apart from text files, Spark's Java API also supports several other data formats * For [SequenceFiles](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/SequenceFileInputFormat.html), use SparkContext's `sequenceFile[K, V]` method where `K` and `V` are the types of key and values in the file. 
These should be subclasses of Hadoop's [Writable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Writable.html) interface, like [IntWritable](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/IntWritable.html) and [Text](http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/Text.html). -* For other Hadoop InputFormats, you can use the `JavaSparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `JavaSparkContext.newHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). +* For other Hadoop InputFormats, you can use the `JavaSparkContext.hadoopRDD` method, which takes an arbitrary `JobConf` and input format class, key class and value class. Set these the same way you would for a Hadoop job with your input source. You can also use `JavaSparkContext.newAPIHadoopRDD` for InputFormats based on the new MapReduce API (`org.apache.hadoop.mapreduce`). * `JavaRDD.saveAsObjectFile`
spark git commit: [DOCS] Fix typo in return type of cogroup
Repository: spark Updated Branches: refs/heads/master e200ac8e5 -> f6b852aad [DOCS] Fix typo in return type of cogroup This fixes a simple typo in the cogroup docs noted in http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAMAsSdJ8_24evMAMg7fOZCQjwimisbYWa9v8BN6Rc3JCauja6wmail.gmail.com%3E I didn't bother with a JIRA Author: Sean Owen so...@cloudera.com Closes #4072 from srowen/CogroupDocFix and squashes the following commits: 43c850b [Sean Owen] Fix typo in return type of cogroup Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f6b852aa Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f6b852aa Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f6b852aa Branch: refs/heads/master Commit: f6b852aade7668c99f37c69f606c64763cb265d2 Parents: e200ac8 Author: Sean Owen so...@cloudera.com Authored: Fri Jan 16 09:28:44 2015 -0800 Committer: Andrew Or and...@databricks.com Committed: Fri Jan 16 09:28:44 2015 -0800 -- docs/programming-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/f6b852aa/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index 5e0d5c1..0211bba 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -913,7 +913,7 @@ for details. </tr> <tr> <td> <b>cogroup</b>(<i>otherDataset</i>, [<i>numTasks</i>]) </td> - <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, Iterable&lt;V&gt;, Iterable&lt;W&gt;) tuples. This operation is also called <code>groupWith</code>. </td> + <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (Iterable&lt;V&gt;, Iterable&lt;W&gt;)) tuples. This operation is also called <code>groupWith</code>. </td> </tr> <tr> <td> <b>cartesian</b>(<i>otherDataset</i>) </td>
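The doc fix clarifies that cogroup's result nests the two iterables in a tuple: (K, (Iterable&lt;V&gt;, Iterable&lt;W&gt;)), not a flat triple. A plain-Python sketch of that shape over lists of (key, value) pairs (illustrative only, not Spark's RDD implementation):

```python
# Minimal cogroup over two lists of (key, value) pairs, producing the
# corrected return shape: (K, (values-from-left, values-from-right)).
from collections import defaultdict

def cogroup(left, right):
    vs, ws = defaultdict(list), defaultdict(list)
    for k, v in left:
        vs[k].append(v)
    for k, w in right:
        ws[k].append(w)
    # Each record is (K, (Iterable[V], Iterable[W])) -- the two iterables
    # are nested in one tuple, which is what the doc typo obscured.
    return [(k, (vs[k], ws[k])) for k in sorted(set(vs) | set(ws))]

pairs = cogroup([("a", 1), ("b", 2), ("a", 3)], [("a", "x"), ("c", "y")])
print(pairs)   # [('a', ([1, 3], ['x'])), ('b', ([2], [])), ('c', ([], ['y']))]
```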
spark git commit: [DOCS] Fix typo in return type of cogroup
Repository: spark Updated Branches: refs/heads/branch-1.1 810a2248f -> c411182c6 [DOCS] Fix typo in return type of cogroup This fixes a simple typo in the cogroup docs noted in http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAMAsSdJ8_24evMAMg7fOZCQjwimisbYWa9v8BN6Rc3JCauja6wmail.gmail.com%3E I didn't bother with a JIRA Author: Sean Owen so...@cloudera.com Closes #4072 from srowen/CogroupDocFix and squashes the following commits: 43c850b [Sean Owen] Fix typo in return type of cogroup (cherry picked from commit f6b852aade7668c99f37c69f606c64763cb265d2) Signed-off-by: Andrew Or and...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c411182c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c411182c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c411182c Branch: refs/heads/branch-1.1 Commit: c411182c640940fe577c0529d2df0d2cd7fc0718 Parents: 810a224 Author: Sean Owen so...@cloudera.com Authored: Fri Jan 16 09:28:44 2015 -0800 Committer: Andrew Or and...@databricks.com Committed: Fri Jan 16 09:28:59 2015 -0800 -- docs/programming-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/c411182c/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index 6ae780d..999bea0 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -911,7 +911,7 @@ for details. </tr> <tr> <td> <b>cogroup</b>(<i>otherDataset</i>, [<i>numTasks</i>]) </td> - <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, Iterable&lt;V&gt;, Iterable&lt;W&gt;) tuples. This operation is also called <code>groupWith</code>. </td> + <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (Iterable&lt;V&gt;, Iterable&lt;W&gt;)) tuples. This operation is also called <code>groupWith</code>. </td> </tr> <tr> <td> <b>cartesian</b>(<i>otherDataset</i>) </td>
spark git commit: [DOCS] Fix typo in return type of cogroup
Repository: spark Updated Branches: refs/heads/branch-1.0 b530bc92a -> a1425db96 [DOCS] Fix typo in return type of cogroup This fixes a simple typo in the cogroup docs noted in http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAMAsSdJ8_24evMAMg7fOZCQjwimisbYWa9v8BN6Rc3JCauja6wmail.gmail.com%3E I didn't bother with a JIRA Author: Sean Owen so...@cloudera.com Closes #4072 from srowen/CogroupDocFix and squashes the following commits: 43c850b [Sean Owen] Fix typo in return type of cogroup (cherry picked from commit f6b852aade7668c99f37c69f606c64763cb265d2) Signed-off-by: Andrew Or and...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a1425db9 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a1425db9 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a1425db9 Branch: refs/heads/branch-1.0 Commit: a1425db96a1126ef326a094eaa8abfa575db4b55 Parents: b530bc9 Author: Sean Owen so...@cloudera.com Authored: Fri Jan 16 09:28:44 2015 -0800 Committer: Andrew Or and...@databricks.com Committed: Fri Jan 16 09:29:13 2015 -0800 -- docs/programming-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a1425db9/docs/programming-guide.md -- diff --git a/docs/programming-guide.md b/docs/programming-guide.md index c04c3fa..7e6c089 100644 --- a/docs/programming-guide.md +++ b/docs/programming-guide.md @@ -835,7 +835,7 @@ for details. </tr> <tr> <td> <b>cogroup</b>(<i>otherDataset</i>, [<i>numTasks</i>]) </td> - <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, Iterable&lt;V&gt;, Iterable&lt;W&gt;) tuples. This operation is also called <code>groupWith</code>. </td> + <td> When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (Iterable&lt;V&gt;, Iterable&lt;W&gt;)) tuples. This operation is also called <code>groupWith</code>. </td> </tr> <tr> <td> <b>cartesian</b>(<i>otherDataset</i>) </td>