Repository: spark
Updated Branches:
refs/heads/branch-1.4 b18f1c61a -> 7bd527427
[SPARK-7526] [SPARKR] Specify ip of RBackend, MonitorServer and RRDD Socket
server
These R processes are only used to communicate with the local JVM process, so
binding to localhost is more reasonable than a wildcard IP.
Repository: spark
Updated Branches:
refs/heads/master df9b94a57 -> 98195c303
[SPARK-7526] [SPARKR] Specify ip of RBackend, MonitorServer and RRDD Socket
server
These R processes are only used to communicate with the local JVM process, so
binding to localhost is more reasonable than a wildcard IP.
Au
Repository: spark
Updated Branches:
refs/heads/branch-1.4 6ff3379a1 -> b18f1c61a
[SPARK-7482] [SPARKR] Rename some DataFrame API methods in SparkR to match
their counterparts in Scala.
Author: Sun Rui
Closes #6007 from sun-rui/SPARK-7482 and squashes the following commits:
5c5cf5e [Sun Rui
Repository: spark
Updated Branches:
refs/heads/master 208b90225 -> df9b94a57
[SPARK-7482] [SPARKR] Rename some DataFrame API methods in SparkR to match
their counterparts in Scala.
Author: Sun Rui
Closes #6007 from sun-rui/SPARK-7482 and squashes the following commits:
5c5cf5e [Sun Rui] Im
Repository: spark
Updated Branches:
refs/heads/branch-1.4 219a9043e -> 6ff3379a1
[SPARK-7566][SQL] Add type to HiveContext.analyzer
This makes HiveContext.analyzer overridable.
Author: Santiago M. Mola
Closes #6086 from smola/patch-3 and squashes the following commits:
8ece136 [Santiago M
Repository: spark
Updated Branches:
refs/heads/master 97dee313f -> 208b90225
[SPARK-7566][SQL] Add type to HiveContext.analyzer
This makes HiveContext.analyzer overridable.
Author: Santiago M. Mola
Closes #6086 from smola/patch-3 and squashes the following commits:
8ece136 [Santiago M. Mo
Repository: spark
Updated Branches:
refs/heads/master 8fd55358b -> 97dee313f
[SPARK-7321][SQL] Add Column expression for conditional statements
(when/otherwise)
This builds on https://github.com/apache/spark/pull/5932 and should close
https://github.com/apache/spark/pull/5932 as well.
As an
Repository: spark
Updated Branches:
refs/heads/branch-1.4 bdd5db9f1 -> 219a9043e
[SPARK-7321][SQL] Add Column expression for conditional statements
(when/otherwise)
This builds on https://github.com/apache/spark/pull/5932 and should close
https://github.com/apache/spark/pull/5932 as well.
A
Repository: spark
Updated Branches:
refs/heads/master 1b9e434b6 -> 8fd55358b
http://git-wip-us.apache.org/repos/asf/spark/blob/8fd55358/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
--
diff --git a/sql/core
[SPARK-7588] Document all SQL/DataFrame public methods with @since tag
This pull request adds since tag to all public methods/classes in SQL/DataFrame
to indicate which version the methods/classes were first added.
Author: Reynold Xin
Closes #6101 from rxin/tbc and squashes the following commi
[SPARK-7588] Document all SQL/DataFrame public methods with @since tag
This pull request adds since tag to all public methods/classes in SQL/DataFrame
to indicate which version the methods/classes were first added.
Author: Reynold Xin
Closes #6101 from rxin/tbc and squashes the following commi
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2cc330181 -> bdd5db9f1
http://git-wip-us.apache.org/repos/asf/spark/blob/bdd5db9f/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala
--
diff --git a/sql/
Repository: spark
Updated Branches:
refs/heads/master 247b70349 -> 1b9e434b6
[SPARK-7592] Always set resolution to "Fixed" in PR merge script.
The issue is that the behavior of the ASF JIRA silently
changed. Now when the "Resolve Issue" transition occurs,
the default resolution is "Pending Clo
Repository: spark
Updated Branches:
refs/heads/branch-1.4 08ec1af54 -> 2cc330181
[HOTFIX] Use the old Job API to support old Hadoop versions
#5526 uses `Job.getInstance`, which does not exist in the old Hadoop versions.
Just use `new Job` to replace it.
cc liancheng
Author: zsxwing
Closes
Repository: spark
Updated Branches:
refs/heads/master 77f64c736 -> 247b70349
[HOTFIX] Use the old Job API to support old Hadoop versions
#5526 uses `Job.getInstance`, which does not exist in the old Hadoop versions.
Just use `new Job` to replace it.
cc liancheng
Author: zsxwing
Closes #60
Repository: spark
Updated Branches:
refs/heads/master 23f7d66d5 -> 77f64c736
[SPARK-7572] [MLLIB] do not import Param/Params under pyspark.ml
Remove `Param` and `Params` from `pyspark.ml` and add a section in the doc.
brkyvz
Author: Xiangrui Meng
Closes #6094 from mengxr/SPARK-7572 and squ
Repository: spark
Updated Branches:
refs/heads/branch-1.4 bb81b1500 -> 08ec1af54
[SPARK-7572] [MLLIB] do not import Param/Params under pyspark.ml
Remove `Param` and `Params` from `pyspark.ml` and add a section in the doc.
brkyvz
Author: Xiangrui Meng
Closes #6094 from mengxr/SPARK-7572 and
Repository: spark
Updated Branches:
refs/heads/branch-1.4 6c292a213 -> bb81b1500
[SPARK-7554] [STREAMING] Throw exception when an active/stopped
StreamingContext is used to create DStreams and output operations
Author: Tathagata Das
Closes #6099 from tdas/SPARK-7554 and squashes the followi
Repository: spark
Updated Branches:
refs/heads/master 2713bc65a -> 23f7d66d5
[SPARK-7554] [STREAMING] Throw exception when an active/stopped
StreamingContext is used to create DStreams and output operations
Author: Tathagata Das
Closes #6099 from tdas/SPARK-7554 and squashes the following c
Repository: spark
Updated Branches:
refs/heads/branch-1.4 91fbd93f2 -> 6c292a213
[SPARK-7528] [MLLIB] make RankingMetrics Java-friendly
`RankingMetrics` contains a ClassTag, which is hard to create in Java. This PR
adds a factory method `of` for Java users. coderxiang
Author: Xiangrui Meng
Repository: spark
Updated Branches:
refs/heads/master 00e7b09a0 -> 2713bc65a
[SPARK-7528] [MLLIB] make RankingMetrics Java-friendly
`RankingMetrics` contains a ClassTag, which is hard to create in Java. This PR
adds a factory method `of` for Java users. coderxiang
Author: Xiangrui Meng
Clo
Repository: spark
Updated Branches:
refs/heads/master 96c4846db -> 00e7b09a0
[SPARK-7553] [STREAMING] Added methods to maintain a singleton StreamingContext
In a REPL/notebook environment, it's very easy to lose a reference to a
StreamingContext by overriding the variable name. So if you happe
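The singleton idea can be sketched in plain Scala. This is a generic getOrCreate pattern, not Spark's actual StreamingContext API; the holder object and its names are hypothetical:

```scala
// Generic getOrCreate singleton holder: the first call creates the value,
// later calls return the same instance instead of a fresh one, so a lost
// variable reference in a REPL does not strand the active context.
object ActiveContext {
  @volatile private var current: Option[String] = None

  def getOrCreate(create: () => String): String = synchronized {
    current match {
      case Some(c) => c
      case None =>
        val c = create()
        current = Some(c)
        c
    }
  }
}

var created = 0
val a = ActiveContext.getOrCreate(() => { created += 1; s"ctx-$created" })
val b = ActiveContext.getOrCreate(() => { created += 1; s"ctx-$created" })
println(a == b) // prints true: the second call reuses the singleton
```

The @volatile field plus the synchronized block keeps creation race-free when several REPL cells call getOrCreate concurrently.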
Repository: spark
Updated Branches:
refs/heads/branch-1.4 612247ff0 -> 91fbd93f2
[SPARK-7553] [STREAMING] Added methods to maintain a singleton StreamingContext
In a REPL/notebook environment, it's very easy to lose a reference to a
StreamingContext by overriding the variable name. So if you h
Repository: spark
Updated Branches:
refs/heads/branch-1.4 d080df10b -> 612247ff0
[SPARK-7573] [ML] OneVsRest cleanups
Minor cleanups discussed with [~mengxr]:
* move OneVsRest from reduction to classification sub-package
* make model constructor private
Some doc cleanups too
CC: harsha2010
Repository: spark
Updated Branches:
refs/heads/master f0c1bc347 -> 96c4846db
[SPARK-7573] [ML] OneVsRest cleanups
Minor cleanups discussed with [~mengxr]:
* move OneVsRest from reduction to classification sub-package
* make model constructor private
Some doc cleanups too
CC: harsha2010 Coul
Repository: spark
Updated Branches:
refs/heads/branch-1.4 fe34a5915 -> d080df10b
[SPARK-7557] [ML] [DOC] User guide for spark.ml HashingTF, Tokenizer
Added feature transformer subsection to spark.ml guide, with HashingTF and
Tokenizer. Added JavaHashingTFSuite to test Java examples in new gu
Repository: spark
Updated Branches:
refs/heads/master 1d703660d -> f0c1bc347
[SPARK-7557] [ML] [DOC] User guide for spark.ml HashingTF, Tokenizer
Added feature transformer subsection to spark.ml guide, with HashingTF and
Tokenizer. Added JavaHashingTFSuite to test Java examples in new guide.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 221375ee1 -> fe34a5915
[SPARK-7496] [MLLIB] Update Programming guide with Online LDA
jira: https://issues.apache.org/jira/browse/SPARK-7496
Update LDA subsection of clustering section of MLlib programming guide to
include OnlineLDA.
Repository: spark
Updated Branches:
refs/heads/master 1422e79e5 -> 1d703660d
[SPARK-7496] [MLLIB] Update Programming guide with Online LDA
jira: https://issues.apache.org/jira/browse/SPARK-7496
Update LDA subsection of clustering section of MLlib programming guide to
include OnlineLDA.
Auth
Repository: spark
Updated Branches:
refs/heads/master a4874b0d1 -> 1422e79e5
[SPARK-7406] [STREAMING] [WEBUI] Add tooltips for "Scheduling Delay",
"Processing Time" and "Total Delay"
Screenshots:
![screen shot 2015-05-06 at 2 29 03
pm](https://cloud.githubusercontent.com/assets/1000778/75041
Repository: spark
Updated Branches:
refs/heads/branch-1.4 217c9 -> 221375ee1
[SPARK-7406] [STREAMING] [WEBUI] Add tooltips for "Scheduling Delay",
"Processing Time" and "Total Delay"
Screenshots:
![screen shot 2015-05-06 at 2 29 03
pm](https://cloud.githubusercontent.com/assets/1000778/7
Repository: spark
Updated Branches:
refs/heads/branch-1.4 32819fcb7 -> 217c9
[SPARK-7571] [MLLIB] rename Math to math
`scala.Math` is deprecated since 2.8. This PR only touches `Math` usages in
MLlib. dbtsai
Author: Xiangrui Meng
Closes #6092 from mengxr/SPARK-7571 and squashes the foll
Repository: spark
Updated Branches:
refs/heads/master 41d1c -> a4874b0d1
[SPARK-7571] [MLLIB] rename Math to math
`scala.Math` is deprecated since 2.8. This PR only touches `Math` usages in
MLlib. dbtsai
Author: Xiangrui Meng
Closes #6092 from mengxr/SPARK-7571 and squashes the followin
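As a hedged illustration of the rename (not the MLlib diff itself), the deprecated capitalized object and its lowercase replacement look like this in plain Scala:

```scala
// scala.Math is deprecated since Scala 2.8; the lowercase package object
// scala.math provides the same functions and constants.
// Deprecated form: scala.Math.sqrt(2.0)
// Preferred form:
val root = scala.math.sqrt(scala.math.pow(3.0, 2) + scala.math.pow(4.0, 2))
println(root) // prints 5.0
```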
Repository: spark
Updated Branches:
refs/heads/master 23b9863e2 -> 41d1c
[SPARK-7484][SQL]Support jdbc connection properties
A few JDBC drivers, such as SybaseIQ, support passing the username and password
only through connection properties. So the same needs to be supported for
SQLContext.jdbc, data
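A minimal sketch of carrying credentials in a java.util.Properties object, which is the shape such drivers expect. The key names and values here are placeholders; the exact property keys are driver-specific:

```scala
import java.util.Properties

// Build connection properties for drivers that only accept credentials this
// way; a JDBC connect call would then take these alongside the URL.
val connProps = new Properties()
connProps.setProperty("user", "demo_user")         // placeholder value
connProps.setProperty("password", "demo_password") // placeholder value
println(connProps.getProperty("user")) // prints demo_user
```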
Repository: spark
Updated Branches:
refs/heads/branch-1.4 98ccd934f -> 32819fcb7
[SPARK-7484][SQL]Support jdbc connection properties
A few JDBC drivers, such as SybaseIQ, support passing the username and password
only through connection properties. So the same needs to be supported for
SQLContext.jdbc,
Repository: spark
Updated Branches:
refs/heads/branch-1.4 c68485e7a -> 98ccd934f
[SPARK-7559] [MLLIB] Bucketizer should include the right most boundary in the
last bucket.
We give special treatment to +inf in `Bucketizer`. This could be simplified by
always including the largest split value
Repository: spark
Updated Branches:
refs/heads/master 2a41c0d71 -> 23b9863e2
[SPARK-7559] [MLLIB] Bucketizer should include the right most boundary in the
last bucket.
We give special treatment to +inf in `Bucketizer`. This could be simplified by
always including the largest split value in
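A standalone sketch of the bucketing rule described above, where a value equal to the largest split lands in the last bucket instead of being out of range. This is a hypothetical illustration, not MLlib's actual Bucketizer code:

```scala
// Map a value to a bucket index given sorted split points: bucket i covers
// [splits(i), splits(i+1)), except the last bucket, which also includes the
// right-most boundary splits.last.
def bucketIndex(splits: Array[Double], x: Double): Int = {
  require(x >= splits.head && x <= splits.last, s"$x is out of range")
  if (x == splits.last) splits.length - 2 // right-most boundary -> last bucket
  else splits.indices.dropRight(1).find(i => x >= splits(i) && x < splits(i + 1)).get
}

val splits = Array(0.0, 0.5, 1.0)
println(bucketIndex(splits, 1.0)) // prints 1: included in the last bucket
```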
Repository: spark
Updated Branches:
refs/heads/branch-1.4 fd16709f0 -> c68485e7a
[SPARK-7569][SQL] Better error for invalid binary expressions
`scala> Seq((1,1)).toDF("a", "b").select(lit(1) + new java.sql.Date(1)) `
Before:
```
org.apache.spark.sql.AnalysisException: invalid expression (1 +
Repository: spark
Updated Branches:
refs/heads/master 595a67589 -> 2a41c0d71
[SPARK-7569][SQL] Better error for invalid binary expressions
`scala> Seq((1,1)).toDF("a", "b").select(lit(1) + new java.sql.Date(1)) `
Before:
```
org.apache.spark.sql.AnalysisException: invalid expression (1 + 0)
Repository: spark
Updated Branches:
refs/heads/master 5438f49cc -> 595a67589
[SPARK-7015] [MLLIB] [WIP] Multiclass to Binary Reduction: One Against All
Initial cut of one-against-all. Test code is scaffolding, not fully
implemented.
This WIP is to gather early feedback.
Author: Ram Srihar
Repository: spark
Updated Branches:
refs/heads/branch-1.4 eadda926c -> fd16709f0
[SPARK-7015] [MLLIB] [WIP] Multiclass to Binary Reduction: One Against All
Initial cut of one-against-all. Test code is scaffolding, not fully
implemented.
This WIP is to gather early feedback.
Author: Ram Sr
Repository: spark
Updated Branches:
refs/heads/branch-1.3 92fe5b649 -> 5f121fb60
[SPARK-7552] [STREAMING] [BACKPORT] Close WAL files correctly when iteration is
finished
tdas
Author: jerryshao
Closes #6069 from jerryshao/SPARK-7552-1.3-backpport and squashes the following
commits:
72b9fb
Repository: spark
Updated Branches:
refs/heads/branch-1.3 b152c6cc2 -> 92fe5b649
[SPARK-2018] [CORE] Upgrade LZF library to fix endian serialization problem
Pick up newer version of dependency with fix for SPARK-2018. The update
involved patching the ning/compress LZF library to hand
Repository: spark
Updated Branches:
refs/heads/branch-1.4 432694c18 -> eadda926c
[SPARK-2018] [CORE] Upgrade LZF library to fix endian serialization problem
Pick up newer version of dependency with fix for SPARK-2018. The update
involved patching the ning/compress LZF library to hand
Repository: spark
Updated Branches:
refs/heads/master 8e935b0a2 -> 5438f49cc
[SPARK-2018] [CORE] Upgrade LZF library to fix endian serialization problem
Pick up newer version of dependency with fix for SPARK-2018. The update
involved patching the ning/compress LZF library to handle b
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ce6c40066 -> 432694c18
[SPARK-7487] [ML] Feature Parity in PySpark for ml.regression
Added LinearRegression Python API
Author: Burak Yavuz
Closes #6016 from brkyvz/ml-reg and squashes the following commits:
11c9ef9 [Burak Yavuz] add
Repository: spark
Updated Branches:
refs/heads/master b9b01f44f -> 8e935b0a2
[SPARK-7487] [ML] Feature Parity in PySpark for ml.regression
Added LinearRegression Python API
Author: Burak Yavuz
Closes #6016 from brkyvz/ml-reg and squashes the following commits:
11c9ef9 [Burak Yavuz] address
Repository: spark
Updated Branches:
refs/heads/branch-1.4 8be43f897 -> ce6c40066
[HOT FIX #6076] DAG visualization: curve the edges
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce6c4006
Tree: http://git-wip-us.apache.or
Repository: spark
Updated Branches:
refs/heads/master 4e290522c -> b9b01f44f
[HOT FIX #6076] DAG visualization: curve the edges
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b9b01f44
Tree: http://git-wip-us.apache.org/re
Repository: spark
Updated Branches:
refs/heads/branch-1.4 a23610458 -> 8be43f897
[SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project
Author: Wenchen Fan
Closes #5831 from cloud-fan/7276 and squashes the following commits:
ee4a1e1 [Wenchen Fan] fix rebase mistake
a3b565d
Repository: spark
Updated Branches:
refs/heads/master 65697bbea -> 4e290522c
[SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project
Author: Wenchen Fan
Closes #5831 from cloud-fan/7276 and squashes the following commits:
ee4a1e1 [Wenchen Fan] fix rebase mistake
a3b565d [We
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ec8928604 -> a23610458
http://git-wip-us.apache.org/repos/asf/spark/blob/a2361045/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
--
diff --git
a/co
[SPARK-7500] DAG visualization: move cluster labeling to dagre-d3
This fixes the label bleeding issue described in the JIRA and pictured in the
screenshots below. I also took the opportunity to move some code to where it
belongs. In particular:
(1) Drawing cluster labels is n
http://git-wip-us.apache.org/repos/asf/spark/blob/65697bbe/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
--
diff --git a/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
b/core/src/main/resou
[SPARK-7500] DAG visualization: move cluster labeling to dagre-d3
This fixes the label bleeding issue described in the JIRA and pictured in the
screenshots below. I also took the opportunity to move some code to where it
belongs. In particular:
(1) Drawing cluster labels is n
Repository: spark
Updated Branches:
refs/heads/master bfcaf8adc -> 65697bbea
http://git-wip-us.apache.org/repos/asf/spark/blob/65697bbe/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
--
diff --git
a/core/s
http://git-wip-us.apache.org/repos/asf/spark/blob/a2361045/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
--
diff --git a/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
b/core/src/main/resou
Repository: spark
Updated Branches:
refs/heads/branch-1.4 d2328137f -> ec8928604
[DataFrame][minor] support column in field accessor
Minor improvement: now we can use `Column` as an extraction expression.
Author: Wenchen Fan
Closes #6080 from cloud-fan/tmp and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/master 0595b6de8 -> bfcaf8adc
[DataFrame][minor] support column in field accessor
Minor improvement: now we can use `Column` as an extraction expression.
Author: Wenchen Fan
Closes #6080 from cloud-fan/tmp and squashes the following commits:
0fde
Repository: spark
Updated Branches:
refs/heads/branch-1.4 a9d84a9bf -> d2328137f
http://git-wip-us.apache.org/repos/asf/spark/blob/d2328137/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
--
diff --git
a/
[SPARK-3928] [SPARK-5182] [SQL] Partitioning support for the data sources API
This PR adds partitioning support for the external data sources API. It aims to
simplify development of file system based data sources, and provide first class
partitioning support for both read path and write path. E
Repository: spark
Updated Branches:
refs/heads/master 831504cf6 -> 0595b6de8
http://git-wip-us.apache.org/repos/asf/spark/blob/0595b6de/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
--
diff --git
a/sql/
[SPARK-3928] [SPARK-5182] [SQL] Partitioning support for the data sources API
This PR adds partitioning support for the external data sources API. It aims to
simplify development of file system based data sources, and provide first class
partitioning support for both read path and write path. E
Repository: spark
Updated Branches:
refs/heads/branch-1.4 653db0a1b -> a9d84a9bf
[DataFrame][minor] cleanup unapply methods in DataTypes
Author: Wenchen Fan
Closes #6079 from cloud-fan/unapply and squashes the following commits:
40da442 [Wenchen Fan] one more
7d90a05 [Wenchen Fan] cleanup u
Repository: spark
Updated Branches:
refs/heads/master d86ce8458 -> 831504cf6
[DataFrame][minor] cleanup unapply methods in DataTypes
Author: Wenchen Fan
Closes #6079 from cloud-fan/unapply and squashes the following commits:
40da442 [Wenchen Fan] one more
7d90a05 [Wenchen Fan] cleanup unapp
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2bbb685f4 -> 653db0a1b
[SPARK-6876] [PySpark] [SQL] add DataFrame na.replace in pyspark
Author: Daoyuan Wang
Closes #6003 from adrian-wang/pynareplace and squashes the following commits:
672efba [Daoyuan Wang] remove py2.7 feature
4a
Repository: spark
Updated Branches:
refs/heads/master ec6f2a977 -> d86ce8458
[SPARK-6876] [PySpark] [SQL] add DataFrame na.replace in pyspark
Author: Daoyuan Wang
Closes #6003 from adrian-wang/pynareplace and squashes the following commits:
672efba [Daoyuan Wang] remove py2.7 feature
4a148f
Repository: spark
Updated Branches:
refs/heads/master f3e8e6006 -> ec6f2a977
[SPARK-7532] [STREAMING] StreamingContext.start() made to logWarning and not
throw exception
Author: Tathagata Das
Closes #6060 from tdas/SPARK-7532 and squashes the following commits:
6fe2e83 [Tathagata Das] Upda
Repository: spark
Updated Branches:
refs/heads/branch-1.4 56016326c -> 2bbb685f4
[SPARK-7532] [STREAMING] StreamingContext.start() made to logWarning and not
throw exception
Author: Tathagata Das
Closes #6060 from tdas/SPARK-7532 and squashes the following commits:
6fe2e83 [Tathagata Das]
Repository: spark
Updated Branches:
refs/heads/branch-1.4 afe54b76a -> 56016326c
[SPARK-7467] Dag visualization: treat checkpoint as an RDD operation
Such that a checkpoint RDD does not go into random scopes on the UI, e.g.
`take`. We've seen this in streaming.
Author: Andrew Or
Closes #60
Repository: spark
Updated Branches:
refs/heads/master 82e890fb1 -> f3e8e6006
[SPARK-7467] Dag visualization: treat checkpoint as an RDD operation
Such that a checkpoint RDD does not go into random scopes on the UI, e.g.
`take`. We've seen this in streaming.
Author: Andrew Or
Closes #6004 f
Repository: spark
Updated Branches:
refs/heads/branch-1.4 4092a2e85 -> afe54b76a
[SPARK-7485] [BUILD] Remove pyspark files from assembly.
The sbt part of the build is hacky; it basically tricks sbt
into generating the zip by using a generator, but returns
an empty list for the generated files
Repository: spark
Updated Branches:
refs/heads/master 984787526 -> 82e890fb1
[SPARK-7485] [BUILD] Remove pyspark files from assembly.
The sbt part of the build is hacky; it basically tricks sbt
into generating the zip by using a generator, but returns
an empty list for the generated files so t
Repository: spark
Updated Branches:
refs/heads/branch-1.4 af374ed26 -> 4092a2e85
[MINOR] [PYSPARK] Set PYTHONPATH to python/lib/pyspark.zip rather than
python/pyspark
As in PR #5580 we have created pyspark.zip during the build and set PYTHONPATH
to python/lib/pyspark.zip, so to keep consistency upda
Repository: spark
Updated Branches:
refs/heads/master 8a4edecc4 -> 984787526
[MINOR] [PYSPARK] Set PYTHONPATH to python/lib/pyspark.zip rather than
python/pyspark
As in PR #5580 we have created pyspark.zip during the build and set PYTHONPATH
to python/lib/pyspark.zip, so to keep consistency update t
Repository: spark
Updated Branches:
refs/heads/branch-1.4 6523fb81b -> af374ed26
[SPARK-7534] [CORE] [WEBUI] Fix the Stage table when a stage is missing
Just improved the Stage table when a stage is missing.
Before:
![screen shot 2015-05-11 at 10 11 51
am](https://cloud.githubusercontent.co
Repository: spark
Updated Branches:
refs/heads/master 640f63b95 -> 8a4edecc4
[SPARK-7534] [CORE] [WEBUI] Fix the Stage table when a stage is missing
Just improved the Stage table when a stage is missing.
Before:
![screen shot 2015-05-11 at 10 11 51
am](https://cloud.githubusercontent.com/as