Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1447#discussion_r15037134
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -571,12 +571,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1447#discussion_r15037164
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -712,8 +701,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1450#discussion_r15038311
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -361,11 +361,11 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1447#discussion_r15038340
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1450#issuecomment-49251632
Pushed a new version.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1450#discussion_r15040336
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -214,7 +214,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1452#issuecomment-49259394
That was actually my main concern from the beginning with this change. From
my initial observation everything does seem to work. I intentionally avoided
keeping references
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1452#issuecomment-49259482
Yes - actions were intentionally not broadcast for now. It makes it more
complicated ... let's do that in a separate PR.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1450#issuecomment-49261722
Eh the binary checker is really failing me. Is there a way to disable
binary checker for inner functions? @pwendell
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1447#discussion_r15042897
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1262#issuecomment-49263949
I pushed a new version. I'd first merge this and then have a separate PR to
index the hash table by stageId + attempt.
Now it includes @kayousterhout's change
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1450#discussion_r15043414
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -214,7 +214,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1447#discussion_r15044062
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -712,8 +701,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1450#issuecomment-49350307
Merged in master.
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/1469
[SPARK-2534] Avoid pulling in the entire RDD in various operators
(branch-1.0 backport)
This backports #1450 into branch-1.0.
You can merge this pull request into a Git repository by running
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1469#issuecomment-49373136
Jenkins, retest this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1469#issuecomment-49379863
Merging in master.
Github user rxin closed the pull request at:
https://github.com/apache/spark/pull/1469
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1262#issuecomment-49387996
Merging in master. Thanks for reviewing.
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/1478
Reservoir sampling implementation.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark reservoirSample
Alternatively you can review
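Reservoir sampling draws a uniform random sample of k items from a stream in a single pass, without knowing the stream's length up front. The PR's actual code is not shown in this archive; the following is a minimal sketch of the classic Algorithm R, with a hypothetical helper name and signature (illustrative only, not the PR's implementation):

```scala
import scala.reflect.ClassTag
import scala.util.Random

// Algorithm R: uniform sample of up to k items from a stream, in one pass.
// The first k items fill the reservoir; item i (0-based, i >= k) then
// replaces a random slot with probability k / (i + 1).
def reservoirSample[T: ClassTag](input: Iterator[T], k: Int, seed: Long = 42L): Array[T] = {
  val rand = new Random(seed)
  val reservoir = new Array[T](k)
  var i = 0
  while (input.hasNext) {
    val item = input.next()
    if (i < k) {
      reservoir(i) = item
    } else {
      val j = rand.nextInt(i + 1)
      if (j < k) reservoir(j) = item
    }
    i += 1
  }
  // If the stream was shorter than k, return only what we saw.
  if (i < k) reservoir.take(i) else reservoir
}
```

Each item survives in the final sample with probability exactly k/n, which is why a single pass suffices even when n is unknown.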
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1477#issuecomment-49397603
Jenkins, test this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1475#issuecomment-49399326
Merging. Thanks.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1473#discussion_r15098078
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -135,7 +135,7 @@ class RangePartitioner[K : Ordering : ClassTag, V](
val k
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1473#discussion_r15098185
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -135,7 +135,7 @@ class RangePartitioner[K : Ordering : ClassTag, V](
val k
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1479#issuecomment-49404215
Merging this ...
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1478#issuecomment-49404318
Jenkins, retest this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1478#issuecomment-49471598
Merging in master.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1477#issuecomment-49501450
Jenkins, test this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1452#issuecomment-49501642
Thanks for taking a look. I'm merging this one as is, and will submit a
small PR to fix the issues.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1452#discussion_r15142372
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala
---
@@ -17,134 +17,68 @@
package org.apache.spark.scheduler
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1477#issuecomment-49503209
Thanks. Merging in master.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1490#discussion_r15145612
--- Diff:
core/src/main/scala/org/apache/spark/network/MessageChunkHeader.scala ---
@@ -41,6 +42,13 @@ private[spark] class MessageChunkHeader
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1490#discussion_r15145614
--- Diff:
core/src/main/scala/org/apache/spark/network/MessageChunkHeader.scala ---
@@ -67,13 +75,20 @@ private[spark] object MessageChunkHeader {
val
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1452#issuecomment-49532568
Apparently this broke the build. Reverting and will work on a fix.
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/1498
[SPARK-2521] Broadcast RDD object (instead of sending it along with every
task)
This is a resubmission of #1452. It was reverted because it broke the build.
Currently (as of Spark 1.0.1
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1473#issuecomment-49539328
I filed a JIRA: https://issues.apache.org/jira/browse/SPARK-2598
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/1500
[SPARK-2598] RangePartitioner's binary search does not use the given
Ordering
We should fix this in branch-1.0 as well.
You can merge this pull request into a Git repository by running:
$ git
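The bug behind SPARK-2598: the partitioner's binary search compared keys by their natural ordering rather than the Ordering supplied to RangePartitioner, so custom orderings were silently ignored. A hedged sketch of a search that threads the caller's Ordering through every comparison (illustrative only, not the patch's actual code):

```scala
// Binary search over sorted range bounds that uses the caller-supplied
// Ordering for every comparison, instead of the keys' natural ordering.
// Returns the index if found, else -(insertionPoint + 1), following the
// java.util.Arrays.binarySearch convention.
def binarySearch[K](bounds: Array[K], key: K)(implicit ord: Ordering[K]): Int = {
  var lo = 0
  var hi = bounds.length - 1
  while (lo <= hi) {
    val mid = (lo + hi) >>> 1
    val c = ord.compare(bounds(mid), key)
    if (c == 0) return mid
    else if (c < 0) lo = mid + 1
    else hi = mid - 1
  }
  -(lo + 1)
}
```

With bounds sorted under a reversed Ordering, the same search works only if that reversed Ordering is the one passed in — which is exactly the property the original code lost.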
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1473#issuecomment-49539660
@dorx can you close this PR? #1500 includes the change here.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15147996
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,37 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15148000
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,37 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15148004
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,37 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15148009
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,37 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15148014
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,37 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15148079
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15148086
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,390 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1493#issuecomment-49540207
Jenkins, retest this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1492#issuecomment-49540212
Jenkins, retest this please.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1465#discussion_r15148111
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1463,12 +1463,13 @@ object SparkContext extends Logging {
// Regular
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1465#discussion_r15148112
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1477,7 +1478,8 @@ object SparkContext extends Logging {
def localCpuCount
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1465#discussion_r15148114
--- Diff: docs/configuration.md ---
@@ -599,6 +599,15 @@ Apart from these, the following properties are also
available, and may be useful
td
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1447#issuecomment-49540356
Merging this in master.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1418#issuecomment-49540404
Thanks for submitting this. I think we can still stack overflow in
serialization, but I agree it's better to do this non-recursively.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1418#issuecomment-49540416
Actually it's late. I will review this tomorrow.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-49540472
I've talked to many JVM developers (engineers who work on the JVM) and
while System.gc is advisory in the spec, it is actually a pretty reliable way
of triggering GC
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1500#issuecomment-49554380
Merged in master and branch-1.0.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1492#issuecomment-49557834
Merging in master.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1502#issuecomment-49559170
Cool. What about P^3 sort? :)
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1500#issuecomment-49565928
0.9.x doesn't have this problem because there was no binary search.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1498#issuecomment-49571549
Jenkins, retest this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1505#issuecomment-49572014
@davies can you take a look?
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1497#discussion_r15155562
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -152,6 +155,34 @@ class Analyzer(catalog: Catalog, registry
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1502#issuecomment-49575156
He did it!
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/113#issuecomment-37904304
We are reverting this pull request in #167
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/167#issuecomment-37906590
Ok I merged this.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718417
--- Diff: core/src/main/scala/org/apache/spark/util/collection/BitSet.scala
---
@@ -88,6 +88,53 @@ class BitSet(numBits: Int) extends Serializable
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718405
--- Diff: core/src/main/scala/org/apache/spark/util/collection/BitSet.scala
---
@@ -88,6 +88,53 @@ class BitSet(numBits: Int) extends Serializable
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718473
--- Diff: core/src/main/scala/org/apache/spark/util/collection/BitSet.scala
---
@@ -88,6 +88,53 @@ class BitSet(numBits: Int) extends Serializable
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718582
--- Diff: core/src/main/scala/org/apache/spark/util/collection/BitSet.scala
---
@@ -88,6 +88,53 @@ class BitSet(numBits: Int) extends Serializable
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718634
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala ---
@@ -69,4 +69,45 @@ class BitSetSuite extends FunSuite {
assert
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/172#discussion_r10718623
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala ---
@@ -69,4 +69,45 @@ class BitSetSuite extends FunSuite {
assert
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/172#issuecomment-37970799
Hi @petko-nikolov,
Thanks a lot for contributing this patch! I left some comments to help the
code conform to Spark coding style, and on test coverage. It would
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10738878
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/PCA.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10738906
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/PCA.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10738948
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/PCA.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10738993
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/LAUtils.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739033
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/LAUtils.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739070
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/PCASuite.scala
---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739083
--- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/PCASuite.scala
---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739114
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/SVD.scala ---
@@ -142,17 +172,189 @@ object SVD {
val vsirdd = sc.makeRDD(Array.tabulate
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739138
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/SVD.scala ---
@@ -38,18 +40,49 @@ class SVD
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739162
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/SVD.scala ---
@@ -142,17 +172,189 @@ object SVD {
val vsirdd = sc.makeRDD(Array.tabulate
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739195
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/SVD.scala ---
@@ -142,17 +172,189 @@ object SVD {
val vsirdd = sc.makeRDD(Array.tabulate
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739236
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/SVD.scala ---
@@ -142,17 +172,189 @@ object SVD {
val vsirdd = sc.makeRDD(Array.tabulate
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/88#issuecomment-38024310
Hi @rezazadeh
Thanks for working on this! I can't wait for this to be merged and improve
the coverage on common ml algorithms in mllib.
I am not really
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/88#issuecomment-38024368
Oh and I didn't go through all files for styles and readability. I'm sure
you can look at the rest and figure them out yourself. Thanks!
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739371
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/PCA.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/88#discussion_r10739380
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/PCA.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/170#issuecomment-38028082
Thanks. I merged this in master and branch-0.9 (fyi @pwendell)
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/193#issuecomment-38336273
Thanks. I've merged this.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/200#issuecomment-38336303
Thanks. Merged.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/208#issuecomment-38415798
```scala
package org.apache.spark.sql
package catalyst
```
vs
```scala
package org.apache.spark.sql.catalyst
```
There are three reasons I
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/229
Use Guava's top k implementation rather than our BoundedPriorityQueue based
implementation
Also updated the documentation for top and takeOrdered.
On my simple test of sorting 100 million (Int
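For context, the BoundedPriorityQueue-based approach being compared against keeps the k largest items in a size-bounded min-heap: the heap root is the smallest retained item, and each new item either displaces it or is discarded. A minimal sketch of that baseline (illustrative only, not Spark's BoundedPriorityQueue):

```scala
import scala.collection.mutable

// Top-k via a size-bounded min-heap: O(n log k) time, O(k) space.
// Reversing the Ordering turns Scala's max-heap PriorityQueue into a
// min-heap, so heap.head is always the smallest of the k items kept.
def topK[T](input: Iterator[T], k: Int)(implicit ord: Ordering[T]): Seq[T] = {
  val heap = mutable.PriorityQueue.empty[T](ord.reverse)
  input.foreach { item =>
    if (heap.size < k) heap.enqueue(item)
    else if (ord.gt(item, heap.head)) {
      heap.dequeue()   // drop the current smallest retained item
      heap.enqueue(item)
    }
  }
  heap.dequeueAll.reverse // largest first
}
```

Guava's top-k (for example `Ordering.greatestOf`) follows the same asymptotics but with a more cache-friendly implementation, which is presumably what the benchmark in this PR measured.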
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/233
StopAfter / TopK related changes
1. Renamed StopAfter to Limit to be more consistent with naming in other
relational databases.
2. Renamed TopK to TakeOrdered to be more consistent with Spark RDD
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/233#issuecomment-38648471
Jenkins, retest this please.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/229#issuecomment-38648982
Weird, I missed that. Fixed.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/233#issuecomment-38653186
Jenkins, retest this please.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/164#discussion_r10965746
--- Diff:
mllib/src/main/java/org/apache/spark/mllib/input/WholeTextFileRecordReader.java
---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/164#discussion_r10965749
--- Diff:
mllib/src/main/java/org/apache/spark/mllib/input/WholeTextFileRecordReader.java
---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/164#discussion_r10965782
--- Diff: project/SparkBuild.scala ---
@@ -358,7 +358,7 @@ object SparkBuild extends Build {
def mllibSettings = sharedSettings ++ Seq(
name
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/164#discussion_r10965793
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/MLContext.scala ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under