Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81668841
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala
---
@@ -56,7 +57,12 @@ case class StateStoreRestoreExec
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81672040
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -136,16 +139,30 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81672432
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -317,15 +358,18 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15307#discussion_r81684775
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -136,16 +139,30 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15352#discussion_r82085912
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -207,13 +207,18 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r83524216
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala
---
@@ -30,16 +30,37 @@ trait Source {
/** Returns the
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r83524491
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -92,21 +105,64 @@ class TextSocketSource(host: String, port
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
Sorry, I missed the last few email notifications about this PR. I've merged
with the head version and made updates to address the most recent round of
review comments. Currently running regre
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
All my changes are in now, and regression tests pass. As far as I can see,
all the review comments have been addressed at this point.
---
If your project is set up for it, you can reply to this
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r84138507
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala
---
@@ -30,16 +30,30 @@ trait Source {
/** Returns the
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
I've been running tests since this morning; should have updates in soon.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r84538526
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -336,17 +342,27 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r84539662
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -336,17 +342,27 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r84569335
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -337,17 +343,27 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r85227658
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala
---
@@ -111,6 +126,23 @@ case class MemoryStream[A : Encoder](id
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r85227714
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala
---
@@ -111,6 +126,23 @@ case class MemoryStream[A : Encoder](id
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
Updated the branch and addressed new review comments. Looks like my last
push missed a one-line change to memory.scala. Tests are running now.
Github user frreiss closed the pull request at:
https://github.com/apache/spark/pull/15162
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15162
Closing the PR.
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15027
When I comment out line 155 in HDFSMetadataLog.scala on this branch (`if
(fileManager.exists(crcPath)) fileManager.delete(crcPath)`) and run the test
case attached to this PR, the test case fails
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15027
@viirya to answer your question re deleting vs moving the files: Deleting
is easier to implement, because once the .crc file is deleted, you can be sure
it won't appear again. Moving the che
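To illustrate the "delete" option discussed above, here is a minimal sketch. It is not the code from this PR: it assumes a local filesystem reached through `java.nio.file` instead of Spark's internal `fileManager`, and the names `CrcCleanup`, `crcPathFor`, and `deleteCrcIfPresent` are hypothetical. It relies on the Hadoop convention that the checksum for `dir/name` is stored in a hidden sidecar file `dir/.name.crc`.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, for illustration only; not part of Spark.
object CrcCleanup {
  // Hadoop's checksummed filesystems keep the checksum for `dir/name`
  // in a hidden sidecar file named `dir/.name.crc`.
  def crcPathFor(file: Path): Path =
    file.resolveSibling("." + file.getFileName.toString + ".crc")

  // Delete the sidecar if it exists. Returns true when a file was removed.
  def deleteCrcIfPresent(file: Path): Boolean =
    Files.deleteIfExists(crcPathFor(file))

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("metadata-log")
    // Simulate a metadata log entry and the stray checksum file next to it.
    val batch = Files.write(dir.resolve("0"), "batch 0".getBytes("UTF-8"))
    Files.write(crcPathFor(batch), Array[Byte](0, 1, 2, 3))

    println(deleteCrcIfPresent(batch)) // the sidecar existed and was removed
    println(deleteCrcIfPresent(batch)) // nothing left to remove
  }
}
```

Deleting is idempotent in this sketch: a second call finds no sidecar and returns `false`, which matches the point that a deleted .crc file does not come back.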
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r76498223
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MetadataLog.scala
---
@@ -48,4 +49,13 @@ trait MetadataLog[T] {
* Return
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r76498251
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala
---
@@ -244,6 +250,21 @@ class StreamExecution
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r76498301
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala
---
@@ -24,21 +24,24 @@ import java.text.SimpleDateFormat
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14553#discussion_r76498637
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -727,6 +732,48 @@ class FileStreamSourceSuite extends
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13513#discussion_r76499068
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
---
@@ -129,3 +131,86 @@ class FileStreamSource
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14802
LGTM. I have written nearly the exact same thing as part of
[https://github.com/apache/spark/pull/14553], but can use this version of the
method instead.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14773#discussion_r76503740
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala
---
@@ -65,7 +65,7 @@ case class
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14691#discussion_r76504239
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
---
@@ -123,12 +124,30 @@ final class DataStreamWriter[T] private
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14691#discussion_r76505064
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
---
@@ -123,12 +124,30 @@ final class DataStreamWriter[T] private
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
@rxin and @marmbrus, would it be possible to get this PR reviewed soon? I
can split it into smaller chunks if that would make things easier; I just need
to know.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/14803#discussion_r76646983
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
---
@@ -129,13 +129,20 @@ class FileStreamSource
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/14870
[SPARK-17303] Added spark-warehouse to dev/.rat-excludes
## What changes were proposed in this pull request?
Excludes the `spark-warehouse` directory from the Apache RAT checks that
src
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14803
LGTM
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14553
@ScrapCodes, would you mind triggering a build of this PR?
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/14945
[SPARK-17386] Set default trigger interval to 1/10 second
## What changes were proposed in this pull request?
This pull request implements the most expedient change to fix SPARK-17386
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/14986
[WIP] [SPARK-17421] Don't use -XX:MaxPermSize option when Java version >= 8
## What changes were proposed in this pull request?
Modifies the `build/mvn` and `build/sbt-launch-
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14986
Makes sense. I will close this PR and just add a clarification to the
documentation.
Github user frreiss closed the pull request at:
https://github.com/apache/spark/pull/14986
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/15005
[SPARK-17421] Documenting the current treatment of MAVEN_OPTS.
## What changes were proposed in this pull request?
Modified the documentation to clarify that `build/mvn` and `pom.xml
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/14945
On a closer reading of the code, there is a more expedient fix: change the
default STREAMING_POLLING_DELAY parameter. Will redo.
Github user frreiss closed the pull request at:
https://github.com/apache/spark/pull/14945
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/15027
[SPARK-17475] [STREAMING] Delete CRC files if the filesystem doesn't use
checksum files
## What changes were proposed in this pull request?
When the metadata logs for various par
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15005
Sure, I'll redo that part so that it includes two sets of recommended options.
Note that docs in the Spark 2.0.0 release say that these options aren't
necessary for Java 8.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15027#discussion_r78469982
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala
---
@@ -146,6 +146,11 @@ class HDFSMetadataLog[T: ClassTag
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/13513
You could just move the metadata deletion logic from FileStreamSinkLog into
CompactibleFileStreamLog. Then FileStreamSource could issue DELETE log records
for files that are older than
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/15067
[SPARK-17513] [STREAMING] [SQL] Make StreamExecution garbage-collect its
metadata
## What changes were proposed in this pull request?
This PR modifies StreamExecution such that it
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/13513
Ah, now I fully understand @zsxwing's earlier comment about the semantics
of `Source.getBatch()`. Those semantics have a design flaw;
see the email thread I started at
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15005
Quick update: I'm running a series of test builds with various parameters
to determine what parts of MAVEN_OPTS are currently necessary on different
versions of Java. Will report back in a few
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15005
I've about narrowed down the options that work for OpenJDK 7 and 8 on Mac
and Linux. Working on IBM Java on Linux. I can have an update in by EOD today.
BTW, one thing that's been
Github user frreiss closed the pull request at:
https://github.com/apache/spark/pull/15067
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15067#discussion_r79662093
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -125,6 +125,32 @@ class StreamingQuerySuite extends
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/15162
[SPARK-17386] [STREAMING] [WIP] Make polling rate adaptive
## What changes were proposed in this pull request?
This change makes the scheduler in `StreamExecution` adjust its rate of
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15166#discussion_r79722643
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -125,6 +125,30 @@ class StreamingQuerySuite extends
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15166#discussion_r79730664
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -125,6 +125,30 @@ class StreamingQuerySuite extends
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15166#discussion_r79730904
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala
---
@@ -125,6 +125,30 @@ class StreamingQuerySuite extends
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15005
Summary of testing:
- On Java 8, the build fails intermittently with OOM when `-Xmx2g` is
omitted
- The `-XX:ReservedCodeCacheSize=512m` argument prevents warnings on both
Java 7 and
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15005#discussion_r79765892
--- Diff: docs/building-spark.md ---
@@ -16,24 +16,31 @@ Building Spark using Maven requires Maven 3.3.9 or
newer and Java 7+.
### Setting up
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15005#discussion_r79766509
--- Diff: docs/building-spark.md ---
@@ -16,24 +16,27 @@ Building Spark using Maven requires Maven 3.3.9 or
newer and Java 7+.
### Setting up
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15005#discussion_r79902326
--- Diff: docs/building-spark.md ---
@@ -16,24 +16,32 @@ Building Spark using Maven requires Maven 3.3.9 or
newer and Java 7+.
### Setting up
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15005
Thanks @srowen for all the thoughtful comments! It's great to see
committers spending time to help improve the build experience for new
developers.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15262#discussion_r80826485
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -330,15 +353,42 @@ class FileStreamSourceSuite extends
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/15258#discussion_r80838376
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -50,6 +50,19 @@ class ListingFileCatalog
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15258
This change allows FileInputStream to consume partial outputs of a system
such as Hadoop or another copy of Spark, provided that the system adheres
rigidly to the write policy of recent versions of
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/15262
LGTM overall. We may want to switch more of the test cases to use HDFS in a
follow-on JIRA.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66478261
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/13155
@rxin I'll have an updated set of changes in tonight
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66509405
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66509444
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66509510
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66539170
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66558119
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66558125
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66560868
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66560947
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66561017
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66561815
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r66564793
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/13155
Updated changes are in. Running a full regression suite overnight.
Github user frreiss commented on the issue:
https://github.com/apache/spark/pull/13155
Tests ran successfully on my machine.
Github user frreiss commented on the pull request:
https://github.com/apache/spark/pull/13155#issuecomment-221336991
Could one of the committers please trigger another build on this PR? The
change set passes all the tests on my machine, but it's good to be safe.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64941953
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64942480
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64942870
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64943404
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64944724
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64995985
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r64996012
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on the pull request:
https://github.com/apache/spark/pull/13155
Thanks @hvanhovell for the additional pass of review! I'll be preparing my
slides for Spark Summit all day today but will come back to this PR as soon as
that's done.
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r65454461
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r65583548
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r65584546
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1695,16 +1695,176 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/12574#discussion_r60958325
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -218,11 +292,135 @@ class ALSModel private[ml] (
predict
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/12574#discussion_r60958628
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
---
@@ -261,58 +261,93 @@ object
Github user frreiss commented on the pull request:
https://github.com/apache/spark/pull/12574#issuecomment-214463888
LGTM
GitHub user frreiss opened a pull request:
https://github.com/apache/spark/pull/13155
[SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubquery rule to fix
COUNT bug
## What changes were proposed in this pull request?
This pull request fixes the COUNT bug in the
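For readers unfamiliar with the term, the general COUNT bug can be sketched with plain Scala collections. This is an illustration of the classic problem only, not the optimizer rule changed in this PR; the object `CountBugDemo`, the `Order` class, and the sample data are all hypothetical. The query modelled is roughly "for each customer, the number of matching orders", with `None` standing in for SQL NULL.

```scala
// Hypothetical illustration of the COUNT bug; not Spark code.
object CountBugDemo {
  case class Order(customer: String)

  val customers: Seq[String] = Seq("alice", "bob")
  val orders: Seq[Order] = Seq(Order("alice"), Order("alice"))

  // Reference semantics: run the COUNT subquery once per outer row.
  // COUNT over an empty input is 0, so "bob" gets Some(0).
  def correlated: Seq[(String, Option[Long])] =
    customers.map(c => (c, Some(orders.count(_.customer == c).toLong)))

  // Naive decorrelation: aggregate first, then left outer join.
  // A customer with no orders has no group, so the join yields NULL.
  def naiveRewrite: Seq[(String, Option[Long])] = {
    val counts: Map[String, Long] =
      orders.groupBy(_.customer).map { case (k, v) => (k, v.size.toLong) }
    customers.map(c => (c, counts.get(c)))
  }

  // One possible repair: for unmatched outer rows, substitute the value
  // the aggregate would produce on empty input (0 for COUNT).
  def fixedRewrite: Seq[(String, Option[Long])] =
    naiveRewrite.map { case (c, n) => (c, n.orElse(Some(0L))) }

  def main(args: Array[String]): Unit = {
    println(correlated)
    println(naiveRewrite)
    println(fixedRewrite)
  }
}
```

In this sketch the naive rewrite disagrees with the reference result only for "bob", the outer row with no match, which is exactly the case where the bug shows up.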
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r63756577
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1648,16 +1648,56 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r63757089
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1648,16 +1648,56 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r63759450
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1648,16 +1648,56 @@ object
Github user frreiss commented on a diff in the pull request:
https://github.com/apache/spark/pull/13155#discussion_r63767862
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
---
@@ -293,4 +293,65 @@ class SubquerySuite extends QueryTest with