GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9216
[SPARK-8658] [SQL] AttributeReference's equals method compares all the
members
This fix changes the equals method of AttributeReference to check all of the
specified fields for equality.
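As a rough illustration of the idea behind this fix (a Python sketch with hypothetical field names; the real change is in Scala, in namedExpressions.scala), equality on an attribute reference should compare every declared member, not just a subset:

```python
from dataclasses import dataclass

# Sketch only: field names are illustrative, not Spark's actual members.
# A dataclass-generated __eq__ compares every declared field, which is the
# behavior the fix aims for: two attribute references are equal only when
# all of their members match.
@dataclass(frozen=True)
class AttributeRef:
    name: str
    data_type: str
    nullable: bool
    expr_id: int = 0
    qualifiers: tuple = ()

a = AttributeRef("col1", "int", True, expr_id=1)
b = AttributeRef("col1", "int", True, expr_id=1)
c = AttributeRef("col1", "int", False, expr_id=1)  # differs only in nullability

print(a == b)  # True: every member matches
print(a == c)  # False: one member differs, so the references are not equal
```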
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9216#issuecomment-150395202
My code change exposes a new defect:
Neither rollup nor cube works correctly, whether or not the build
includes my changes.
Without my
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9314
[SPARK-11360] [Doc] Loss of nullability when writing parquet files
This fix is to add one line to explain the current behavior of Spark SQL
when writing Parquet files. All columns are forced
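The behavior being documented can be sketched as follows (a minimal Python model of a schema, not Spark's actual StructType API): on write, every column's nullability flag is forced to true, so the original nullability is lost.

```python
# Sketch of the documented behavior (hypothetical schema model, not Spark's
# actual API): when writing Parquet files, every column is forced to be
# nullable for compatibility reasons, so the original nullability is discarded.
def force_nullable(schema):
    """Return a copy of the schema with every field marked nullable."""
    return [(name, dtype, True) for (name, dtype, _nullable) in schema]

original = [("id", "bigint", False), ("name", "string", True)]
written = force_nullable(original)
print(written)  # [('id', 'bigint', True), ('name', 'string', True)]
```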
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9216#issuecomment-150475832
Hi, @cloud-fan
Sure. Will do. I am trying to see if I can easily fix it. Anyway, I will
open a JIRA tonight.
Thanks,
Xiao Li
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9216#issuecomment-150486350
The JIRA is opened:
https://issues.apache.org/jira/browse/SPARK-11275
I will continue the investigation on this JIRA issue.
---
If your project is set up
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-155835390
@cloud-fan Before discussing the solution details, let us first talk about
the design issues.
IMO, the `DataFrame` is a query language, kind of a dialect
Github user gatorsmile closed the pull request at:
https://github.com/apache/spark/pull/9385
GitHub user gatorsmile reopened a pull request:
https://github.com/apache/spark/pull/9385
[SPARK-11433] [SQL] Cleanup the subquery name after eliminating subquery
This fix removes the subquery name from the qualifiers after eliminating the
subquery.
You can merge this pull request
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-155605985
@marmbrus
After rechecking the root cause of the Expand failure, I still think we should
clean up the subquery name after subquery elimination. My current fix
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9548
[SPARK-10838][SPARK-11576][SQL][WIP] Incorrect results or exceptions when
using self-joins
When resolving the AttributeReference ambiguity caused by self joins, the
current solution only
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-154881441
Since this solution requires adding quantifier comparison to the equality
check of AttributeReferences, it will fail a couple of test cases in Expand.
We have
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-155183305
I can't fix the problem without a major code change. The current design of
DataFrame has a fundamental problem. When using column references, we might hit
various
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-154911463
To fix these failed cases, I will move the DataFrame's hashCode to the
Column class, instead of directly putting the values into quantifiers.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9314#issuecomment-155334571
Got it, thank you!
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9314#issuecomment-155309645
@marmbrus Should I reopen it? Thanks.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-155973699
Thank you, Hao! Will do it in the next few days.
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9683
[SPARK-11637][SQL] Regression in UDF: exceptions when using Stars and Alias
When using a UDF in Spark SQL, the query fails if a star and an alias are used
at the same time. This worked in 1.4.x
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9683#issuecomment-156319117
The issue has been fixed in https://github.com/apache/spark/pull/9343.
I will close this PR.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-156612181
Hi, @marmbrus
Originally, I thought quantifiers were part of identifiers, like the schema
name in a traditional RDBMS. Based on your explanation, this is not true
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-157239704
Sure. Close it. Thank you for your time!
Github user gatorsmile closed the pull request at:
https://github.com/apache/spark/pull/9385
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9216#discussion_r45011838
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala
---
@@ -194,7 +194,9 @@ case class
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9717#issuecomment-156943032
Thanks!
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-155951403
Please let me know if I need to resolve these conflicts. @cloud-fan
@chenghao-intel @marmbrus @rxin
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-155912100
@cloud-fan So far, we do not have an easy fix, but I believe we should
never return a wrong result for a self join.
Let me post the test case I added
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9761#issuecomment-157414057
@nongli I saw you have a related discussion with @chenghao-intel . The
failed test case was introduced in your PR
https://github.com/apache/spark/pull/9480. I am
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9761#issuecomment-157434770
Ok. I will also add three more lines to cover the new `hashCode` and
`equals` functions.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9081#issuecomment-157440817
@cloud-fan I am wondering whether this will be merged soon.
I am not sure if I should fix a couple of self-join issues before your
merge. Or I should not waste
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-15565
Hi, @marmbrus
After digging into the root cause of the Expand case failures, I found we
still need a deeper cleanup of subqueries after elimination.
Let me
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9548#issuecomment-155226523
@marmbrus Thank you for your suggestions!
That was also my initial idea. I gave it a try last night. Unfortunately, I
hit a problem when adding such a field
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153148409
@hvanhovell Your understanding is right.
If we merge grouping and aggregation together, it will introduce extra
complexity when generating the logical plan
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153146798
@holdenk This is the PR I mentioned in the email. Could you review it too?
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9419
[SPARK-11275][SQL][WIP] Rollup and Cube Generate Incorrect Results
when Aggregation Functions Use Group By Columns
In the current implementation, Rollup and Cube are unable to generate
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153145967
Hi, Rick,
1) This is a defect that I identified; it blocks my PR. It was introduced in
the initial implementation, so it is not a regression.
2
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9385#discussion_r43559826
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1019,7 +1019,16 @@ class Analyzer(
* scoping
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9314#issuecomment-152656067
@marmbrus : as you suggested, I submitted the pull request. Could you
review it?
Thanks,
Xiao Li
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9385
[SPARK-11433] [SQL] Cleanup the subquery name after eliminating subquery
This fix removes the subquery name from the qualifiers after eliminating the
subquery.
You can merge this pull request
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-152708351
So far, I have only observed these strange ghost values when reading the
optimized logical tree; my query did not trigger any issue.
Based on my
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9055#issuecomment-153857289
@jameszhouyi
We hit the same issue. Now, we bypass it by using joins.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9055#issuecomment-153920042
@jameszhouyi
Agree. This is an important feature for any SQL engine. We are also waiting
for this feature. So far, using joins is an alternative to bypass
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153451771
@chenghao-intel @hvanhovell Unit test cases are added. Will finish the code
for resolving the comments by @holdenk @rick-ibm
@rxin @marmbrus @liancheng
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9385#discussion_r43817697
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1019,7 +1019,16 @@ class Analyzer(
* scoping
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/5919#issuecomment-154612297
@rxin @marmbrus This fix is unable to resolve the condition ambiguity for
nested self joins. I also found that self joins could generate incorrect results
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-154650945
@marmbrus Thanks!
I will try to change equals to semanticEquals in the pull request
https://github.com/apache/spark/pull/9216. Then, you can decide
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-154609690
@marmbrus I already hit this issue when resolving
https://issues.apache.org/jira/browse/SPARK-8658. That means, when comparing
two AttributeReferences, we should
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-153529973
@cloud-fan @dbtsai , Jenkins did not start the testing. Could you ask
Jenkins to test it?
Thank you!
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9385#discussion_r43826123
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1019,7 +1019,16 @@ class Analyzer(
* scoping
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9216#issuecomment-153511090
@JoshRosen @cloud-fan
I submitted a pull request for JIRA Spark-11275:
https://github.com/apache/spark/pull/9419
Hopefully, after finishing the problem
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-153620226
@dbtsai Thank you!
Please let me know if you need any extra code change.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9419#discussion_r43850164
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -232,7 +232,7 @@ class Analyzer
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9419#issuecomment-153192515
@rick-ibm Will add more comments to explain it. In particular, I will
emphasize that this design expects the optimizer to collapse these two
projections into a single one
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9419#discussion_r44107333
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -232,7 +232,7 @@ class Analyzer
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9762#issuecomment-157786406
@cloud-fan @marmbrus Will follow your suggestions to update the fix.
Thanks!
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9806#issuecomment-157749632
Your code looks pretty clean to me. Let me share my test cases that this PR
fails.
```
test("joinWith tuple - self join 1") {
val ds = S
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9806#issuecomment-157807682
Sure. Will do. Thanks!
2015-11-18 10:16 GMT-08:00 Michael Armbrust <notificati...@github.com>:
> LGTM, merging to maste
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9762#issuecomment-157857086
@marmbrus @cloud-fan Based on your comments, I made the change. Please
review it.
I also tried the fix after excluding the change
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9717#issuecomment-156738825
retest this please.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9385#issuecomment-156730810
@marmbrus CachedTableSuite failed for the same reason. We did not clean
up the subquery names, so it cannot give a correct result when
deciding
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9717
[SPARK-9928][SQL] Removal of LogicalLocalTable
Has LogicalLocalTable in ExistingRDD.scala been replaced by LocalRelation in
LocalRelation.scala?
Do you know any reason why we still keep
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9717#issuecomment-156738584
The failure of this test case is not related to the code changes.
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9717#issuecomment-156739572
@srowen Could you review the changes? Thanks!
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/9717#issuecomment-156745623
Another case failed for the same reason.
```
[error] Test
org.apache.spark.ml.util.JavaDefaultReadWriteSuite.testDefaultReadWrite failed
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9762
[SPARK-11633] [SQL] HiveContext's Case Insensitivity in Self-Join Handling
When handling self joins, the implementation did not consider the case
insensitivity of HiveContext. It could cause
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/9761
[SPARK-8658] [SQL] [FOLLOW-UP] AttributeReference's equals method compares
all the members
Based on the comment of @cloud-fan , update the AttributeReference's
hashCode function by including
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10018
[SPARK-12028] [SQL] get_json_object returns an incorrect result when the
value is a null literal
When calling `get_json_object` for the following two cases, both results
are `"
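The subtlety at issue, a JSON `null` literal versus a missing key, can be reproduced with Python's standard `json` module (this illustrates the semantics only, not Spark's implementation):

```python
import json

# A JSON null literal is distinct from a missing key: the first should be
# reported as a null value, the second as "no such field". Conflating the
# two is the kind of incorrect result the PR above addresses.
doc = json.loads('{"a": null, "b": 1}')

print("a" in doc)          # True: the key exists...
print(doc["a"] is None)    # True: ...and its value is the null literal
print("c" in doc)          # False: this key is genuinely absent
```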
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10188#discussion_r46917559
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java ---
@@ -386,6 +389,20 @@ public void testNestedTupleEncoder
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10092#issuecomment-161372323
@mateiz Thank you for your answer! Will try to do it soon.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10116#discussion_r46501447
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
---
@@ -149,6 +149,32 @@ private[sql] object SQLMetrics
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10092#discussion_r46520645
--- Diff: python/pyspark/storagelevel.py ---
@@ -49,12 +51,8 @@ def __str__(self):
StorageLevel.DISK_ONLY = StorageLevel(True, False, False
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10092#issuecomment-161515703
Just saw the comments and will change the names soon. Thanks!
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9358#discussion_r46657105
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/Encoder.scala
---
@@ -37,3 +37,120 @@ trait Encoder[T] extends Serializable
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9358#discussion_r46650956
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/Encoder.scala
---
@@ -37,3 +37,120 @@ trait Encoder[T] extends Serializable
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/9358#discussion_r46657310
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/Encoder.scala
---
@@ -37,3 +37,120 @@ trait Encoder[T] extends Serializable
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10092#issuecomment-161514977
- Removed all the constants whose `deserialized` values are true.
- Updated the comments of StorageLevel.
- Changed the default storage levels of Kinesis level
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10092#issuecomment-161522366
Based on the comments of @mateiz , the extra changes are made:
- Renaming MEMORY_ONLY_SER to MEMORY_ONLY
- Renaming MEMORY_ONLY_SER_2 to MEMORY_ONLY_2
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10092#discussion_r46522595
--- Diff: python/pyspark/storagelevel.py ---
@@ -49,12 +51,8 @@ def __str__(self):
StorageLevel.DISK_ONLY = StorageLevel(True, False, False
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162286394
@felixcheung @sun-rui Thank you! Based on your comments, I made the changes.
Please review them. : )
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162328075
@felixcheung I am not sure whether we need to add a test case for `sample`.
Normally, using a specific seed is the common way to verify the result of
`sample
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10165
[SPARK-12164] [SQL] Display the binary/encoded values
When the dataset is encoded, the existing display looks strange. Decimal
format is not common when the type is binary
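The display concern can be illustrated in plain Python (purely illustrative; Spark's actual display logic lives in the Scala Dataset code): binary values are conventionally shown in hex rather than as a list of decimal bytes.

```python
payload = b"Spark"

# Decimal rendering of raw bytes, the kind of display the PR calls strange:
decimal_form = str(list(payload))
# Hex rendering, the more conventional way to present a binary value:
hex_form = payload.hex()

print(decimal_form)  # [83, 112, 97, 114, 107]
print(hex_form)      # 537061726b
```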
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162286420
ok to test
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10149
[SPARK-12150] [SQL] [Minor] Add range API without specifying the slice
number
For usability, add another sqlContext.range() method. Users can specify
start, end, and step without specifying
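The convenience being proposed can be mimicked in plain Python (a sketch of the signature idea only; the parameter names and the default slice count are assumptions, not Spark's actual code): when the caller omits the slice number, fall back to a context default.

```python
def sql_range(start, end, step=1, num_slices=None, default_parallelism=8):
    """Sketch of a range API with an optional slice count: when the caller
    omits num_slices, fall back to a context-provided default."""
    if num_slices is None:
        num_slices = default_parallelism  # assumed default, per the PR's intent
    values = list(range(start, end, step))
    # Split the values into num_slices roughly equal chunks.
    size, rem = divmod(len(values), num_slices)
    chunks, i = [], 0
    for s in range(num_slices):
        j = i + size + (1 if s < rem else 0)
        chunks.append(values[i:j])
        i = j
    return chunks

parts = sql_range(0, 10, 2, num_slices=2)
print(parts)  # [[0, 2, 4], [6, 8]]
```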
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10165#issuecomment-162400140
I had the exact same question when calling the show function. From a user's
perspective, they might not care about the encoded values at all when
calling
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162415274
@felixcheung @shivaram Sure, just added that test case. Please review it.
Thank you! : )
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10160#discussion_r46788421
--- Diff: R/pkg/R/DataFrame.R ---
@@ -677,13 +677,15 @@ setMethod("unique",
#' collect(sample(df, TRUE, 0.5))
#'}
setMeth
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10160#discussion_r46789803
--- Diff: R/pkg/R/DataFrame.R ---
@@ -692,8 +696,8 @@ setMethod("sample",
setMethod("sample_frac",
sign
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162428939
@shivaram @felixcheung @sun-rui Thank you everyone! Hopefully, my code
changes resolve all your concerns. I learned a lot from you! : )
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10160#discussion_r46789542
--- Diff: R/pkg/R/DataFrame.R ---
@@ -692,8 +696,8 @@ setMethod("sample",
setMethod("sample_frac",
sign
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10149#issuecomment-163093870
@marmbrus @cloud-fan This PR changes the external API. I am not sure whether
this will be merged, or whether we should revisit it after the 1.6 release. Thank you!
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10215
[SPARK-12164] [SQL] Decode the encoded values and then display
Based on the suggestions from @marmbrus @cloud-fan in
https://github.com/apache/spark/pull/10165 , this PR is to print the decoded
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10184#discussion_r47046420
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -429,18 +432,18 @@ class Dataset[T] private[sql
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10184#discussion_r47046431
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -67,15 +67,21 @@ class Dataset[T] private[sql](
tEncoder: Encoder[T
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10184#discussion_r47046739
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -67,15 +67,21 @@ class Dataset[T] private[sql](
tEncoder: Encoder[T
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10214
[SPARK-12188] [SQL] [FOLLOW-UP] Code refactoring and comment correction in
Dataset APIs
@marmbrus This PR is to address your comment. Thanks for your review!
You can merge this pull request
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10165#issuecomment-163093753
Thank you! @cloud-fan
Will this PR be merged to 1.6? Or waiting for another PR for showing
decoded values? @marmbrus Thank you!
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/10160
[SPARK-12158] [R] [SQL] Fix 'sample' functions that break R unit test cases
The existing sample functions are missing the parameter 'seed'; however, the
corresponding function interface in `generics
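The role of the missing `seed` parameter is easy to demonstrate outside R (Python's `random` module here, purely to show the principle; SparkR's API is not involved): with an explicit seed, sampling is reproducible, which is what makes seed-based unit tests stable.

```python
import random

# Hypothetical helper: sample each row independently with the given fraction.
# With an explicit seed, two runs draw exactly the same rows.
def sample_rows(rows, fraction, seed):
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < fraction]

rows = list(range(100))
first = sample_rows(rows, 0.5, seed=42)
second = sample_rows(rows, 0.5, seed=42)
print(first == second)  # True: same seed, same sample
```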
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10155#issuecomment-162140125
Weird...
My code changes are not related to the failed test case in SparkR.
```
count(sampled3) < 3 isn't true
```
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10155#issuecomment-162140292
retest this please
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10155#issuecomment-162148385
Found a bug in the function `sample` of R. Will submit a PR later. Thanks!
Github user gatorsmile commented on the pull request:
https://github.com/apache/spark/pull/10160#issuecomment-162241856
@davies Could you take a look at this PR? Thank you!
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10060#discussion_r46624363
--- Diff: docs/sql-programming-guide.md ---
@@ -9,18 +9,51 @@ title: Spark SQL and DataFrames
# Overview
-Spark SQL is a Spark module