[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2016-01-11 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-170755883 @gatorsmile we will revisit this in the future. Do you mind closing the pull request for now? --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2016-01-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-170756672 Ok, let me close it. Thank you!

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2016-01-11 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/9548

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-155835390 @cloud-fan Before discussing the solution details, let us first talk about the design issues. IMO, the `DataFrame` is a query language, kind of a dialect

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-11 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-155912100 @cloud-fan So far, we do not have an easy fix, but I believe we should never return a wrong result for a self-join. Let me post the test case I added. This

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-09 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-155183305 I can't fix the problem without a major code change. The current design of `DataFrame` has a fundamental problem. When using column references, we might hit various

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-09 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-155226523 @marmbrus Thank you for your suggestions! That is similar to my initial idea. I tried it last night. Unfortunately, I hit a problem when adding such a field

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154881540 **[Test build #45319 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45319/consoleFull)** for PR 9548 at commit

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/9548 [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect results or exceptions when using self-joins When resolving the attributeReference's ambiguity caused by self joins, the current solution only

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154883241 **[Test build #45319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45319/consoleFull)** for PR 9548 at commit

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154883261 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154881441 Since this solution requires adding quantifier comparison into the equation of attributeReferences, this will fail a couple test cases in expand. We have

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154879753 Merged build triggered.

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154879759 Merged build started.

[GitHub] spark pull request: [SPARK-10838][SPARK-11576][SQL][WIP] Incorrect...

2015-11-08 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/9548#issuecomment-154911463 To fix these failed cases, I will move the `DataFrame`'s hashCode to the Column class, instead of putting the values directly into the quantifiers.