[GitHub] spark pull request: [SPARK-9935][SQL] EqualNotNull not processed i...

2015-08-13 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8163 [SPARK-9935][SQL] EqualNotNull not processed in ORC https://issues.apache.org/jira/browse/SPARK-9935 You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-10035][SQL] Parquet filters does not pr...

2015-08-18 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8275 [SPARK-10035][SQL] Parquet filters does not process EqualNullSafe filter. As I talked with Lian, 1. I added EqualNullSafe to ParquetFilters - It uses the same equality comparison
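For readers unfamiliar with the filter named in this pull request: a null-safe equality test (SQL's `<=>`) differs from ordinary equality only in how it treats nulls. The Java sketch below illustrates those semantics in isolation. The class and method names are hypothetical, and this is not the code added to ParquetFilters.

```java
import java.util.Objects;

// Hypothetical illustration of SQL equality semantics; not Spark or Parquet code.
final class NullSafeEqualitySketch {

    // Ordinary SQL equality: the result is unknown (modelled here as null)
    // whenever either operand is null.
    static Boolean equalTo(Object left, Object right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    // Null-safe equality: never unknown; two nulls compare as equal.
    static boolean equalNullSafe(Object left, Object right) {
        return Objects.equals(left, right);
    }

    public static void main(String[] args) {
        System.out.println(equalTo(null, null));       // unknown
        System.out.println(equalNullSafe(null, null)); // true
        System.out.println(equalNullSafe(1, null));    // false
    }
}
```

The only cases where the two functions disagree are those with at least one null operand, which is why a data source can often reuse its ordinary equality comparison for non-null values.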

[GitHub] spark pull request: [SPARK-10035][SQL] Parquet filters does not pr...

2015-08-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8275#issuecomment-132819486 Could you merge this :) ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-24 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8391 [SPARK-10180] [SQL] JDBCRDD does not process EqualNullSafe filter. https://issues.apache.org/jira/browse/SPARK-10180

[GitHub] spark pull request: [SPARK-9814][SQL] EqualNotNull not passing to ...

2015-08-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8096 [SPARK-9814][SQL] EqualNotNull not passing to data sources

[GitHub] spark pull request: [SPARK-9814][SQL] EqualNotNull not passing to ...

2015-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8096#discussion_r36715585 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/filters.scala --- @@ -38,6 +38,14 @@ case class EqualTo(attribute: String, value: Any

[GitHub] spark pull request: [SPARK-9814][SQL] EqualNotNull not passing to ...

2015-08-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8096#issuecomment-129835273 Reset this please :)

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8391#issuecomment-134615781 Hm.. one thing I want to say is, it looks like there is no test code for JdbcRelation. So I tested this with separate copied functions. It looks just about text

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-10-28 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9327 [SPARK-11103][SQL] Filter applied on Merged Parquet schema with new column fail When enabling mergedSchema and predicate filter, this fails since Parquet filters are pushed down regardless

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-10-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9327#issuecomment-152054929 /cc @liancheng

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155720490 I will try to find and test them first tomorrow before adding a commit!

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9517#issuecomment-155280757 I used `sortBy` instead of `sortWith`

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9517#issuecomment-154890873 retest this please

[GitHub] spark pull request: [SPARK-9557][SQL] Refactor ParquetFilterSuite ...

2015-11-08 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9554 [SPARK-9557][SQL] Refactor ParquetFilterSuite and remove old ParquetFilters code Actually this was resolved by https://github.com/apache/spark/pull/8275. But I found the JIRA issue

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9517#discussion_r44247117 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala --- @@ -461,13 +461,29 @@ private[sql] class

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9517#discussion_r44247087 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/sources/ParquetHadoopFsRelationSuite.scala --- @@ -155,4 +155,22 @@ class

[GitHub] spark pull request: [SPARK-10113][SQL] Support for unsigned Parque...

2015-11-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9646 [SPARK-10113][SQL] Support for unsigned Parquet logical types Parquet supports some unsigned datatypes. However, since Spark does not support unsigned datatypes, it needs to emit an exception

[GitHub] spark pull request: [SPARK-11661] [SQL] Still pushdown filters ret...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9634#discussion_r44612005 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -336,4 +336,29 @@ class

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155973698 @liancheng I made several attempts to figure out the version but, as you said, it is pretty tricky to check the writer version as it only changes the version of data

[GitHub] spark pull request: [SPARK-11661] [SQL] Still pushdown filters ret...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9634#discussion_r44607471 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -336,4 +336,29 @@ class

[GitHub] spark pull request: [SPARK-10113][SQL] Support for unsigned Parque...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9646#issuecomment-155986820 cc @liancheng

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156046292 retest this please

[GitHub] spark pull request: [SPARK-11687] Mixed usage of fold and foldLeft...

2015-11-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9655 [SPARK-11687] Mixed usage of fold and foldLeft, reduce and reduceLeft and reduceOption and reduceLeftOption https://issues.apache.org/jira/browse/SPARK-11687 As can be seen here https

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9658 [SPARK-11692][SQL] Support for Parquet logical types, JSON and BSON (embedded types) Parquet supports some JSON and BSON datatypes. They are represented as binary for BSON and string (UTF-8

[GitHub] spark pull request: [SPARK-11676][SQL] Parquet filter tests all pa...

2015-11-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9659 [SPARK-11676][SQL] Parquet filter tests all pass if filters are not really pushed down Currently Parquet predicate tests all pass even if filters are not pushed down or this is disabled

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156052665 cc @liancheng

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9660#discussion_r44737977 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -91,6 +91,33 @@ class ParquetIOSuite

[GitHub] spark pull request: [SPARK-11687] Mixed usage of fold and foldLeft...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9655#issuecomment-156262080 I think I should have opened a thread in the mailing list. Sorry, closing this.

[GitHub] spark pull request: [SPARK-11687] Mixed usage of fold and foldLeft...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/9655

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9660 [SPARK-11694][SQL] Parquet logical types are not being tested properly All the physical types are properly tested at `ParquetIOSuite` but logical type mapping is not being tested. You can

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156059610 cc @liancheng

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9660#discussion_r44650410 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -91,6 +91,32 @@ class ParquetIOSuite

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156076494 Thank you very much. I will try it that way.

[GitHub] spark pull request: [SPARK-11687] Mixed usage of fold and foldLeft...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9655#issuecomment-156087424 I agree with you. However, if you look at the code, they are used in a mixed way, even in the same class. Mostly the usages are `_ and _` or ` + ` but `xxx

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156306727 Fortunately, I worked around parquet tools once and looked through the Parquet code several times :). Thank you very much for your help. This could be done

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156321461 Also added an import `Collections` at `ParquetIOSuite`. Does this also compile the old version of all commits in this PR?

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156355260 All the builds pass all the tests at `ParquetIOSuite` and I do not think it affects other modules such as ML. I will retest this.

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156355277 retest this please

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156320809 retest this please

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-11-12 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9687 [SPARK-11677][SQL] ORC filter tests all pass if filters are actually not pushed down. Currently ORC filters are not tested properly. All the tests pass even if the filters are not pushed down

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156099372 Thanks! I will follow the way.

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156416602 Finally, all the tests pass! Please review the code!

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156371302 restart this please

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156296565 hm. I'm not using `FileMetaData` at line 234 in `ParquetIOSuite`.

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156309901 I added an import `FileMetaData` at ParquetIOSuite. This is weird, I am using Scala 2.10 and it compiles okay on my local computer.

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156296578 retest this please

[GitHub] spark pull request: [SPARK-11694][SQL] Parquet logical types are n...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9660#issuecomment-156313536 Also added an import `ParquetMetadata` at `ParquetIOSuite`

[GitHub] spark pull request: [SPARK-11694][FOLLOW-UP] Clean up imports, use...

2015-11-16 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9754 [SPARK-11694][FOLLOW-UP] Clean up imports, use a common function for metadata and add a test for FIXED_LEN_BYTE_ARRAY As discussed https://github.com/apache/spark/pull/9660 https://github.com

[GitHub] spark pull request: [SPARK-11694][FOLLOW-UP] Clean up imports, use...

2015-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9754#issuecomment-157276165 I will cc you @liancheng just so that you can easily find :).

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156879272 I accidentally saw `TODO Adds test case for reading dictionary encoded decimals written as 'FIXED_LEN_BYTE_ARRAY'`. I will also add this test in the following

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-156878712 Thanks! I changed this.

[GitHub] spark pull request: [SPARK-11661] [SQL] Still pushdown filters ret...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9634#discussion_r44598625 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -336,4 +336,29 @@ class

[GitHub] spark pull request: [SPARK-11661] [SQL] Still pushdown filters ret...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9634#discussion_r44600842 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -336,4 +336,29 @@ class

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-157373552 Oh. I just got confused for a bit. Sorry.

[GitHub] spark pull request: [SPARK-11692][SQL] Support for Parquet logical...

2015-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9658#issuecomment-157341844 Please note that https://github.com/apache/spark/pull/9754 unintentionally updated this to clean up at the master branch; however, that is supposed to be merged

[GitHub] spark pull request: [SPARK-11692] [SPARK-11694] [SQL] Backports #9...

2015-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9763#issuecomment-157335573 I mistakenly added SPARK-11692 here as I thought this was supposed to go to branch 1.6.0, but this is classified for version 1.7. I will take this out

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9517#issuecomment-155020044 In this commit, I added partitioned tables for the test and sorted the `FileStatus`es. There are several things to mention here. Firstly, now we do

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-10-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9327#issuecomment-152424928 @liancheng oh, right. I just added at `ParquetFilterSuite`

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-11-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9327#discussion_r43724135 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -314,4 +314,24 @@ class

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-10-30 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9327#discussion_r43488432 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -314,4 +314,24 @@ class

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-11-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9327#discussion_r43600244 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -314,4 +314,24 @@ class

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-154597634 @liancheng I assume you missed this.

[GitHub] spark pull request: [SPARK-11103][SQL] Filter applied on Merged Pa...

2015-11-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9327#discussion_r43846677 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -314,4 +314,24 @@ class

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-05 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9517 [SPARK-11500][SQL] Not deterministic order of columns when using merging schemas. https://issues.apache.org/jira/browse/SPARK-11500 As filed in SPARK-11500, if merging schemas

[GitHub] spark pull request: [SPARK-11500][SQL] Not deterministic order of ...

2015-11-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9517#issuecomment-154338571 cc @liancheng

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-146380223 @liancheng Would this casting check be unsafe? I came across Parquet downcasting check with the actual value ```java public static int checkedCast(long

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r41705069 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala --- @@ -431,6 +431,7 @@ private[parquet

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9060 [SPARK-11044][SQL] Parquet writer version fixed as version1 https://issues.apache.org/jira/browse/SPARK-11044 Spark only writes the parquet file with writer version1 ignoring the given

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-10-13 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/8718

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9687#issuecomment-157561940 Several things to mention. Firstly, I wonder if it is okay to put `extractSourceRDDToDataFrame` at `QueryTest`. I did not put but I think

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-11-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9687#issuecomment-157564511 cc @liancheng

[GitHub] spark pull request: [SPARK-11694][FOLLOW-UP] Clean up imports, use...

2015-11-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9754#issuecomment-157300557 yes I will.

[GitHub] spark pull request: [SPARK-11692] [SPARK-11694] [SQL] Backports #9...

2015-11-17 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9763 [SPARK-11692] [SPARK-11694] [SQL] Backports #9658 and #9754 The main purpose of this PR is to backport https://github.com/apache/spark/pull/9754 and https://github.com/apache/spark/pull/9658

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r37833992 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,7 @@ private[sql] class JDBCRDD

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r37830657 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,7 @@ private[sql] class JDBCRDD

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8391#issuecomment-134564538 Hm.. one thing I want to say is, it looks like there is no test code for JdbcRelation. So I tested this with separate copied functions. It looks just about text

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r37831947 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,7 @@ private[sql] class JDBCRDD

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r39345066 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,10 @@ private[sql] class JDBCRDD

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r39345352 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,10 @@ private[sql] class JDBCRDD

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBC datasource are not pr...

2015-09-14 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8743 [SPARK-10180] [SQL] JDBC datasource are not processing EqualNullSafe filter https://github.com/apache/spark/pull/8391 @rxin I apologize that I removed the forked repo by mistake

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-140693582 It is OK for JDBC but for Parquet and ORC, it looks like the conversion from `StringType` to `NumericType` is not safe. When the field type is `StringType
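One way a cast between string and numeric types can change the result of a pushed-down comparison is that the two types order values differently. The Java sketch below is only an assumption about the kind of mismatch at issue (the comment above is truncated), and the class and method names are hypothetical.

```java
// Hypothetical illustration; not Spark, Parquet, or ORC code.
final class StringVersusNumericOrderSketch {

    // Compare two values after parsing them as integers.
    static boolean lessThanAsNumbers(String left, String right) {
        return Integer.parseInt(left) < Integer.parseInt(right);
    }

    // Compare the same two values as plain strings (lexicographic order).
    static boolean lessThanAsStrings(String left, String right) {
        return left.compareTo(right) < 0;
    }

    public static void main(String[] args) {
        System.out.println(lessThanAsNumbers("9", "10")); // true
        System.out.println(lessThanAsStrings("9", "10")); // false
    }
}
```

A filter evaluated against string column statistics could therefore keep or skip different rows than the same filter evaluated after a numeric cast.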

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-140569493 It looks OK for the JDBC datasource, though. I am wondering if conversion from `NumericType` to `StringType` should be prevented or treated for each differently as it looks

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/8718 [SPARK-9182][SQL] Cast filters are not passed through to datasources As mentioned in https://issues.apache.org/jira/browse/SPARK-9182, some cast filters are not passed to datasources

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-140368436 I think in that case it does not create any condition, as `double -> int -> double` means the field type is int and the given value type is double. Losing pre
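
The `double -> int -> double` round trip mentioned in this comment can be illustrated with plain Scala numeric conversions. This is only a sketch of why such a round trip is lossy in general, assuming ordinary JVM truncation semantics; it is not the cast logic from the PR, and `CastRoundTrip`, `roundTrip` and `isLossless` are hypothetical names.

```scala
// Illustrative only: uses plain Scala conversions, not Spark's Cast expression.
object CastRoundTrip {
  // Narrow a double to an int (truncating toward zero), then widen it back.
  def roundTrip(value: Double): Double = value.toInt.toDouble

  // The round trip is lossless only if the original value survives unchanged.
  def isLossless(value: Double): Boolean = roundTrip(value) == value
}
```

For example, `CastRoundTrip.roundTrip(3.7)` gives `3.0`, so a comparison against the literal `3.7` cannot simply be rewritten as a comparison against the truncated value.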

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBC datasource are not pr...

2015-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8743#issuecomment-140919876 Should I write test code for this?

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBCRDD does not process E...

2015-09-14 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/8391

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-141679720 Yes I will do so.

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBC datasource are not pr...

2015-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8743#issuecomment-141679756 Should I write some tests for this?

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-141960719 It looks like the original case became a downcast from `Decimal(10, 0)` to `Decimal(7, 2)`, which seems when scale and precision of the latter are less than the former, rather

[GitHub] spark pull request: [SPARK-9182][SQL] Cast filters are not passed ...

2015-09-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8718#issuecomment-143179571 This commit only deals with numbers. I removed the round trip in the cast, and it only supports comparisons among other numeric types except `Decimal` and between

[GitHub] spark pull request: [SPARK-11676][SQL] Parquet filter tests all pa...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9659#discussion_r47059656 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -50,27 +50,33 @@ class

[GitHub] spark pull request: [SPARK-12236][SQL] JDBC filter tests all pass ...

2015-12-08 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/10221 [SPARK-12236][SQL] JDBC filter tests all pass if filters are not really pushed down https://issues.apache.org/jira/browse/SPARK-12236 Currently JDBC filters are not tested properly. All

[GitHub] spark pull request: [SPARK-10180] [SQL] JDBC datasource are not pr...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/8743#issuecomment-163142135 Just a question. It looks like the PR for the end-to-end Docker tests is now merged. Do you think it needs all the tests for all the databases (namely

[GitHub] spark pull request: [SPARK-11676][SQL] Parquet filter tests all pa...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9659#issuecomment-163088089 In this commit, I resolved conflicts, renamed the function extractSourceRDDToDataFrame to stripSparkFilter and set false

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9687#issuecomment-163081502 I will add that function to `SharedSQLContext` if it is okay.

[GitHub] spark pull request: [SPARK-12227][SQL] Support drop multiple colum...

2015-12-08 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/10218 [SPARK-12227][SQL] Support drop multiple columns specified by Column class in DataFrame API https://issues.apache.org/jira/browse/SPARK-12227 In this PR, I added the support to drop

[GitHub] spark pull request: [SPARK-11676][SQL] Parquet filter tests all pa...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9659#issuecomment-163102930 retest this please

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9687#issuecomment-163089723 In this commit, I renamed the function extractSourceRDDToDataFrame to stripSparkFilter.

[GitHub] spark pull request: [SPARK-11677][SQL] ORC filter tests all pass i...

2015-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9687#issuecomment-163104687 Can I add the function `stripSparkFilter` to `SQLTestUtils` for `ParquetFilterSuite` and `OrcQuerySuite` (and possibly `OrcFilterSuite` I will make)?

[GitHub] spark pull request: [SPARK-12249][SQL] JDBC not-equality compariso...

2015-12-09 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/10233 [SPARK-12249][SQL] JDBC not-equality comparison operator not pushed down. https://issues.apache.org/jira/browse/SPARK-12249 Currently the `!=` operator is not pushed down correctly. I
