spark git commit: [SPARK-9950] [SQL] Wrong Analysis Error for grouping/aggregating on struct fields

2015-08-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f68d02409 -> a4acdabb1 [SPARK-9950] [SQL] Wrong Analysis Error for grouping/aggregating on struct fields This issue has been fixed by https://github.com/apache/spark/pull/8215; this PR adds a regression test for it. Author: Wenchen Fan

spark git commit: [SPARK-9955] [SQL] correct error message for aggregate

2015-08-15 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 1a6f0af9f -> 2fda1d842 [SPARK-9955] [SQL] correct error message for aggregate We should skip unresolved `LogicalPlan`s for `PullOutNondeterministic`, as calling `output` on an unresolved `LogicalPlan` will produce a confusing error message

spark git commit: [SPARK-9955] [SQL] correct error message for aggregate

2015-08-15 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a85fb6c07 -> 570567258 [SPARK-9955] [SQL] correct error message for aggregate We should skip unresolved `LogicalPlan`s for `PullOutNondeterministic`, as calling `output` on an unresolved `LogicalPlan` will produce a confusing error message. A

spark git commit: [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 0f4ccdc4c -> 5bbb2d327 [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark This bug is caused by a wrong column-exist-check in `__getitem__` of pyspark dataframe. `DataFrame.apply` accepts not only top level column names,

spark git commit: [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2a6590e51 -> 1150a19b1 [SPARK-8670] [SQL] Nested columns can't be referenced in pyspark This bug is caused by a wrong column-exist-check in `__getitem__` of pyspark dataframe. `DataFrame.apply` accepts not only top level column names, but
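
The over-strict existence check described in this commit can be illustrated with a toy class. This is only a sketch: `ToyFrame`, its dict-based schema, and both lookup methods are hypothetical stand-ins and are not PySpark's implementation. It shows why a check against top-level column names alone rejects a dotted reference to a nested field.

```python
class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not the PySpark class."""

    def __init__(self, schema):
        # schema maps a column name to None (leaf) or to a nested dict (struct).
        self.schema = schema

    @property
    def columns(self):
        # Only top-level names, which is all a naive existence check sees.
        return list(self.schema)

    def strict_getitem(self, name):
        # Over-strict check: anything that is not a top-level name is rejected.
        if name not in self.columns:
            raise KeyError(name)
        return name

    def __getitem__(self, name):
        # Walk the dotted path through nested structs instead.
        node = self.schema
        for part in name.split("."):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(name)
            node = node[part]
        return name


frame = ToyFrame({"id": None, "address": {"city": None}})
```

With this toy schema, `frame["address.city"]` resolves, while `frame.strict_getitem("address.city")` raises `KeyError`.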

spark git commit: [SPARK-9561] Re-enable BroadcastJoinSuite

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 e2a288cc3 -> b2842138c [SPARK-9561] Re-enable BroadcastJoinSuite We can do this now that SPARK-9580 is resolved. Author: Andrew Or Closes #8208 from andrewor14/reenable-sql-tests. (cherry picked from commit ece00566e4d5f38585f2810be

spark git commit: [SPARK-9561] Re-enable BroadcastJoinSuite

2015-08-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3bc552872 -> ece00566e [SPARK-9561] Re-enable BroadcastJoinSuite We can do this now that SPARK-9580 is resolved. Author: Andrew Or Closes #8208 from andrewor14/reenable-sql-tests. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-9449] [SQL] Include MetastoreRelation's inputFiles

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 ed73f5439 -> 3298fb69f [SPARK-9449] [SQL] Include MetastoreRelation's inputFiles Author: Michael Armbrust Closes #8119 from marmbrus/metastoreInputFiles. (cherry picked from commit 660e6dcff8125b83cc73dbe00c90cbe58744bc66

spark git commit: [SPARK-9449] [SQL] Include MetastoreRelation's inputFiles

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fc1c7fd66 -> 660e6dcff [SPARK-9449] [SQL] Include MetastoreRelation's inputFiles Author: Michael Armbrust Closes #8119 from marmbrus/metastoreInputFiles. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http:/

spark git commit: [SPARK-9894] [SQL] Json writer should handle MapData.

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 74c9dcec3 -> 08f767a1e [SPARK-9894] [SQL] Json writer should handle MapData. https://issues.apache.org/jira/browse/SPARK-9894 Author: Yin Huai Closes #8137 from yhuai/jsonMapData. (cherry picked from commit 7035d880a0cf06910c19b4afd

spark git commit: [SPARK-9894] [SQL] Json writer should handle MapData.

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ab7e721cf -> 7035d880a [SPARK-9894] [SQL] Json writer should handle MapData. https://issues.apache.org/jira/browse/SPARK-9894 Author: Yin Huai Closes #8137 from yhuai/jsonMapData. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-9726] [PYTHON] PySpark DF join no longer accepts on=None

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 b515f890d -> 8629c33b6 [SPARK-9726] [PYTHON] PySpark DF join no longer accepts on=None rxin First pull request for Spark so let me know if I am missing anything The contribution is my original work and I license the work to the project

spark git commit: [SPARK-9726] [PYTHON] PySpark DF join no longer accepts on=None

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 70fe55886 -> 60103ecd3 [SPARK-9726] [PYTHON] PySpark DF join no longer accepts on=None rxin First pull request for Spark so let me know if I am missing anything The contribution is my original work and I license the work to the project un

spark git commit: [SPARK-9804] [HIVE] Use correct value for isSrcLocal parameter.

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 4c6b1296d -> e9641f192 [SPARK-9804] [HIVE] Use correct value for isSrcLocal parameter. If the correct parameter is not provided, Hive will run into an error because it calls methods that are specific to the local filesystem to copy the

spark git commit: [SPARK-9804] [HIVE] Use correct value for isSrcLocal parameter.

2015-08-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e0110792e -> 57ec27dd7 [SPARK-9804] [HIVE] Use correct value for isSrcLocal parameter. If the correct parameter is not provided, Hive will run into an error because it calls methods that are specific to the local filesystem to copy the data

spark git commit: [SPARK-8890] [SQL] Fallback on sorting when writing many dynamic partitions

2015-08-07 Thread marmbrus
be overridden by internal datasources to avoid the conversion. This change removes a lot of code duplication and per-row `asInstanceOf` checks. - `commands.scala` has been split up. Author: Michael Armbrust Closes #8010 from marmbrus/fsWriting and squashes the following commits: 00804fe [Mich

spark git commit: [SPARK-8890] [SQL] Fallback on sorting when writing many dynamic partitions

2015-08-07 Thread marmbrus
hod can be overridden by internal datasources to avoid the conversion. This change removes a lot of code duplication and per-row `asInstanceOf` checks. - `commands.scala` has been split up. Author: Michael Armbrust Closes #8010 from marmbrus/fsWriting and squashes the following commits: 0080

spark git commit: [SPARK-8382] [SQL] Improve Analysis Unit test framework

2015-08-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 76eaa7018 -> 2432c2e23 [SPARK-8382] [SQL] Improve Analysis Unit test framework Author: Wenchen Fan Closes #8025 from cloud-fan/analysis and squashes the following commits: 51461b1 [Wenchen Fan] move test file to test folder ec88ace [Wenc

spark git commit: [SPARK-8382] [SQL] Improve Analysis Unit test framework

2015-08-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 6c2f30c10 -> ff0abca2b [SPARK-8382] [SQL] Improve Analysis Unit test framework Author: Wenchen Fan Closes #8025 from cloud-fan/analysis and squashes the following commits: 51461b1 [Wenchen Fan] move test file to test folder ec88ace [

spark git commit: [SPARK-9211] [SQL] [TEST] normalize line separators before generating MD5 hash

2015-08-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 ee43d355b -> 990b4bf7c [SPARK-9211] [SQL] [TEST] normalize line separators before generating MD5 hash The golden answer file names for the existing Hive comparison tests were generated using a MD5 hash of the query text which uses Unix

spark git commit: [SPARK-9211] [SQL] [TEST] normalize line separators before generating MD5 hash

2015-08-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 54c0789a0 -> abfedb9cd [SPARK-9211] [SQL] [TEST] normalize line separators before generating MD5 hash The golden answer file names for the existing Hive comparison tests were generated using a MD5 hash of the query text which uses Unix-sty
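
The idea in this commit's title, normalizing line separators before hashing, can be sketched as follows. This is a minimal illustration and not the code from the patch: the function name `golden_file_key` and the sample queries are assumptions, and the sketch assumes the target form is Unix-style LF.

```python
import hashlib


def golden_file_key(query_text: str) -> str:
    """Hash query text after normalizing line separators to LF."""
    # Convert Windows (CRLF) and old Mac (CR) separators to Unix-style LF,
    # so the same query hashes identically on every platform.
    normalized = query_text.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


unix_query = "SELECT key\nFROM src"
windows_query = "SELECT key\r\nFROM src"
```

Without the normalization step the two sample queries would hash to different values, so a file name derived from the hash would depend on the platform where the query text was checked out.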

spark git commit: [SPARK-7119] [SQL] Give script a default serde with the user specific types

2015-08-04 Thread marmbrus
ng.Thread.run(Thread.java:722) chenghao-intel marmbrus Author: zhichao.li Closes #6638 from zhichao-li/transDataType2 and squashes the following commits: a36cc7c [zhichao.li] style b9252a8 [zhichao.li] delete cacheRow f6968a4 [zhichao.li] give script a default serde (cherry picke

spark git commit: [SPARK-7119] [SQL] Give script a default serde with the user specific types

2015-08-04 Thread marmbrus
ng.Thread.run(Thread.java:722) chenghao-intel marmbrus Author: zhichao.li Closes #6638 from zhichao-li/transDataType2 and squashes the following commits: a36cc7c [zhichao.li] style b9252a8 [zhichao.li] delete cacheRow f6968a4 [zhichao.li] give script a default serde Project: http://git-wip-us.apa

spark git commit: [SPARK-9606] [SQL] Ignore flaky thrift server tests

2015-08-04 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 c5250ddc5 -> be37b1bd3 [SPARK-9606] [SQL] Ignore flaky thrift server tests Author: Michael Armbrust Closes #7939 from marmbrus/turnOffThriftTests and squashes the following commits: 80d618e [Michael Armbrust] [SPARK-9606][

spark git commit: [SPARK-9606] [SQL] Ignore flaky thrift server tests

2015-08-04 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5a23213c1 -> a0cc01759 [SPARK-9606] [SQL] Ignore flaky thrift server tests Author: Michael Armbrust Closes #7939 from marmbrus/turnOffThriftTests and squashes the following commits: 80d618e [Michael Armbrust] [SPARK-9606][SQL] Ign

spark git commit: [SPARK-9512][SQL] Revert SPARK-9251, Allow evaluation while sorting

2015-08-04 Thread marmbrus
ael Armbrust Closes #7906 from marmbrus/revertSortProjection and squashes the following commits: 2da6972 [Michael Armbrust] unrevert unrelated changes 4f2b00c [Michael Armbrust] Revert "[SPARK-9251][SQL] do not order by expressions which still need evaluation" (cherry picke

spark git commit: [SPARK-9512][SQL] Revert SPARK-9251, Allow evaluation while sorting

2015-08-04 Thread marmbrus
ael Armbrust Closes #7906 from marmbrus/revertSortProjection and squashes the following commits: 2da6972 [Michael Armbrust] unrevert unrelated changes 4f2b00c [Michael Armbrust] Revert "[SPARK-9251][SQL] do not order by expressions which still need evaluation" Project: http://git-wip-

[2/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/a2409d1c/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala -- diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala b/sql/hive/src/mai

[2/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/6bd12e81/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala -- diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala b/sql/hive/src/mai

[3/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
[SPARK-8064] [SQL] Build against Hive 1.2.1 Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork. Tests not run yet: that's what the machines are for Author: Steve

[3/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
[SPARK-8064] [SQL] Build against Hive 1.2.1 Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork. Tests not run yet: that's what the machines are for Author: Steve

[1/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 db5832708 -> 6bd12e819 http://git-wip-us.apache.org/repos/asf/spark/blob/6bd12e81/sql/hive/src/test/resources/golden/parenthesis_star_by-5-6888c7f7894910538d82eefa23443189 -

[1/3] spark git commit: [SPARK-8064] [SQL] Build against Hive 1.2.1

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b2e4b85d2 -> a2409d1c8 http://git-wip-us.apache.org/repos/asf/spark/blob/a2409d1c/sql/hive/src/test/resources/golden/parenthesis_star_by-5-6888c7f7894910538d82eefa23443189 -

spark git commit: [SPARK-9511] [SQL] Fixed Table Name Parsing

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.5 b41a32718 -> 4de833e9e [SPARK-9511] [SQL] Fixed Table Name Parsing The issue was that the tokenizer was parsing "1one" into the numeric 1 using the code on line 110. I added another case to accept strings that start with a number and

spark git commit: [SPARK-9511] [SQL] Fixed Table Name Parsing

2015-08-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b41a32718 -> dfe7bd168 [SPARK-9511] [SQL] Fixed Table Name Parsing The issue was that the tokenizer was parsing "1one" into the numeric 1 using the code on line 110. I added another case to accept strings that start with a number and then
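
The tokenizer problem described here can be reproduced in a small sketch. The exact rule added by the patch is truncated in this archive, so the patterns below are assumptions chosen for illustration; `TOKEN_PATTERNS` and `next_token` are hypothetical names, not Spark's lexer.

```python
import re

# Order matters: the digit-leading identifier pattern is tried before the
# plain numeric one, so "1one" is not cut short into the number 1.
TOKEN_PATTERNS = [
    ("IDENT", re.compile(r"[0-9]+[A-Za-z_][A-Za-z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
]


def next_token(text, pos=0):
    """Return (kind, lexeme) for the token starting at pos."""
    for kind, pattern in TOKEN_PATTERNS:
        match = pattern.match(text, pos)
        if match:
            return kind, match.group(0)
    raise ValueError(f"no token at position {pos} in {text!r}")
```

If the `NUMBER` pattern were listed first, `next_token("1one")` would return the number `1` and leave `one` behind, which is the behavior the commit message describes.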

spark git commit: [SPARK-9397] DataFrame should provide an API to find source data files if applicable

2015-07-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9bbe0171c -> 35ef853b3 [SPARK-9397] DataFrame should provide an API to find source data files if applicable Certain applications would benefit from being able to inspect DataFrames that are straightforwardly produced by data sources that

spark git commit: [SPARK-8828] [SQL] Revert SPARK-5680

2015-07-27 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3bc7055e2 -> 63a492b93 [SPARK-8828] [SQL] Revert SPARK-5680 JIRA: https://issues.apache.org/jira/browse/SPARK-8828 Author: Yijie Shen Closes #7667 from yjshen/revert_combinesum_2 and squashes the following commits: c37ccb1 [Yijie Shen]

spark git commit: [SPARK-9351] [SQL] remove literals from grouping expressions in Aggregate

2015-07-27 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1f7b3d9dc -> dd9ae7945 [SPARK-9351] [SQL] remove literals from grouping expressions in Aggregate Literals in grouping expressions have no effect at all and only make our grouping key bigger, so we should remove them in the Optimizer. I also make
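
Why a literal in the grouping key is redundant can be shown with a toy model. Everything below is a hypothetical sketch (the `Column` and `Literal` classes and the helper functions are not Catalyst classes): a literal contributes the same value to every row's key, so dropping it leaves the groups unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


def prune_grouping(grouping):
    """Drop literal expressions; they are constant across all rows."""
    return [e for e in grouping if not isinstance(e, Literal)]


def group_rows(rows, grouping):
    """Group rows (dicts) by the tuple of evaluated grouping expressions."""
    groups = {}
    for row in rows:
        key = tuple(
            row[e.name] if isinstance(e, Column) else e.value for e in grouping
        )
        groups.setdefault(key, []).append(row)
    return list(groups.values())


rows = [{"a": 1}, {"a": 2}, {"a": 1}]
grouping = [Column("a"), Literal(1)]
```

Grouping `rows` by `grouping` and by `prune_grouping(grouping)` yields the same two groups; only the key tuples are shorter in the pruned case.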

spark git commit: [SPARK-9292] Analysis should check that join conditions' data types are BooleanType

2015-07-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c8d71a418 -> c2b50d693 [SPARK-9292] Analysis should check that join conditions' data types are BooleanType This patch adds an analysis check to ensure that join conditions' data types are BooleanType. This check is necessary in order to r

spark git commit: [SPARK-9165] [SQL] codegen for CreateArray, CreateStruct and CreateNamedStruct

2015-07-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 76520955f -> 86f80e2b4 [SPARK-9165] [SQL] codegen for CreateArray, CreateStruct and CreateNamedStruct JIRA: https://issues.apache.org/jira/browse/SPARK-9165 Author: Yijie Shen Closes #7537 from yjshen/array_struct_codegen and squashes th

spark git commit: [SPARK-9154] [SQL] codegen StringFormat

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c07838b5a -> d4c7a7a36 [SPARK-9154] [SQL] codegen StringFormat Jira: https://issues.apache.org/jira/browse/SPARK-9154 fixes bug of #7546 marmbrus I can't reopen the other PR, because I didn't close it. Can you trigger Jen

spark git commit: [SPARK-9206] [SQL] Fix HiveContext classloading for GCS connector.

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 60c0ce134 -> c07838b5a [SPARK-9206] [SQL] Fix HiveContext classloading for GCS connector. IsolatedClientLoader.isSharedClass includes all of com.google.\*, presumably for Guava, protobuf, and/or other shared Google libraries, but needs to c

spark git commit: Revert "[SPARK-9154] [SQL] codegen StringFormat"

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 89db3c0b6 -> 87d890cc1 Revert "[SPARK-9154] [SQL] codegen StringFormat" This reverts commit 7f072c3d5ec50c65d76bd9f28fac124fce96a89e. Revert #7546 Author: Michael Armbrust Closes #7570 from marmbrus/revert9154 and

spark git commit: [SPARK-9154] [SQL] codegen StringFormat

2015-07-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d45355ee2 -> 7f072c3d5 [SPARK-9154] [SQL] codegen StringFormat Jira: https://issues.apache.org/jira/browse/SPARK-9154 Author: Tarek Auel Closes #7546 from tarekauel/SPARK-9154 and squashes the following commits: a943d3e [Tarek Auel] [SP

spark git commit: [SPARK-9164] [SQL] codegen hex/unhex

2015-07-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e90543e53 -> 936a96cb3 [SPARK-9164] [SQL] codegen hex/unhex Jira: https://issues.apache.org/jira/browse/SPARK-9164 The diff looks heavy, but I just moved the `hex` and `unhex` methods to `object Hex`. This allows me to call them from `ev

spark git commit: [SPARK-8125] [SQL] Accelerates Parquet schema merging and partition discovery

2015-07-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master dac7dbf5a -> a1064df0e [SPARK-8125] [SQL] Accelerates Parquet schema merging and partition discovery This PR tries to accelerate Parquet schema discovery and `HadoopFsRelation` partition discovery. The acceleration is done by the followin

spark git commit: [SPARK-6910] [SQL] Support for pushing predicates down to metastore for partition pruning

2015-07-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9f913c4fd -> dde0e12f3 [SPARK-6910] [SQL] Support for pushing predicates down to metastore for partition pruning This PR forks PR #7421 authored by piaozhexiu and adds [a workaround] [1] for fixing the occasional test failures that occurred in

spark git commit: [SPARK-7026] [SQL] fix left semi join with equi key and non-equi condition

2015-07-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b13ef7723 -> 170723860 [SPARK-7026] [SQL] fix left semi join with equi key and non-equi condition When the `condition` extracted by `ExtractEquiJoinKeys` contains a join predicate for a left semi join, we cannot plan it as semiJoin. Such as

spark git commit: [SPARK-9117] [SQL] fix BooleanSimplification in case-insensitive

2015-07-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fd6b3101f -> bd903ee89 [SPARK-9117] [SQL] fix BooleanSimplification in case-insensitive Author: Wenchen Fan Closes #7452 from cloud-fan/boolean-simplify and squashes the following commits: 2a6e692 [Wenchen Fan] fix style d3cfd26 [Wenchen

spark git commit: [SPARK-9113] [SQL] enable analysis check code for self join

2015-07-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 15fc2ffe5 -> fd6b3101f [SPARK-9113] [SQL] enable analysis check code for self join The check was unreachable before, as `case operator: LogicalPlan` catches everything already. Author: Wenchen Fan Closes #7449 from cloud-fan/tmp and squ

spark git commit: [SPARK-9142] [SQL] Removing unnecessary self types in Catalyst.

2015-07-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 42d8a012f -> b2aa490bb [SPARK-9142] [SQL] Removing unnecessary self types in Catalyst. Just a small change to add Product type to the base expression/plan abstract classes, based on suggestions on #7434 and offline discussions. Author: Re

spark git commit: [SPARK-9027] [SQL] Generalize metastore predicate pushdown

2015-07-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 59d820aa8 -> 37f2d9635 [SPARK-9027] [SQL] Generalize metastore predicate pushdown Add support for pushing down metastore filters that are in different orders and add some unit tests. Author: Michael Armbrust Closes #7386 from marmb

spark git commit: [SPARK-9029] [SQL] shortcut CaseKeyWhen if key is null

2015-07-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 257236c3e -> 59d820aa8 [SPARK-9029] [SQL] shortcut CaseKeyWhen if key is null Author: Wenchen Fan Closes #7389 from cloud-fan/case-when and squashes the following commits: ea4b6ba [Wenchen Fan] shortcut for case key when Project: http:

spark git commit: [SPARK-6910] [SQL] Support for pushing predicates down to metastore for partition pruning

2015-07-13 Thread marmbrus
table w/ all the partitions into `ParquetRelation` because then `ParquetRelation` can be cached and reused for any query against that table. Please correct me if I am wrong. cc marmbrus Author: Cheolsoo Park Closes #7216 from piaozhexiu/SPARK-6910-2 and squashes the following commits: aa1490f [

spark git commit: [SPARK-8636] [SQL] Fix equalNullSafe comparison

2015-07-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 714fc55f4 -> 4c797f2b0 [SPARK-8636] [SQL] Fix equalNullSafe comparison Author: Vinod K C Closes #7040 from vinodkc/fix_CaseKeyWhen_equalNullSafe and squashes the following commits: be5e641 [Vinod K C] Renamed equalNullSafe to threeValue

spark git commit: [SPARK-8783] [SQL] CTAS with WITH clause does not work

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2b40365d7 -> f03154378 [SPARK-8783] [SQL] CTAS with WITH clause does not work Currently, CTESubstitution only handles the case that WITH is on the top of the plan. I think it SHOULD handle the case that WITH is child of CTAS. This patch si

spark git commit: [SPARK-5707] [SQL] fix serialization of generated projection

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3e831a269 -> 74335b310 [SPARK-5707] [SQL] fix serialization of generated projection Author: Davies Liu Closes #7272 from davies/fix_projection and squashes the following commits: 075ef76 [Davies Liu] fix codegen with BroadcastHashJion

spark git commit: [SPARK-6912] [SQL] Throw an AnalysisException when unsupported Java Map types used in Hive UDF

2015-07-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 6722aca80 -> 3e831a269 [SPARK-6912] [SQL] Throw an AnalysisException when unsupported Java Map types used in Hive UDF To help UDF developers understand the problem, throw an exception when unsupported Map types are used in a Hive UDF. This fix is the same

spark git commit: [SPARK-8794] [SQL] Make PrunedScan work for Sample

2015-07-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3bf20c27f -> da56c4e72 [SPARK-8794] [SQL] Make PrunedScan work for Sample JIRA: https://issues.apache.org/jira/browse/SPARK-8794 Currently `PrunedScan` works only when followed by project or filter operations. However, even if there is a

spark git commit: [SPARK-6747] [SQL] Throw an AnalysisException when unsupported Java list types used in Hive UDF

2015-07-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 929dfa24b -> 1821fc165 [SPARK-6747] [SQL] Throw an AnalysisException when unsupported Java list types used in Hive UDF The current implementation can't handle List<> as a return type in a Hive UDF and throws a meaningless Match Error. We assum

spark git commit: [SPARK-8072] [SQL] Better AnalysisException for writing DataFrame with identically named columns

2015-07-06 Thread marmbrus
ame schema. Function called before storing the dataframe to an external storage. Function added in the corresponding datasource API. cc rxin marmbrus Author: animesh This patch had conflicts when merged, resolved by Committer: Michael Armbrust Closes #7013 from animeshbaranawal/8072 and squas

spark git commit: [SPARK-8588] [SQL] Regression test

2015-07-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 0effe180f -> 7b467cc93 [SPARK-8588] [SQL] Regression test This PR adds regression test for https://issues.apache.org/jira/browse/SPARK-8588 (fixed by https://github.com/apache/spark/commit/457d07eaa023b44b75344110508f629925eb6247). Autho

spark git commit: [SPARK-4485] [SQL] 1) Add broadcast hash outer join, (2) Fix SparkPlanTest

2015-07-06 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 37e4d9214 -> 2471c0bf7 [SPARK-4485] [SQL] 1) Add broadcast hash outer join, (2) Fix SparkPlanTest This pull request (1) extracts common functions used by hash outer joins and puts them in the interface HashOuterJoin (2) adds ShuffledHashOuterJoin

spark git commit: [SPARK-8407] [SQL] complex type constructors: struct and named_struct

2015-07-02 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master afa021e03 -> 52302a803 [SPARK-8407] [SQL] complex type constructors: struct and named_struct This is a follow up of [SPARK-8283](https://issues.apache.org/jira/browse/SPARK-8283) ([PR-6828](https://github.com/apache/spark/pull/6828)), to

spark git commit: [SQL] [MINOR] remove internalRowRDD in DataFrame

2015-07-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fc3a6fe67 -> 0eee06158 [SQL] [MINOR] remove internalRowRDD in DataFrame Developers are already familiar with `queryExecution.toRDD` as the internal row RDD, and we should not add a new concept. Author: Wenchen Fan Closes #7116 from cloud-fan

spark git commit: [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c1befd780 -> b8e5bb6fc [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse Made lexical initialization a lazy val Author: Vinod K C Closes #7015 from vinodkc/handle_lexical_initialize_schronization and squashes the follo
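
A Scala `lazy val` is initialized at most once, on first access, with synchronization. The same pattern can be sketched in Python to show why it removes an initialization race. `LazyValue` is a hypothetical helper written for this illustration and has nothing to do with the parser's actual code.

```python
import threading


class LazyValue:
    """Compute a value once, on first access, even under concurrent callers."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._initialized = False
        self._value = None

    def get(self):
        # Fast path: skip the lock once the value exists.
        if not self._initialized:
            with self._lock:
                # Re-check inside the lock so only one thread runs the factory.
                if not self._initialized:
                    self._value = self._factory()
                    self._initialized = True
        return self._value
```

Callers that race on the first `get()` all block on the lock, and every one of them receives the value produced by the single factory call.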

spark git commit: [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 f9cd5cc1b -> 80b0fe200 [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse Made lexical initialization a lazy val Author: Vinod K C Closes #7015 from vinodkc/handle_lexical_initialize_schronization and squashes the f

spark git commit: [SPARK-6785] [SQL] fix DateTimeUtils for dates before 1970

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d16a94437 -> 1e1f33997 [SPARK-6785] [SQL] fix DateTimeUtils for dates before 1970 Hi Michael, this Pull-Request is a follow-up to [PR-6242](https://github.com/apache/spark/pull/6242). I removed the two obsolete test cases from the HiveQue

spark git commit: [SPARK-8669] [SQL] Fix crash with BINARY (ENUM) fields with Parquet 1.7

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ecacb1e88 -> 4915e9e3b [SPARK-8669] [SQL] Fix crash with BINARY (ENUM) fields with Parquet 1.7 Patch to fix crash with BINARY fields with ENUM original types. Author: Steven She Closes #7048 from stevencanopy/SPARK-8669 and squashes the

spark git commit: [SPARK-8589] [SQL] cleanup DateTimeUtils

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4b497a724 -> 881662e9c [SPARK-8589] [SQL] cleanup DateTimeUtils move date time related operations into `DateTimeUtils` and rename some methods to make it more clear. Author: Wenchen Fan Closes #6980 from cloud-fan/datetime and squashes

[1/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c8ae887ef -> 931da5c8a http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala -- diff --gi

[3/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/

[2/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/core/src/main/scala/org/apache/spark/sql/execution/pythonUDFs.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/pythonUDFs.scala b/sql/core

[4/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
[SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf Follow-up of #6902 for being coherent between ```Udf``` and ```UDF``` Author: BenFradet Closes #6920 from BenFradet/SPARK-8478 and squashes the following commits: c500f29 [BenFradet] renamed a few variables in f

spark git commit: [SPARK-7862] [SQL] Disable the error message redirect to stderr

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 637b4eeda -> c6ba2ea34 [SPARK-7862] [SQL] Disable the error message redirect to stderr This is a follow up of #6404. The ScriptTransformation prints the error msg into stderr directly, which would probably be a disaster for the application log. Author: C

spark git commit: [SPARK-8066, SPARK-8067] [hive] Add support for Hive 1.0, 1.1 and 1.2.

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ed413bcc7 -> 3664ee25f [SPARK-8066, SPARK-8067] [hive] Add support for Hive 1.0, 1.1 and 1.2. Allow HiveContext to connect to metastores of those versions; some new shims had to be added to account for changing internal APIs. A new test wa

spark git commit: [SPARK-8075] [SQL] apply type check interface to more expressions

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7daa70292 -> b71d3254e [SPARK-8075] [SQL] apply type check interface to more expressions a follow up of https://github.com/apache/spark/pull/6405. Note: It's not a big change; many of the changes are because I swapped some code in `aggregates.sca

spark git commit: [SPARK-7289] handle project -> limit -> sort efficiently

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b84d4b4df -> f04b5672c [SPARK-7289] handle project -> limit -> sort efficiently make the `TakeOrdered` strategy and operator more general, such that it can optionally handle a projection when necessary Author: Wenchen Fan Closes #6780 f

spark git commit: [SPARK-7088] [SQL] Fix analysis for 3rd party logical plan.

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 43e66192f -> b84d4b4df [SPARK-7088] [SQL] Fix analysis for 3rd party logical plan. ResolveReferences analysis rule now does not throw when it cannot resolve references in a self-join. Author: Santiago M. Mola Closes #6853 from smola/SPA

spark git commit: [SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7b1450b66 -> 6f4cadf5e [SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row Also added more tests in LiteralExpressionSuite Author: Davies Liu Closes #6876 from davies/fix_hashcode and squashes the following commits: 429c

spark git commit: [SPARK-7235] [SQL] Refactor the grouping sets

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4f7fbefb8 -> 7b1450b66 [SPARK-7235] [SQL] Refactor the grouping sets The logical plan `Expand` takes the `output` as a constructor argument, which breaks the reference chain. We need to refactor the code, as well as the column pruning. Aut

spark git commit: [SPARK-8300] DataFrame hint for broadcast join.

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f0dcbe8a7 -> 6ceb16960 [SPARK-8300] DataFrame hint for broadcast join. Users can now do `left.join(broadcast(right), "joinKey")` to give the query planner a hint that the "right" DataFrame is small and should be broadcast. Author

spark git commit: [SPARK-7153] [SQL] support all integral type ordinal in GetArrayItem

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1dfb0f7b2 -> 860a49ef2 [SPARK-7153] [SQL] support all integral type ordinal in GetArrayItem first convert `ordinal` to `Number`, then convert to int type. Author: Wenchen Fan Closes #5706 from cloud-fan/7153 and squashes the following co

spark git commit: [SPARK-8356] [SQL] Reconcile callUDF and callUdf

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b1f3a489e -> 50d3242d6 [SPARK-8356] [SQL] Reconcile callUDF and callUdf Deprecates `callUdf` in favor of `callUDF`. Author: BenFradet Closes #6902 from BenFradet/SPARK-8356 and squashes the following commits: ef4e9d8 [BenFradet]

spark git commit: [SPARK-8104] [SQL] auto alias expressions in analyzer

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5d89d9f00 -> da7bbb943 [SPARK-8104] [SQL] auto alias expressions in analyzer Currently we auto-alias expressions in the parser. However, during the parsing phase we don't have enough information to choose the right alias. For example, Generator that ha

spark git commit: [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread (branch 1.4)

2015-06-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 4b2c793a2 -> 9ac839366 [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread (branch 1.4) This is for 1.4 branch (based on https://github.com/apache/spark/pull/6891). Author: Yin Huai

spark git commit: [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread

2015-06-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4be53d039 -> c5876e529 [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread https://issues.apache.org/jira/browse/SPARK-8368 Also, I add tests according to https://issues.apache.org/jira/bro

spark git commit: [SPARK-8446] [SQL] Add helper functions for testing SparkPlan physical operators

2015-06-18 Thread marmbrus
b [Josh Rosen] Provide implicits automatically a80f9b0 [Josh Rosen] Merge pull request #4 from marmbrus/pr/6885 d9ab1e4 [Michael Armbrust] Add simple resolver c60a44d [Josh Rosen] Manually bind references 996332a [Josh Rosen] Add types so that tests compile a46144a [Josh Rosen] WIP (cherry picke

spark git commit: [SPARK-8446] [SQL] Add helper functions for testing SparkPlan physical operators

2015-06-18 Thread marmbrus
osen] Provide implicits automatically a80f9b0 [Josh Rosen] Merge pull request #4 from marmbrus/pr/6885 d9ab1e4 [Michael Armbrust] Add simple resolver c60a44d [Josh Rosen] Manually bind references 996332a [Josh Rosen] Add types so that tests compile a46144a [Josh Rosen] WIP Project: http://git-wip-

spark git commit: [SPARK-8397] [SQL] Allow custom configuration for TestHive

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a06d9c8e7 -> d1069cba4 [SPARK-8397] [SQL] Allow custom configuration for TestHive We encourage people to use TestHive in unit tests, because it's impossible to create more than one HiveContext within one process. The current implementation

spark git commit: [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state.

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 5aedfa2ce -> 73cf5def0 [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state. https://issues.apache.org/jira/browse/SPARK-8306 I will try to add a test later. marmb

spark git commit: [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state.

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7f05b1fe6 -> 302556ff9 [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state. https://issues.apache.org/jira/browse/SPARK-8306 I will try to add a test later. marmbrus aaron

spark git commit: [SPARK-7067] [SQL] fix bug when use complex nested fields in ORDER BY

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a411a40de -> 7f05b1fe6 [SPARK-7067] [SQL] fix bug when use complex nested fields in ORDER BY This PR is an improvement on https://github.com/apache/spark/pull/5189. The resolution rule for ORDER BY is: first resolve based on what comes fro

spark git commit: [SPARK-8010] [SQL] Promote types to StringType as implicit conversion in non-binary expression of HiveTypeCoercion

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a46594435 -> 98ee3512b [SPARK-8010] [SQL] Promote types to StringType as implicit conversion in non-binary expression of HiveTypeCoercion 1. The query `select coalesce(null, 1, '1') from dual` causes an exception: java.lang.RuntimeExc

spark git commit: [SPARK-6782] add sbt-revolver plugin

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f005be027 -> a46594435 [SPARK-6782] add sbt-revolver plugin to make it easier to start & stop http servers in sbt https://issues.apache.org/jira/browse/SPARK-6782 Author: Imran Rashid Closes #5426 from squito/SPARK-6782 and squashes the

spark git commit: [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of children

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 50a0496a4 -> 0c1b2df04 [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of children For example, large IN clauses. Large IN clauses are parsed very slowly. For example, the SQL below (10K items in IN) takes 45-50s. s"""SELECT *

spark git commit: [SPARK-8156] [SQL] create table to specific database by 'use dbname'

2015-06-16 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ca998757e -> 0b8c8fdc1 [SPARK-8156] [SQL] create table to specific database by 'use dbname' When I test the following code: hiveContext.sql("""use testdb""") val df = (1 to 3).map(i => (i, s"val_$i", i * 2)).toDF("a", "b", "c") df.write .fo

spark git commit: [SPARK-6583] [SQL] Support aggregate functions in ORDER BY

2015-06-15 Thread marmbrus
ses #5290. Author: Yadong Qi Author: Michael Armbrust Closes #6816 from marmbrus/pr/5290 and squashes the following commits: 3226a97 [Michael Armbrust] consistent ordering eb8938d [Michael Armbrust] no vars c8b25c1 [Yadong Qi] move the test data. 7f9b736 [Yadong Qi] delete Substring case a1e8

spark git commit: [SPARK-8065] [SQL] Add support for Hive 0.14 metastores

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f3f2a4397 -> 4eb48ed1d [SPARK-8065] [SQL] Add support for Hive 0.14 metastores This change has two parts. The first one gets rid of "ReflectionMagic". That worked well for the differences between 0.12 and 0.13, but breaks in 0.14, since s

spark git commit: [SPARK-8362] [SQL] Add unit tests for +, -, *, /, %

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9073a426e -> 53c16b92a [SPARK-8362] [SQL] Add unit tests for +, -, *, /, % Added unit tests for all supported data types for: - Add - Subtract - Multiply - Divide - UnaryMinus - Remainder Fixed bugs caught by the unit tests. Author: Reyno
