spark git commit: [SPARK-22208][SQL] Improve percentile_approx by not rounding up targetError and starting from index 0

2017-10-11 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 76fb173dd -> 655f6f86f [SPARK-22208][SQL] Improve percentile_approx by not rounding up targetError and starting from index 0 ## What changes were proposed in this pull request? Currently percentile_approx never returns the first element w
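For context, a minimal sketch of the kind of query this change affects, assuming a spark-shell session where `spark` is the active SparkSession:
```scala
// Build a small table and ask for the 0th percentile. Per the summary above,
// not rounding up the target error and scanning from index 0 lets a low
// percentile reach the first (smallest) element.
val df = spark.range(1, 101).toDF("x")
df.createOrReplaceTempView("t")
spark.sql("SELECT percentile_approx(x, 0.0) FROM t").show()
```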

spark git commit: [SPARK-21751][SQL] CodeGenerator.splitExpressions counts code size more precisely

2017-10-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master bd4eb9ce5 -> 76fb173dd [SPARK-21751][SQL] CodeGenerator.splitExpressions counts code size more precisely ## What changes were proposed in this pull request? Current `CodeGenerator.splitExpressions` splits statements into methods if the tota

spark git commit: [SPARK-19558][SQL] Add config key to register QueryExecutionListeners automatically.

2017-10-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master bfc7e1fe1 -> bd4eb9ce5 [SPARK-19558][SQL] Add config key to register QueryExecutionListeners automatically. This change adds a new SQL config key that is equivalent to SparkContext's "spark.extraListeners", allowing users to register Query
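A hedged sketch of such a listener; the class name below is hypothetical, and the config key is assumed from the summary (the change adds a SQL analogue of `spark.extraListeners`):
```scala
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Hypothetical listener that just logs query completion. With this change it
// can be registered automatically at session creation via a SQL config
// (assumed here to be spark.sql.queryExecutionListeners) instead of calling
// spark.listenerManager.register(...) by hand.
class LoggingQueryListener extends QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
    println(s"$funcName succeeded in ${durationNs / 1e6} ms")
  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
    println(s"$funcName failed: ${exception.getMessage}")
}
// e.g. spark-submit --conf spark.sql.queryExecutionListeners=LoggingQueryListener ...
```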

spark git commit: rename the file.

2017-10-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 23af2d79a -> 633ffd816 rename the file. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/633ffd81 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/633ffd81 Diff:

spark git commit: [SPARK-22159][SQL][FOLLOW-UP] Make config names consistently end with "enabled".

2017-10-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master bebd2e1ce -> af8a34c78 [SPARK-22159][SQL][FOLLOW-UP] Make config names consistently end with "enabled". ## What changes were proposed in this pull request? This is a follow-up of #19384. In the previous pr, only definitions of the config

spark git commit: [SPARK-22222][CORE] Fix the ARRAY_MAX in BufferHolder and add a test

2017-10-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 71c2b81aa -> bebd2e1ce [SPARK-22222][CORE] Fix the ARRAY_MAX in BufferHolder and add a test ## What changes were proposed in this pull request? We should not break the assumption that the length of the allocated byte array is word rounded

spark git commit: [SPARK-22170][SQL] Reduce memory consumption in broadcast joins.

2017-10-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master dadd13f36 -> 155ab6347 [SPARK-22170][SQL] Reduce memory consumption in broadcast joins. ## What changes were proposed in this pull request? This updates the broadcast join code path to lazily decompress pages and iterate through UnsafeRows

spark git commit: [SPARK-22214][SQL] Refactor the list hive partitions code

2017-10-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c7b46d4d8 -> 08b204fd2 [SPARK-22214][SQL] Refactor the list hive partitions code ## What changes were proposed in this pull request? In this PR we make a few changes to the list hive partitions code, to make the code more extensible. The

spark git commit: [SPARK-21871][SQL] Fix infinite loop when bytecode size is larger than spark.sql.codegen.hugeMethodLimit

2017-10-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ae61f187a -> 83488cc31 [SPARK-21871][SQL] Fix infinite loop when bytecode size is larger than spark.sql.codegen.hugeMethodLimit ## What changes were proposed in this pull request? When exceeding `spark.sql.codegen.hugeMethodLimit`, the run

spark git commit: [SPARK-22169][SQL] support byte length literal as identifier

2017-10-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4a779bdac -> bb035f1ee [SPARK-22169][SQL] support byte length literal as identifier ## What changes were proposed in this pull request? By definition the table name in Spark can be something like `123x`, `25a`, etc., with exceptions for l

spark git commit: [SPARK-21871][SQL] Check actual bytecode size when compiling generated code

2017-10-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 64df08b64 -> 4a779bdac [SPARK-21871][SQL] Check actual bytecode size when compiling generated code ## What changes were proposed in this pull request? This pr added code to check actual bytecode size when compiling generated code. In #1881

spark git commit: [SPARK-22171][SQL] Describe Table Extended Failed when Table Owner is Empty

2017-10-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e36ec38d8 -> 5f6943345 [SPARK-22171][SQL] Describe Table Extended Failed when Table Owner is Empty ## What changes were proposed in this pull request? Users could hit `java.lang.NullPointerException` when the tables were created by Hive a

spark git commit: [SPARK-22178][SQL] Refresh Persistent Views by REFRESH TABLE Command

2017-10-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4c5158eec -> e65b6b7ca [SPARK-22178][SQL] Refresh Persistent Views by REFRESH TABLE Command ## What changes were proposed in this pull request? The underlying tables of persistent views are not refreshed when users issue the REFRESH TABLE
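A minimal sketch of the scenario, assuming a spark-shell session with `spark` in scope:
```scala
// Create a table and a persistent view over it.
spark.sql("CREATE TABLE base(id INT) USING parquet")
spark.sql("CREATE VIEW v AS SELECT id FROM base")
// ... files under `base` change outside of this session ...
// With this change, refreshing the view also refreshes its underlying table.
spark.sql("REFRESH TABLE v")
```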

spark git commit: [SPARK-22178][SQL] Refresh Persistent Views by REFRESH TABLE Command

2017-10-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 3c30be53b -> 5e3f2544a [SPARK-22178][SQL] Refresh Persistent Views by REFRESH TABLE Command ## What changes were proposed in this pull request? The underlying tables of persistent views are not refreshed when users issue the REFRESH TA

spark git commit: [SPARK-21644][SQL] LocalLimit.maxRows is defined incorrectly

2017-10-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master fa225da74 -> 4c5158eec [SPARK-21644][SQL] LocalLimit.maxRows is defined incorrectly ## What changes were proposed in this pull request? The definition of `maxRows` in `LocalLimit` operator was simply wrong. This patch introduces a new `max

spark git commit: [SPARK-22158][SQL][BRANCH-2.2] convertMetastore should not ignore table property

2017-10-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 b9adddb6a -> 3c30be53b [SPARK-22158][SQL][BRANCH-2.2] convertMetastore should not ignore table property ## What changes were proposed in this pull request? From the beginning, **convertMetastoreOrc** ignores table properties and use

spark git commit: [SPARK-22176][SQL] Fix overflow issue in Dataset.show

2017-10-02 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4329eb2e7 -> fa225da74 [SPARK-22176][SQL] Fix overflow issue in Dataset.show ## What changes were proposed in this pull request? This pr fixed an overflow issue below in `Dataset.show`: ``` scala> Seq((1, 2), (3, 4)).toDF("a", "b").show(Int
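The repro in the snippet is cut off; as an illustration only (spark-shell, where `spark.implicits._` is already imported), the overflow-prone call pattern is a show() with a very large row count:
```scala
val df = Seq((1, 2), (3, 4)).toDF("a", "b")
// Requesting an extremely large number of rows exercised the integer-overflow
// path this commit fixes.
df.show(Int.MaxValue)
```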

spark git commit: [SPARK-22158][SQL] convertMetastore should not ignore table property

2017-10-02 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8fab7995d -> e5431f2cf [SPARK-22158][SQL] convertMetastore should not ignore table property ## What changes were proposed in this pull request? From the beginning, convertMetastoreOrc ignores table properties and use an empty map instea

spark git commit: [SPARK-22001][ML][SQL] ImputerModel can do withColumn for all input columns at one pass

2017-10-01 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 02c91e03f -> 3ca367083 [SPARK-22001][ML][SQL] ImputerModel can do withColumn for all input columns at one pass ## What changes were proposed in this pull request? SPARK-21690 makes one-pass `Imputer` by parallelizing the computation of al

spark git commit: [SPARK-22122][SQL] Use analyzed logical plans to count input rows in TPCDSQueryBenchmark

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 530fe6832 -> c6610a997 [SPARK-22122][SQL] Use analyzed logical plans to count input rows in TPCDSQueryBenchmark ## What changes were proposed in this pull request? Since the current code ignores WITH clauses to check input relations in TPC

spark git commit: [SPARK-21904][SQL] Rename tempTables to tempViews in SessionCatalog

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 472864014 -> 530fe6832 [SPARK-21904][SQL] Rename tempTables to tempViews in SessionCatalog ### What changes were proposed in this pull request? `tempTables` is not right. To be consistent, we need to rename the internal variable names/comm

spark git commit: Revert "[SPARK-22142][BUILD][STREAMING] Move Flume support behind a profile"

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9ed7394a6 -> 472864014 Revert "[SPARK-22142][BUILD][STREAMING] Move Flume support behind a profile" This reverts commit a2516f41aef68e39df7f6380fd2618cc148a609e. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://gi

spark git commit: [SPARK-22146] FileNotFoundException while reading ORC files containing special characters

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 ac9a0f692 -> 7bf25e086 [SPARK-22146] FileNotFoundException while reading ORC files containing special characters ## What changes were proposed in this pull request? Reading ORC files containing special characters like '%' fails with a

spark git commit: [SPARK-22161][SQL] Add Impala-modified TPC-DS queries

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 8b2d8385c -> ac9a0f692 [SPARK-22161][SQL] Add Impala-modified TPC-DS queries ## What changes were proposed in this pull request? Added IMPALA-modified TPCDS queries to TPC-DS query suites. - Ref: https://github.com/cloudera/impala-tpc

spark git commit: [SPARK-22161][SQL] Add Impala-modified TPC-DS queries

2017-09-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ecbe416ab -> 9ed7394a6 [SPARK-22161][SQL] Add Impala-modified TPC-DS queries ## What changes were proposed in this pull request? Added IMPALA-modified TPCDS queries to TPC-DS query suites. - Ref: https://github.com/cloudera/impala-tpcds-k

spark git commit: [SPARK-22141][FOLLOWUP][SQL] Add comments for the order of batches

2017-09-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 161ba7eaa -> 0fa4dbe4f [SPARK-22141][FOLLOWUP][SQL] Add comments for the order of batches ## What changes were proposed in this pull request? Add comments for specifying the position of batch "Check Cartesian Products", as rxin suggested

spark git commit: [SPARK-22146] FileNotFoundException while reading ORC files containing special characters

2017-09-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 323806e68 -> 161ba7eaa [SPARK-22146] FileNotFoundException while reading ORC files containing special characters ## What changes were proposed in this pull request? Reading ORC files containing special characters like '%' fails with a Fi

spark git commit: [SPARK-22159][SQL] Make config names consistently end with "enabled".

2017-09-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d74dee133 -> d29d1e879 [SPARK-22159][SQL] Make config names consistently end with "enabled". ## What changes were proposed in this pull request? spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enable
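For reference, a hedged sketch of the renamed keys in use (spark-shell; key names taken from the truncated summary, with ".enable" becoming ".enabled"):
```scala
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
spark.conf.set("spark.sql.codegen.aggregate.map.twolevel.enabled", "true")
```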

spark git commit: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec

2017-09-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 01bd00d13 -> d74dee133 [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec ## What changes were proposed in this pull request? For some reason when we added the Exec suffix to all physical operators, we missed this one. I was

spark git commit: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 a406473a5 -> 42e172744 [SPARK-22140] Add TPCDSQuerySuite ## What changes were proposed in this pull request? Now, we are not running TPC-DS queries as regular test cases. Thus, we need to add a test suite using empty tables for ensurin

spark git commit: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 02bb0682e -> 9244957b5 [SPARK-22140] Add TPCDSQuerySuite ## What changes were proposed in this pull request? Now, we are not running TPC-DS queries as regular test cases. Thus, we need to add a test suite using empty tables for ensuring th

spark git commit: [SPARK-22103][FOLLOWUP] Rename addExtraCode to addInnerClass

2017-09-26 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 64fbd1cef -> f21f6ce99 [SPARK-22103][FOLLOWUP] Rename addExtraCode to addInnerClass ## What changes were proposed in this pull request? Address PR comments that appeared post-merge, to rename `addExtraCode` to `addInnerClass`, and not cou

spark git commit: [SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive warehouse directory

2017-09-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 9836ea19f -> b0f30b56a [SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive warehouse directory ## What changes were proposed in this pull request? During TestHiveSparkSession.reset(), which is called after each TestH

spark git commit: [SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive warehouse directory

2017-09-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 038b18573 -> ce204780e [SPARK-22120][SQL] TestHiveSparkSession.reset() should clean out Hive warehouse directory ## What changes were proposed in this pull request? During TestHiveSparkSession.reset(), which is called after each TestHiveS

spark git commit: [SPARK-22103] Move HashAggregateExec parent consume to a separate function in codegen

2017-09-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 2c5b9b117 -> 038b18573 [SPARK-22103] Move HashAggregateExec parent consume to a separate function in codegen ## What changes were proposed in this pull request? HashAggregateExec codegen uses two paths for fast hash table and a generic on

spark git commit: [SPARK-22100][SQL] Make percentile_approx support date/timestamp type and change the output type to be the same as input type

2017-09-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 20adf9aa1 -> 365a29bdb [SPARK-22100][SQL] Make percentile_approx support date/timestamp type and change the output type to be the same as input type ## What changes were proposed in this pull request? The `percentile_approx` function prev
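A minimal sketch (spark-shell, `spark` in scope) of the behavior described above, where the result keeps the input type instead of being widened to a double:
```scala
import java.sql.Date
import spark.implicits._

val dates = Seq("2017-01-01", "2017-06-01", "2017-12-31").map(Date.valueOf).toDF("d")
dates.createOrReplaceTempView("dates")
// After this change the median comes back as a DATE rather than a DOUBLE.
spark.sql("SELECT percentile_approx(d, 0.5) AS median_date FROM dates").printSchema()
```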

spark git commit: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTruncateTable() method in AggregatedDialect

2017-09-23 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4a8c9e29b -> 2274d84ef [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTruncateTable() method in AggregatedDialect ## What changes were proposed in this pull request? The implemented `isCascadingTruncateTable` in `AggregatedDialect` is

spark git commit: [SPARK-22110][SQL][DOCUMENTATION] Add usage and improve documentation with arguments and examples for trim function

2017-09-23 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c792aff03 -> 4a8c9e29b [SPARK-22110][SQL][DOCUMENTATION] Add usage and improve documentation with arguments and examples for trim function ## What changes were proposed in this pull request? This PR proposes to enhance the documentation f

spark git commit: [SPARK-21998][SQL] SortMergeJoinExec did not calculate its outputOrdering correctly during physical planning

2017-09-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 5ac96854c -> 5960686e7 [SPARK-21998][SQL] SortMergeJoinExec did not calculate its outputOrdering correctly during physical planning ## What changes were proposed in this pull request? Right now the calculation of SortMergeJoinExec's outpu

spark git commit: [SPARK-22088][SQL] Incorrect scalastyle comment causes wrong styles in stringExpressions

2017-09-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f7ad0dbd5 -> 9cac249fd [SPARK-22088][SQL] Incorrect scalastyle comment causes wrong styles in stringExpressions ## What changes were proposed in this pull request? There is an incorrect `scalastyle:on` comment in `stringExpressions.scala`

spark git commit: [SPARK-22076][SQL][FOLLOWUP] Expand.projections should not be a Stream

2017-09-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 55d5fa79d -> 352bea545 [SPARK-22076][SQL][FOLLOWUP] Expand.projections should not be a Stream ## What changes were proposed in this pull request? This a follow-up of https://github.com/apache/spark/pull/19289 , we missed another place: `r

spark git commit: [SPARK-22076][SQL] Expand.projections should not be a Stream

2017-09-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 6764408f6 -> 5d10586a0 [SPARK-22076][SQL] Expand.projections should not be a Stream ## What changes were proposed in this pull request? Spark with Scala 2.10 fails with a group by cube: ``` spark.range(1).select($"id" as "a", $"id" as

spark git commit: [SPARK-22076][SQL] Expand.projections should not be a Stream

2017-09-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e17901d6d -> ce6a71e01 [SPARK-22076][SQL] Expand.projections should not be a Stream ## What changes were proposed in this pull request? Spark with Scala 2.10 fails with a group by cube: ``` spark.range(1).select($"id" as "a", $"id" as "b"

spark git commit: [SPARK-19318][SPARK-22041][SPARK-16625][BACKPORT-2.1][SQL] Docker test case failure: `: General data types to be mapped to Oracle`

2017-09-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.1 30ce056d8 -> 56865a1e9 [SPARK-19318][SPARK-22041][SPARK-16625][BACKPORT-2.1][SQL] Docker test case failure: `: General data types to be mapped to Oracle` ## What changes were proposed in this pull request? This PR is backport of https

spark git commit: [SPARK-21969][SQL] CommandUtils.updateTableStats should call refreshTable

2017-09-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d5aefa83a -> ee13f3e3d [SPARK-21969][SQL] CommandUtils.updateTableStats should call refreshTable ## What changes were proposed in this pull request? Tables in the catalog cache are not invalidated once their statistics are updated. As a c

spark git commit: [SPARK-21338][SQL] implement isCascadingTruncateTable() method in AggregatedDialect

2017-09-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 2f962422a -> d5aefa83a [SPARK-21338][SQL] implement isCascadingTruncateTable() method in AggregatedDialect ## What changes were proposed in this pull request? org.apache.spark.sql.jdbc.JdbcDialect's method: def isCascadingTruncateTable():

spark git commit: [SPARK-14878][SQL] Trim characters string function support

2017-09-18 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3b049abf1 -> c66d64b3d [SPARK-14878][SQL] Trim characters string function support ## What changes were proposed in this pull request? This PR enhances the TRIM function support in Spark SQL by allowing the specification of trim characte
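A short usage sketch of the enhanced TRIM syntax (spark-shell, `spark` in scope); the literals are illustrative:
```scala
spark.sql("SELECT trim(BOTH 'xy' FROM 'xxyabcxyy')").show()   // abc
spark.sql("SELECT trim(LEADING 'x' FROM 'xxabc')").show()     // abc
spark.sql("SELECT trim(TRAILING 'x' FROM 'abcxx')").show()    // abc
```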

spark git commit: [SPARK-22003][SQL] support array column in vectorized reader with UDF

2017-09-18 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 894a7561d -> 3b049abf1 [SPARK-22003][SQL] support array column in vectorized reader with UDF ## What changes were proposed in this pull request? The UDF needs to deserialize the `UnsafeRow`. When the column type is Array, the `get` method

spark git commit: [SPARK-21987][SQL] fix a compatibility issue of sql event logs

2017-09-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4decedfdb -> 3c6198c86 [SPARK-21987][SQL] fix a compatibility issue of sql event logs ## What changes were proposed in this pull request? In https://github.com/apache/spark/pull/18600 we removed the `metadata` field from `SparkPlanInfo`.

spark git commit: [SPARK-22002][SQL] Read JDBC table use custom schema support specify partial fields.

2017-09-14 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 22b111ef9 -> 4decedfdb [SPARK-22002][SQL] Read JDBC table use custom schema support specify partial fields. ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/18266 add a new feature to support read

spark git commit: [MINOR][SQL] Only populate type metadata for required types such as CHAR/VARCHAR.

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8be7e6bb3 -> dcbb22943 [MINOR][SQL] Only populate type metadata for required types such as CHAR/VARCHAR. ## What changes were proposed in this pull request? When reading column descriptions from hive catalog, we currently populate the met

spark git commit: [SPARK-21973][SQL] Add a new option to filter queries in TPC-DS

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 17edfec59 -> 8be7e6bb3 [SPARK-21973][SQL] Add a new option to filter queries in TPC-DS ## What changes were proposed in this pull request? This PR added a new option to filter TPC-DS queries to run in `TPCDSQueryBenchmark`. By default, `T

spark git commit: [SPARK-20427][SQL] Read JDBC table use custom schema

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8c7e19a37 -> 17edfec59 [SPARK-20427][SQL] Read JDBC table use custom schema ## What changes were proposed in this pull request? Auto generated Oracle schema some times not we expect: - `number(1)` auto mapped to BooleanType, some times it
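A hedged sketch of the feature (spark-shell); the JDBC URL, table, and credentials below are placeholders, and the `customSchema` option is the one this change introduces:
```scala
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")  // placeholder
  .option("dbtable", "SOME_TABLE")                           // placeholder
  .option("user", "user").option("password", "pass")         // placeholders
  // Override the auto-generated mapping, e.g. number(1) -> BooleanType.
  .option("customSchema", "id DECIMAL(38, 0), name STRING")
  .load()
df.printSchema()
```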

spark git commit: [SPARK-4131] Merge HiveTmpFile.scala to SaveAsHiveFile.scala

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 21c4450fb -> 8c7e19a37 [SPARK-4131] Merge HiveTmpFile.scala to SaveAsHiveFile.scala ## What changes were proposed in this pull request? The code is already merged to master: https://github.com/apache/spark/pull/18975 This is a following u

spark git commit: [SPARK-21980][SQL] References in grouping functions should be indexed with semanticEquals

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 b606dc177 -> 3a692e355 [SPARK-21980][SQL] References in grouping functions should be indexed with semanticEquals ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-21980 This PR fixes the

spark git commit: [SPARK-21980][SQL] References in grouping functions should be indexed with semanticEquals

2017-09-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master b6ef1f57b -> 21c4450fb [SPARK-21980][SQL] References in grouping functions should be indexed with semanticEquals ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-21980 This PR fixes the issu

spark git commit: [SPARK-21979][SQL] Improve QueryPlanConstraints framework

2017-09-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c5f9b89dd -> 1a9857476 [SPARK-21979][SQL] Improve QueryPlanConstraints framework ## What changes were proposed in this pull request? Improve QueryPlanConstraints framework, make it robust and simple. In https://github.com/apache/spark/pull

spark git commit: [SPARK-21368][SQL] TPCDSQueryBenchmark can't refer query files.

2017-09-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 720c94fe7 -> b9b54b1c8 [SPARK-21368][SQL] TPCDSQueryBenchmark can't refer query files. ## What changes were proposed in this pull request? TPCDSQueryBenchmark packaged into a jar doesn't work with spark-submit. It's because of the failure

spark git commit: [SPARK-17642][SQL] support DESC EXTENDED/FORMATTED table column commands

2017-09-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 957558235 -> 515910e9b [SPARK-17642][SQL] support DESC EXTENDED/FORMATTED table column commands ## What changes were proposed in this pull request? Support DESC (EXTENDED | FORMATTED) ? TABLE COLUMN command. Support DESC EXTENDED | FORMATT
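A brief sketch of the new command forms (spark-shell, `spark` in scope):
```scala
spark.sql("CREATE TABLE t(key INT, value STRING) USING parquet")
spark.sql("DESC t key").show(truncate = false)            // basic column description
// EXTENDED/FORMATTED adds more detail, e.g. column statistics when available.
spark.sql("DESC EXTENDED t key").show(truncate = false)
```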

spark git commit: [SPARK-21610][SQL] Corrupt records are not handled properly when creating a dataframe from a file

2017-09-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 520d92a19 -> 6273a711b [SPARK-21610][SQL] Corrupt records are not handled properly when creating a dataframe from a file ## What changes were proposed in this pull request? ``` echo '{"field": 1} {"field": 2} {"field": "3"}' >/tmp/sample.j

[2/2] spark git commit: [SPARK-4131] Support "Writing data into the filesystem from queries"

2017-09-09 Thread lixiao
[SPARK-4131] Support "Writing data into the filesystem from queries" ## What changes were proposed in this pull request? This PR implements the sql feature: INSERT OVERWRITE [LOCAL] DIRECTORY directory1 [ROW FORMAT row_format] [STORED AS file_format] SELECT ... FROM ... ## How was this patch
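A hedged sketch of the new statement, following the syntax quoted above; it assumes a Hive-enabled SparkSession in spark-shell, and the output directory is a placeholder:
```scala
spark.range(10).createOrReplaceTempView("src")
spark.sql("""
  INSERT OVERWRITE LOCAL DIRECTORY '/tmp/spark_query_output'
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  SELECT id, id * 2 FROM src
""")
```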

[1/2] spark git commit: [SPARK-4131] Support "Writing data into the filesystem from queries"

2017-09-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e4d8f9a36 -> f76790557 http://git-wip-us.apache.org/repos/asf/spark/blob/f7679055/sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala -- diff --git a/sql/hi

spark git commit: [MINOR][SQL] Correct DataFrame doc.

2017-09-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6b45d7e94 -> e4d8f9a36 [MINOR][SQL] Correct DataFrame doc. ## What changes were proposed in this pull request? Correct DataFrame doc. ## How was this patch tested? Only doc change, no tests. Author: Yanbo Liang Closes #19173 from yanbol

spark git commit: [SPARK-21941] Stop storing unused attemptId in SQLTaskMetrics

2017-09-08 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 31c74fec2 -> 8a5eb5068 [SPARK-21941] Stop storing unused attemptId in SQLTaskMetrics ## What changes were proposed in this pull request? In a driver heap dump containing 390,105 instances of SQLTaskMetrics this would have saved me approxim

spark git commit: [SPARK-21946][TEST] fix flaky test: "alter table: rename cached table" in InMemoryCatalogedDDLSuite

2017-09-08 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 08cb06af2 -> 9ae7c96ce [SPARK-21946][TEST] fix flaky test: "alter table: rename cached table" in InMemoryCatalogedDDLSuite ## What changes were proposed in this pull request? This PR fixes flaky test `InMemoryCatalogedDDLSuite "alter

spark git commit: [SPARK-21946][TEST] fix flaky test: "alter table: rename cached table" in InMemoryCatalogedDDLSuite

2017-09-08 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 0dfc1ec59 -> 8a4f228dc [SPARK-21946][TEST] fix flaky test: "alter table: rename cached table" in InMemoryCatalogedDDLSuite ## What changes were proposed in this pull request? This PR fixes flaky test `InMemoryCatalogedDDLSuite "alter tabl

spark git commit: [SPARK-21936][SQL][2.2] backward compatibility test framework for HiveExternalCatalog

2017-09-08 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 781a1f83c -> 08cb06af2 [SPARK-21936][SQL][2.2] backward compatibility test framework for HiveExternalCatalog backport https://github.com/apache/spark/pull/19148 to 2.2 Author: Wenchen Fan Closes #19163 from cloud-fan/test. Project

spark git commit: [SPARK-21936][SQL] backward compatibility test framework for HiveExternalCatalog

2017-09-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6e37524a1 -> dbb824125 [SPARK-21936][SQL] backward compatibility test framework for HiveExternalCatalog ## What changes were proposed in this pull request? `HiveExternalCatalog` is a semi-public interface. When creating tables, `HiveExter

spark git commit: [SPARK-21726][SQL] Check for structural integrity of the plan in Optimizer in test mode.

2017-09-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f62b20f39 -> 6e37524a1 [SPARK-21726][SQL] Check for structural integrity of the plan in Optimizer in test mode. ## What changes were proposed in this pull request? We have many optimization rules now in `Optimizer`. Right now we don't have

spark git commit: [SPARK-21949][TEST] Tables created in unit tests should be dropped after use

2017-09-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 57bc1e9eb -> f62b20f39 [SPARK-21949][TEST] Tables created in unit tests should be dropped after use ## What changes were proposed in this pull request? Tables should be dropped after use in unit tests. ## How was this patch tested? N/A Au

spark git commit: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadata from SQLConf and docs

2017-09-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master b9ab791a9 -> e00f1a1da [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadata from SQLConf and docs ## What changes were proposed in this pull request? Since [SPARK-15639](https://github.com/apache/spark/pull/13701), `spark.sql.parquet

spark git commit: [SPARK-21912][SQL] ORC/Parquet table should not create invalid column names

2017-09-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ce7293c15 -> eea2b877c [SPARK-21912][SQL] ORC/Parquet table should not create invalid column names ## What changes were proposed in this pull request? Currently, users meet job abortions while creating or altering ORC/Parquet tables with

spark git commit: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not produce unresolved query plans

2017-09-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master aad212547 -> ce7293c15 [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not produce unresolved query plans ## What changes were proposed in this pull request? This is a follow-up of #19050 to deal with `ExistenceJoin` case.

spark git commit: [SPARK-21835][SQL] RewritePredicateSubquery should not produce unresolved query plans

2017-09-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 64936c14a -> f2e22aebf [SPARK-21835][SQL] RewritePredicateSubquery should not produce unresolved query plans ## What changes were proposed in this pull request? Correlated predicate subqueries are rewritten into `Join` by the rule `Rewri

spark git commit: [MINOR][DOC] Update `Partition Discovery` section to enumerate all available file sources

2017-09-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 1f7c4869b -> 7da8fbf08 [MINOR][DOC] Update `Partition Discovery` section to enumerate all available file sources ## What changes were proposed in this pull request? All built-in data sources support `Partition Discovery`. We had bette

spark git commit: [MINOR][DOC] Update `Partition Discovery` section to enumerate all available file sources

2017-09-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master fd60d4fa6 -> 9e451bcf3 [MINOR][DOC] Update `Partition Discovery` section to enumerate all available file sources ## What changes were proposed in this pull request? All built-in data sources support `Partition Discovery`. We had better up

spark git commit: [SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConstraints and ConstantPropagation

2017-09-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8c954d2cd -> fd60d4fa6 [SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConstraints and ConstantPropagation ## What changes were proposed in this pull request? For the given example below, the predicate added by `InferFilt

spark git commit: [SPARK-21845][SQL][TEST-MAVEN] Make codegen fallback of expressions configurable

2017-09-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 02a4386ae -> 2974406d1 [SPARK-21845][SQL][TEST-MAVEN] Make codegen fallback of expressions configurable ## What changes were proposed in this pull request? We should make codegen fallback of expressions configurable. So far, it is always o

spark git commit: [SPARK-21913][SQL][TEST] `withDatabase` should drop database with CASCADE

2017-09-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ca59445ad -> 4e7a29efd [SPARK-21913][SQL][TEST] `withDatabase` should drop database with CASCADE ## What changes were proposed in this pull request? Currently, `withDatabase` fails if the database is not empty. It would be great if we drop

spark git commit: [SPARK-21654][SQL] Complement SQL predicates expression description

2017-09-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 07fd68a29 -> 9f30d9280 [SPARK-21654][SQL] Complement SQL predicates expression description ## What changes were proposed in this pull request? SQL predicates don't have complete expression description. This patch goes to complement the de

spark git commit: [SPARK-21891][SQL] Add TBLPROPERTIES to DDL statement: CREATE TABLE USING

2017-09-02 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 900f14f6f -> acb7fed23 [SPARK-21891][SQL] Add TBLPROPERTIES to DDL statement: CREATE TABLE USING ## What changes were proposed in this pull request? Add `TBLPROPERTIES` to the DDL statement `CREATE TABLE USING`. After this change, the DDL
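A minimal sketch of the extended DDL (spark-shell, `spark` in scope); the property names are illustrative:
```scala
spark.sql("""
  CREATE TABLE t_with_props(id INT, name STRING)
  USING parquet
  TBLPROPERTIES ('owner' = 'data-team', 'retention.days' = '30')
""")
spark.sql("SHOW TBLPROPERTIES t_with_props").show(truncate = false)
```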

spark git commit: [SPARK-21884][SPARK-21477][BACKPORT-2.2][SQL] Mark LocalTableScanExec's input data transient

2017-09-01 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 14054ffc5 -> 50f86e1fe [SPARK-21884][SPARK-21477][BACKPORT-2.2][SQL] Mark LocalTableScanExec's input data transient This PR is to backport https://github.com/apache/spark/pull/18686 for resolving the issue in https://github.com/apache

spark git commit: [SPARK-21895][SQL] Support changing database in HiveClient

2017-09-01 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 12ab7f7e8 -> aba9492d2 [SPARK-21895][SQL] Support changing database in HiveClient ## What changes were proposed in this pull request? Supporting moving tables across different database in HiveClient `alterTable` ## How was this patch teste

spark git commit: [SPARK-21110][SQL] Structs, arrays, and other orderable datatypes should be usable in inequalities

2017-08-31 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 7ce110828 -> cba69aeb4 [SPARK-21110][SQL] Structs, arrays, and other orderable datatypes should be usable in inequalities ## What changes were proposed in this pull request? Allows `BinaryComparison` operators to work on any data type tha
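A small sketch of what the change enables (spark-shell, where `spark.implicits._` is preloaded): comparing struct values with an ordinary inequality operator:
```scala
val df = Seq((1, 2), (3, 1)).toDF("a", "b")
// Structs are orderable, so < compares them field by field.
df.selectExpr("struct(a, b) < struct(b, a) AS lt").show()
```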

spark git commit: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown rule for Union

2017-08-31 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 501370d9d -> 7ce110828 [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown rule for Union ## What changes were proposed in this pull request? Also remove useless function `partitionByDeterministic` after the changes of https://github.c

spark git commit: [SPARK-21583][HOTFIX] Removed intercept in test causing failures

2017-08-31 Thread lixiao
Repository: spark Updated Branches: refs/heads/master fc45c2c88 -> 501370d9d [SPARK-21583][HOTFIX] Removed intercept in test causing failures Removing a check in the ColumnarBatchSuite that depended on a Java assertion. This assertion is being compiled out in the Maven builds causing the tes

spark git commit: [SPARK-21886][SQL] Use SparkSession.internalCreateDataFrame to create…

2017-08-31 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 19b0240d4 -> 9696580c3 [SPARK-21886][SQL] Use SparkSession.internalCreateDataFrame to create… … Dataset with LogicalRDD logical operator ## What changes were proposed in this pull request? Reusing `SparkSession.internalCreateDataFrame

spark git commit: [SPARK-21878][SQL][TEST] Create SQLMetricsTestUtils

2017-08-31 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 964b507c7 -> 19b0240d4 [SPARK-21878][SQL][TEST] Create SQLMetricsTestUtils ## What changes were proposed in this pull request? Creates `SQLMetricsTestUtils` for the utility functions of both Hive-specific and the other SQLMetrics test case

spark git commit: [MINOR][SQL][TEST] Test shuffle hash join when it is not expected

2017-08-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 32d6d9d72 -> 235d28333 [MINOR][SQL][TEST] Test shuffle hash join when it is not expected ## What changes were proposed in this pull request? ignore("shuffle hash join") is intended to test shuffle hash join (_case class ShuffledHashJoinExec_). But w

spark git commit: Revert "[SPARK-21845][SQL] Make codegen fallback of expressions configurable"

2017-08-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4133c1b0a -> 32d6d9d72 Revert "[SPARK-21845][SQL] Make codegen fallback of expressions configurable" This reverts commit 3d0e174244bc293f11dff0f11ef705ba6cd5fe3a. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://g

spark git commit: [SPARK-21845][SQL] Make codegen fallback of expressions configurable

2017-08-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master fba9cc846 -> 3d0e17424 [SPARK-21845][SQL] Make codegen fallback of expressions configurable ## What changes were proposed in this pull request? We should make codegen fallback of expressions configurable. So far, it is always on. We might

spark git commit: [SPARK-21255][SQL] simplify encoder for java enum

2017-08-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8fcbda9c9 -> 6327ea570 [SPARK-21255][SQL] simplify encoder for java enum ## What changes were proposed in this pull request? This is a follow-up for https://github.com/apache/spark/pull/18488, to simplify the code. The major change is, w

spark git commit: [SPARK-21848][SQL] Add trait UserDefinedExpression to identify user-defined functions

2017-08-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 32fa0b814 -> 8fcbda9c9 [SPARK-21848][SQL] Add trait UserDefinedExpression to identify user-defined functions ## What changes were proposed in this pull request? Add trait UserDefinedExpression to identify user-defined functions. UDF can b

spark git commit: [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite

2017-08-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 1a598d717 -> 522e1f80d [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite ## What changes were proposed in this pull request? [SPARK-19025](https://github.com/apache/spark/pull/16869) removes

spark git commit: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actually testing what it intends

2017-08-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 51620e288 -> 1a598d717 [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actually testing what it intends ## What changes were proposed in this pull request? Adjust Local UDTs test to assert about results, and fix index of vec

spark git commit: [SPARK-21756][SQL] Add JSON option to allow unquoted control characters

2017-08-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 628bdeabd -> 51620e288 [SPARK-21756][SQL] Add JSON option to allow unquoted control characters ## What changes were proposed in this pull request? This patch adds allowUnquotedControlChars option in JSON data source to allow JSON Strings
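A hedged usage sketch (spark-shell); the input path is a placeholder:
```scala
// Allow JSON strings that contain raw (unescaped) control characters such as
// tabs, which the parser rejects by default.
val df = spark.read
  .option("allowUnquotedControlChars", "true")
  .json("/tmp/sample_with_control_chars.json")  // placeholder path
df.show()
```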

spark git commit: [SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite

2017-08-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master de7af295c -> 1f24ceee6 [SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite ## What changes were proposed in this pull request? After [SPARK-19025](https://github.com/apache/spark/pull/16869), there is no need to keep S

spark git commit: [SPARK-21830][SQL] Bump ANTLR version and fix a few issues.

2017-08-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 763b83ee8 -> 05af2de0f [SPARK-21830][SQL] Bump ANTLR version and fix a few issues. ## What changes were proposed in this pull request? This PR bumps the ANTLR version to 4.7, and fixes a number of small parser related issues uncovered by t

spark git commit: [SPARK-21826][SQL][2.1][2.0] outer broadcast hash join should not throw NPE

2017-08-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.0 9f670ce5d -> bf1f30d7d [SPARK-21826][SQL][2.1][2.0] outer broadcast hash join should not throw NPE backport https://github.com/apache/spark/pull/19036 to branch 2.1 and 2.0 Author: Wenchen Fan Closes #19040 from cloud-fan/bug. (cher

spark git commit: [SPARK-21826][SQL][2.1][2.0] outer broadcast hash join should not throw NPE

2017-08-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.1 3d3be4dca -> 576975356 [SPARK-21826][SQL][2.1][2.0] outer broadcast hash join should not throw NPE backport https://github.com/apache/spark/pull/19036 to branch 2.1 and 2.0 Author: Wenchen Fan Closes #19040 from cloud-fan/bug. Proj
