[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-16 Thread kevinyu98
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139302558 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,307 @@ case class

[GitHub] spark issue #19258: add MockNetCat

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19258 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19258: add MockNetCat

2017-09-16 Thread bluejoe2008
GitHub user bluejoe2008 opened a pull request: https://github.com/apache/spark/pull/19258 add MockNetCat ## What changes were proposed in this pull request? I added a MockNetCat class, which avoids manually launching the 'nc -lk ' command for tests. I also put a MockNetCatTest

[GitHub] spark pull request #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19249#discussion_r139301428 --- Diff: python/pyspark/sql/types.py --- @@ -619,7 +621,8 @@ def fromInternal(self, obj): # it's already converted by pickler

[GitHub] spark issue #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can break when...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19257 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81847/ Test PASSed. ---

[GitHub] spark issue #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can break when...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19257 Merged build finished. Test PASSed.

[GitHub] spark issue #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can break when...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19257 **[Test build #81847 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81847/testReport)** for PR 19257 at commit

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19253 The removed `Project` might affect the rule `PhysicalOperation` and reduce the chance to prune columns when reading relations. ---

[GitHub] spark issue #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can break when...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19257 **[Test build #81847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81847/testReport)** for PR 19257 at commit

[GitHub] spark pull request #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can bre...

2017-09-16 Thread tejasapatil
GitHub user tejasapatil opened a pull request: https://github.com/apache/spark/pull/19257 [SPARK-22042] [SQL] ReorderJoinPredicates can break when child's partitioning is not decided ## What changes were proposed in this pull request? See jira description for the bug :

[GitHub] spark issue #19257: [SPARK-22042] [SQL] ReorderJoinPredicates can break when...

2017-09-16 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/19257 Jenkins test this please

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19255 This PR also needs perf tests and the results in the description.

[GitHub] spark pull request #19255: [WIP][SPARK-22029] Add lru_cache to _parse_dataty...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19255#discussion_r139298764 --- Diff: python/pyspark/sql/types.py --- @@ -24,6 +24,7 @@ import re import base64 from array import array +from functools import

[GitHub] spark issue #19256: [SPARK-21338][SQL]implement isCascadingTruncateTable() m...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19256 Can one of the admins verify this patch?

[GitHub] spark pull request #19256: [SPARK-21338][SQL]implement isCascadingTruncateTa...

2017-09-16 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/19256 [SPARK-21338][SQL]implement isCascadingTruncateTable() method in Aggr… …egatedDialect ## What changes were proposed in this pull request?

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18853 Thank you for your investigation! I think we need to introduce a type inference conf for it. To avoid impacting the existing Spark users, we should keep the existing behaviors. ---

[GitHub] spark issue #19246: [SPARK-22025] Speeding up fromInternal for StructField

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19246 Could you mark [PySpark] in the title? cc @ueshin

[GitHub] spark pull request #19253: [SPARK-22037][SQL] Collapse Project if it is the ...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19253#discussion_r139297437 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -541,6 +541,25 @@ object CollapseProject extends

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19253 With whole-stage codegen, removing the useless Project will not yield any perf gain.

[GitHub] spark pull request #19253: [SPARK-22037][SQL] Collapse Project if it is the ...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19253#discussion_r139297406 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -541,6 +541,25 @@ object CollapseProject extends

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19255 Could you mark [PySpark] in the title? cc @ueshin

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19249 Could you mark [PySpark] in the title? cc @ueshin

[GitHub] spark issue #19234: [SPARK-22010] Change fromInternal method of TimestampTyp...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19234 Could you mark [PySpark] in the title? cc @ueshin

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-09-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139297269 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -503,69 +504,307 @@ case class

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-16 Thread original-brownbear
Github user original-brownbear commented on the issue: https://github.com/apache/spark/pull/19254 Test failure in PySpark appears unrelated, various tests OOMed like e.g. ```sh FAILED (errors=2) [Running ] ERROR

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19254 **[Test build #3922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3922/testReport)** for PR 19254 at commit

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19255 Merged build finished. Test FAILed.

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19255 **[Test build #81846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81846/testReport)** for PR 19255 at commit

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19255 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81846/ Test FAILed. ---

[GitHub] spark issue #19255: [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19255 **[Test build #81846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81846/testReport)** for PR 19255 at commit

[GitHub] spark pull request #19255: [WIP][SPARK-22029] Add lru_cache to _parse_dataty...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19255#discussion_r139292791 --- Diff: python/pyspark/sql/types.py --- @@ -24,6 +24,7 @@ import re import base64 from array import array +from functools import

[GitHub] spark pull request #19255: [WIP][SPARK-22029] Add lru_cache to _parse_dataty...

2017-09-16 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/19255 [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json_string ## What changes were proposed in this pull request? _parse_datatype_json_string is called many times for the same

[GitHub] spark pull request #19246: [SPARK-22025] Speeding up fromInternal for Struct...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139291502 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None):

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 I added a benchmark for this code

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19254 **[Test build #3922 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3922/testReport)** for PR 19254 at commit

[GitHub] spark issue #19234: [SPARK-22010] Change fromInternal method of TimestampTyp...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19234 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81845/ Test PASSed. ---

[GitHub] spark issue #19234: [SPARK-22010] Change fromInternal method of TimestampTyp...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19234 Merged build finished. Test PASSed.

[GitHub] spark issue #19234: [SPARK-22010] Change fromInternal method of TimestampTyp...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19234 **[Test build #81845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81845/testReport)** for PR 19234 at commit

[GitHub] spark issue #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18323 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81843/ Test PASSed. ---

[GitHub] spark issue #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18323 Merged build finished. Test PASSed.

[GitHub] spark issue #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18323 **[Test build #81843 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81843/testReport)** for PR 18323 at commit

[GitHub] spark issue #19234: [SPARK-22010] Change fromInternal method of TimestampTyp...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19234 **[Test build #81845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81845/testReport)** for PR 19234 at commit

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139290042 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139289721 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,33 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139289612 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,33 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139289505 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,33 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19253 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81844/ Test FAILed. ---

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19253 Merged build finished. Test FAILed.

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19253 **[Test build #81844 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81844/testReport)** for PR 19253 at commit

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19252 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81842/ Test PASSed. ---

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19252 Merged build finished. Test PASSed.

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19252 **[Test build #81842 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81842/testReport)** for PR 19252 at commit

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19254 Can one of the admins verify this patch?

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19230 LGTM too.

[GitHub] spark pull request #19254: [MINOR][CORE] Cleanup dead code and duplication i...

2017-09-16 Thread original-brownbear
GitHub user original-brownbear opened a pull request: https://github.com/apache/spark/pull/19254 [MINOR][CORE] Cleanup dead code and duplication in Mem. Management ## What changes were proposed in this pull request? * Removed the method

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19230 LGTM except some minor comments

[GitHub] spark pull request #19179: [TRIVIAL][SQL] Cleanup Todo for removal of org.ap...

2017-09-16 Thread original-brownbear
Github user original-brownbear closed the pull request at: https://github.com/apache/spark/pull/19179

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r139287957 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala --- @@ -0,0 +1,202 @@ +/* + * Licensed to

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r139287815 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVectorUtils.java --- @@ -158,7 +158,7 @@ private static void

[GitHub] spark issue #19253: [SPARK-22037][SQL] Collapse Project if it is the child o...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19253 **[Test build #81844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81844/testReport)** for PR 19253 at commit

[GitHub] spark pull request #19253: [SPARK-22037][SQL] Collapse Project if it is the ...

2017-09-16 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/19253 [SPARK-22037][SQL] Collapse Project if it is the child of Aggregate ## What changes were proposed in this pull request? If Aggregate's child is Project, collapse the Project into the

[GitHub] spark issue #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18323 **[Test build #81843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81843/testReport)** for PR 18323 at commit

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test PASSed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81841/ Test PASSed. ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81841/testReport)** for PR 19218 at commit

[GitHub] spark pull request #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19249#discussion_r139286221 --- Diff: python/pyspark/sql/types.py --- @@ -619,7 +621,8 @@ def fromInternal(self, obj): # it's already converted by pickler

[GitHub] spark pull request #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19249#discussion_r139286195 --- Diff: python/pyspark/sql/types.py --- @@ -619,7 +621,8 @@ def fromInternal(self, obj): # it's already converted by pickler

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19249 Okay, let's go ahead then. Let's add some numbers in the PR description.

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 Yep.

[GitHub] spark issue #18685: [SPARK-21439] Support for ABCMeta in PySpark

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/18685 Ping received. I'll try to add tests and resolve conflict

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19249 Did it save 6~7% of the total execution time?

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 I was checking this with my production code. This gives me about a 6-7% speedup and removes 408 million function calls :) I'll try to create a benchmark for this. ---

[GitHub] spark issue #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats should ...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19252 **[Test build #81842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81842/testReport)** for PR 19252 at commit

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19249 To be honest, it looks so trivial that I wouldn't bother.

[GitHub] spark pull request #19252: [SPARK-21969][SQL] CommandUtils.updateTableStats ...

2017-09-16 Thread aokolnychyi
GitHub user aokolnychyi opened a pull request: https://github.com/apache/spark/pull/19252 [SPARK-21969][SQL] CommandUtils.updateTableStats should call refreshTable ## What changes were proposed in this pull request? Tables in the catalog cache are not invalidated once their

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19249 How many function calls does it save, and how much does it improve? I'd not bother fixing this.

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139285021 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139284915 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not

[GitHub] spark issue #19246: [SPARK-22025] Speeding up fromInternal for StructField

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19246 @dongjoon-hyun I'll do it on Monday.

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139284824 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139284793 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @dongjoon-hyun A problem has been encountered. There are two ways to specify the compression format: 1. CREATE TABLE Test(id int) STORED AS ORC TBLPROPERTIES

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81841/testReport)** for PR 19218 at commit

[GitHub] spark issue #19251: [SPARK-22035][SQL]the value of statistical logicalPlan.s...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19251 Can one of the admins verify this patch?

[GitHub] spark pull request #19251: [SPARK-22035][SQL]the value of statistical logica...

2017-09-16 Thread heary-cao
GitHub user heary-cao opened a pull request: https://github.com/apache/spark/pull/19251 [SPARK-22035][SQL]the value of statistical logicalPlan.stats.sizeInBytes which is not expected ## What changes were proposed in this pull request? Currently, assume there will be the

[GitHub] spark issue #18936: [SPARK-21688][ML][MLLIB] make native BLAS the first choi...

2017-09-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18936 BTW I do think this is a promising idea. I'd welcome more info about the performance implications, but if it seems like a net win for most users we should do it. ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test FAILed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81840 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81840/testReport)** for PR 19218 at commit

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81840/ Test FAILed. ---

[GitHub] spark pull request #19248: [SPARK-22027] Add missing explanation of default ...

2017-09-16 Thread exKAZUu
Github user exKAZUu commented on a diff in the pull request: https://github.com/apache/spark/pull/19248#discussion_r139282831 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala --- @@ -44,7 +44,7 @@ private[ml] trait HasRegParam extends Params {

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81840/testReport)** for PR 19218 at commit

[GitHub] spark pull request #19248: [SPARK-22027] Add missing explanation of default ...

2017-09-16 Thread exKAZUu
Github user exKAZUu closed the pull request at: https://github.com/apache/spark/pull/19248

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test FAILed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81839/ Test FAILed. ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #81839 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81839/testReport)** for PR 19218 at commit

[GitHub] spark pull request #19180: [SPARK-21967][CORE] org.apache.spark.unsafe.types...

2017-09-16 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19180

[GitHub] spark pull request #19244: SPARK-22021

2017-09-16 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19244#discussion_r139281714 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/GenFuncTransformer.scala --- @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #19180: [SPARK-21967][CORE] org.apache.spark.unsafe.types.UTF8St...

2017-09-16 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19180 Merged to master

[GitHub] spark pull request #19219: [SPARK-21993][SQL][WIP] Close sessionState when f...

2017-09-16 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19219#discussion_r139281683 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala --- @@ -42,7 +42,7 @@ class HiveSessionStateBuilder(session:

[GitHub] spark pull request #19248: [SPARK-22027] Add missing explanation of default ...

2017-09-16 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19248#discussion_r139281622 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala --- @@ -44,7 +44,7 @@ private[ml] trait HasRegParam extends Params {
