[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19601 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83179/ Test PASSed.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19601 **[Test build #83179 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83179/testReport)** for PR 19601 at commit [`80b9e31`](https://github.com/apache/spark/commit/80b9e319211765807766e5cf70e995bdbbebf22e). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` public static final class UnsafeArray extends ArrayData `
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147571324 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java --- @@ -19,28 +19,48 @@ import org.apache.spark.unsafe.Platform; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.ByteBuffer; + +import sun.nio.ch.DirectBuffer; + /** * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory. */ public class UnsafeMemoryAllocator implements MemoryAllocator { @Override - public MemoryBlock allocate(long size) throws OutOfMemoryError { -long address = Platform.allocateMemory(size); -MemoryBlock memory = new MemoryBlock(null, address, size); -if (MemoryAllocator.MEMORY_DEBUG_FILL_ENABLED) { - memory.fill(MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE); + public OffHeapMemoryBlock allocate(long size) throws OutOfMemoryError { +Object buffer = ByteBuffer.allocateDirect((int)size); --- End diff -- `Platform.allocateMemory` returns the off-heap memory address of the region. This region is freed only when we call `Platform.freeMemory`. `ByteBuffer.allocateDirect` returns a Java object that wraps an off-heap memory region. This region is freed automatically when the Java object is collected by GC.
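The lifetime difference kiszk describes can be sketched in plain Java. This is illustrative only: the `Platform.allocateMemory`/`Platform.freeMemory` side is summarized in comments rather than invoked, since it requires `Unsafe`.

```java
import java.nio.ByteBuffer;

public class DirectAllocationDemo {
    public static void main(String[] args) {
        // ByteBuffer.allocateDirect returns a Java object that wraps an
        // off-heap region; that region is released automatically once the
        // buffer object becomes unreachable and is garbage-collected.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        direct.putLong(0, 42L);
        System.out.println(direct.isDirect() + " " + direct.getLong(0));

        // By contrast, Platform.allocateMemory (a thin wrapper over
        // Unsafe.allocateMemory) hands back only a raw long address, and
        // the region lives until an explicit Platform.freeMemory(address);
        // forgetting that call leaks the region for the life of the JVM.
    }
}
```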
[GitHub] spark pull request #19602: [SPARK-22384][SQL] Refine partition pruning when ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19602#discussion_r147572381 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HiveClientSuite.scala --- @@ -53,7 +52,7 @@ class HiveClientSuite(version: String) for { ds <- 20170101 to 20170103 h <- 0 to 23 -chunk <- Seq("aa", "ab", "ba", "bb") +chunk <- Seq("11", "12", "21", "22") --- End diff -- What is the reason we need to change so many lines in the test case? Could you add a new test case, instead of changing the existing one?
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19601 @Ueshin @cloud-fan could you please review this?
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573354 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java --- @@ -19,28 +19,48 @@ import org.apache.spark.unsafe.Platform; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.ByteBuffer; + +import sun.nio.ch.DirectBuffer; + /** * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory. */ public class UnsafeMemoryAllocator implements MemoryAllocator { @Override - public MemoryBlock allocate(long size) throws OutOfMemoryError { -long address = Platform.allocateMemory(size); -MemoryBlock memory = new MemoryBlock(null, address, size); -if (MemoryAllocator.MEMORY_DEBUG_FILL_ENABLED) { - memory.fill(MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE); + public OffHeapMemoryBlock allocate(long size) throws OutOfMemoryError { +Object buffer = ByteBuffer.allocateDirect((int)size); --- End diff -- To address the issue of releasing the memory, [here](https://hbase.apache.org/0.94/apidocs/src-html/org/apache/hadoop/hbase/util/DirectMemoryUtils.html#line.80) is one solution
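The linked HBase workaround frees a direct buffer's off-heap region eagerly instead of waiting for GC. A minimal sketch of that idea, using reflection so it compiles without `sun.*` imports; note `cleaner()` is a JDK-internal API, so on newer JVMs the reflective access may be denied, and the sketch deliberately falls back to GC in that case:

```java
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

public class DirectBufferFree {
    // Best-effort explicit release of a direct buffer's off-heap region,
    // in the spirit of HBase's DirectMemoryUtils. Returns true only when
    // the region was actually released via the buffer's Cleaner.
    static boolean tryFree(ByteBuffer buffer) {
        if (!buffer.isDirect()) {
            return false;
        }
        try {
            Method cleanerMethod = buffer.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buffer);
            Method cleanMethod = cleaner.getClass().getMethod("clean");
            cleanMethod.setAccessible(true);
            cleanMethod.invoke(cleaner);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Reflection blocked (e.g. JDK 9+ module system): let GC free it.
            return false;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);
        boolean freed = tryFree(buf);
        System.out.println("explicitly freed: " + freed);
        // After a successful tryFree, buf must never be accessed again.
    }
}
```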
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83183 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83183/testReport)** for PR 19567 at commit [`f3d45d0`](https://github.com/apache/spark/commit/f3d45d01b2e837d04143ac2a038aa616e5ecbfa3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit problem i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19563 Merged build finished. Test PASSed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Merged build finished. Test PASSed.
[GitHub] spark issue #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit problem i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19563 **[Test build #83185 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83185/testReport)** for PR 19563 at commit [`70c2304`](https://github.com/apache/spark/commit/70c2304393d6ec967a509fde8746e8268e7c80ee). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit problem i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19563 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83185/ Test PASSed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83183/ Test PASSed.
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15049 **[Test build #83186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83186/testReport)** for PR 15049 at commit [`cdbdbf7`](https://github.com/apache/spark/commit/cdbdbf7fc97710412720e41889c70123b59bcf55). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Merged build finished. Test FAILed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83193/ Test FAILed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83193/testReport)** for PR 17899 at commit [`f840c69`](https://github.com/apache/spark/commit/f840c69231025b53d057ec7e1032750abfa856b8). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19601 I'd like to also improve the write path. I think the current way to cache array types is not efficient; an Arrow-like format that puts all elements (including nested arrays) together is better for encoding and compression.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19601 Both ways work; just pick the simpler one. I'm concerned about how to access nested arrays; you can try both approaches and see which one solves the problem more easily.
[GitHub] spark issue #19592: [SPARK-22347][SQL][PySpark] Support optionally running P...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19592 ping @ueshin @BryanCutler @cloud-fan Would you mind providing some insights? In your opinion, should we just add documentation for it, or fix it? Thanks.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19601 The current `ColumnVector` uses a primitive-type array (e.g. `int[]` or `double[]`) based on the data type of each column. On the other hand, cached data uses `byte[]` for all data types. Do we change the format (`Array[Array[Byte]]`) in [`CachedBatch`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala#L53) to a primitive array?
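The two layouts kiszk contrasts can be sketched in a few lines of Java (a simplification: the real `CachedBatch` also applies encoding and compression, which is omitted here):

```java
import java.nio.ByteBuffer;

public class ColumnFormats {
    // Read element `ordinal` of an int column that was serialized into a
    // byte[] (CachedBatch-style storage, simplified).
    static int readInt(byte[] column, int ordinal) {
        return ByteBuffer.wrap(column).getInt(ordinal * Integer.BYTES);
    }

    public static void main(String[] args) {
        // ColumnVector-style: one primitive array per column, direct indexing.
        int[] primitiveColumn = {10, 20, 30, 40};

        // CachedBatch-style: the same values flattened into a byte[].
        byte[] cachedColumn = new byte[primitiveColumn.length * Integer.BYTES];
        ByteBuffer writer = ByteBuffer.wrap(cachedColumn);
        for (int v : primitiveColumn) writer.putInt(v);

        // Both representations yield the same element, but the byte[] form
        // needs offset arithmetic on every access.
        System.out.println(primitiveColumn[2] + " == " + readInt(cachedColumn, 2));
    }
}
```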
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83189/ Test FAILed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83189/testReport)** for PR 17899 at commit [`54d88fa`](https://github.com/apache/spark/commit/54d88fabf2076dbe17f22a72526ad7879a4c8eb0). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19601 For now, this implementation is limited to non-nested arrays, for ease of review.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83195/testReport)** for PR 19567 at commit [`fae5c45`](https://github.com/apache/spark/commit/fae5c455b4a754128bc9112bbead4aef3cc322a2).
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19603 Good catch! LGTM
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks again for reviewing this PR.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573017 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java --- @@ -19,28 +19,48 @@ import org.apache.spark.unsafe.Platform; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.ByteBuffer; + +import sun.nio.ch.DirectBuffer; + /** * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory. */ public class UnsafeMemoryAllocator implements MemoryAllocator { @Override - public MemoryBlock allocate(long size) throws OutOfMemoryError { -long address = Platform.allocateMemory(size); -MemoryBlock memory = new MemoryBlock(null, address, size); -if (MemoryAllocator.MEMORY_DEBUG_FILL_ENABLED) { - memory.fill(MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE); + public OffHeapMemoryBlock allocate(long size) throws OutOfMemoryError { +Object buffer = ByteBuffer.allocateDirect((int)size); --- End diff -- I know that using `Platform.allocateMemory` is the [current design](http://downloads.typesafe.com/website/presentations/ScalaDaysSF2015/T4_Xin_Performance_Optimization.pdf#page=26). I will revert this change. If we want to use `ByteBuffer.allocateDirect`, it would be better to submit a separate PR.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Merged build finished. Test FAILed.
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19567 LGTM
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573145 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/LongArray.java --- @@ -33,15 +32,13 @@ private static final long WIDTH = 8; private final MemoryBlock memory; - private final Object baseObj; private final long baseOffset; private final long length; public LongArray(MemoryBlock memory) { assert memory.size() < (long) Integer.MAX_VALUE * 8: "Array size >= Integer.MAX_VALUE elements"; this.memory = memory; -this.baseObj = memory.getBaseObject(); this.baseOffset = memory.getBaseOffset(); --- End diff -- I see. I will remove this.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573154 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java --- @@ -230,7 +233,10 @@ public UTF8String getUTF8String(int ordinal) { final long offsetAndSize = getLong(ordinal); final int offset = (int) (offsetAndSize >> 32); final int size = (int) offsetAndSize; -return UTF8String.fromAddress(baseObject, baseOffset + offset, size); +MemoryBlock mb = (baseObject instanceof byte[]) ? + new ByteArrayMemoryBlock((byte[]) baseObject, baseOffset + offset, size) : + new LongArrayMemoryBlock((long[]) baseObject, baseOffset + offset, size); --- End diff -- I will refactor this and move the logic into `MemoryBlock`.
[GitHub] spark issue #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit problem i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19563 **[Test build #83185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83185/testReport)** for PR 19563 at commit [`70c2304`](https://github.com/apache/spark/commit/70c2304393d6ec967a509fde8746e8268e7c80ee).
[GitHub] spark issue #19553: [SPARK-22330][CORE] Linear containsKey operation for ser...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19553 @Tagar it's the same general type of issue, but not directly related nor exactly the same cause.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147576840 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java --- @@ -19,28 +19,48 @@ import org.apache.spark.unsafe.Platform; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.ByteBuffer; + +import sun.nio.ch.DirectBuffer; + /** * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory. */ public class UnsafeMemoryAllocator implements MemoryAllocator { @Override - public MemoryBlock allocate(long size) throws OutOfMemoryError { -long address = Platform.allocateMemory(size); -MemoryBlock memory = new MemoryBlock(null, address, size); -if (MemoryAllocator.MEMORY_DEBUG_FILL_ENABLED) { - memory.fill(MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE); + public OffHeapMemoryBlock allocate(long size) throws OutOfMemoryError { +Object buffer = ByteBuffer.allocateDirect((int)size); --- End diff -- I just want to understand why we would pick `ByteBuffer.allocateDirect` with a hacky free method. Does `ByteBuffer.allocateDirect` have better performance, or do you just want to change the life cycle of `MemoryBlock` and let GC control it?
[GitHub] spark issue #19529: [SPARK-22308] Support alternative unit testing styles in...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19529 @nkronenfeld @gatorsmile I think this has been failing the master build (Maven only) for a few days: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/3981/consoleFull ``` SQLQuerySuite: *** RUN ABORTED *** org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at: org.apache.spark.sql.test.GenericFunSpecSuite.&lt;init&gt;(GenericFunSpecSuite.scala:28) sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) java.lang.reflect.Constructor.newInstance(Constructor.java:422) java.lang.Class.newInstance(Class.java:442) org.scalatest.tools.DiscoverySuite$.getSuiteInstance(DiscoverySuite.scala:66) org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:38) org.scalatest.tools.DiscoverySuite$$anonfun$1.apply(DiscoverySuite.scala:37) scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) scala.collection.Iterator$class.foreach(Iterator.scala:893) scala.collection.AbstractIterator.foreach(Iterator.scala:1336) scala.collection.IterableLike$class.foreach(IterableLike.scala:72) scala.collection.AbstractIterable.foreach(Iterable.scala:54) scala.collection.TraversableLike$class.map(TraversableLike.scala:234) scala.collection.AbstractTraversable.map(Traversable.scala:104) org.scalatest.tools.DiscoverySuite.&lt;init&gt;(DiscoverySuite.scala:37) org.scalatest.tools.Runner$.genDiscoSuites$1(Runner.scala:1165) org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1250) 
``` I suspect that there's some slightly different way that the tests execute in Maven that may highlight a problem with how the SQL tests use and reuse SparkContexts. It's likely this change did cause it.
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15049 **[Test build #83184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83184/testReport)** for PR 15049 at commit [`86c863a`](https://github.com/apache/spark/commit/86c863aa3f0de6aed9d77a50660003a64c36f98f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15049 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83184/ Test PASSed.
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15049 Merged build finished. Test PASSed.
[GitHub] spark pull request #19567: [SPARK-22291][SQL] Conversion error when transfor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19567#discussion_r147578399 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -456,8 +456,10 @@ object JdbcUtils extends Logging { case StringType => (array: Object) => -array.asInstanceOf[Array[java.lang.String]] - .map(UTF8String.fromString) +// some underlying types are not String such as uuid, inet, cidr, etc. +array.asInstanceOf[Array[java.lang.Object]] + .map(obj => UTF8String.fromString( +if (obj == null) null else obj.toString)) --- End diff -- nit: `if (obj == null) null else UTF8String.fromString(obj.toString)`
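The point of the nit is to keep `null` elements as `null` instead of passing them into the conversion. A minimal Java sketch of the same pattern (the sample values here are hypothetical, not from the PR):

```java
import java.util.Arrays;
import java.util.UUID;

public class NullSafeStringColumn {
    // Convert a mixed-type column to String, preserving nulls: the null
    // check happens before the conversion, so toString() is never invoked
    // on a null element.
    static String[] toStringColumn(Object[] values) {
        String[] converted = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            converted[i] = values[i] == null ? null : values[i].toString();
        }
        return converted;
    }

    public static void main(String[] args) {
        // A JDBC driver may hand back UUID, InetAddress, etc. rather than
        // String, and elements may be null.
        Object[] values = {
            UUID.fromString("00000000-0000-0000-0000-000000000001"), null, 42
        };
        System.out.println(Arrays.toString(toStringColumn(values)));
    }
}
```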
[GitHub] spark pull request #19567: [SPARK-22291][SQL] Conversion error when transfor...
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/19567#discussion_r147579149 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -456,8 +456,10 @@ object JdbcUtils extends Logging { case StringType => (array: Object) => -array.asInstanceOf[Array[java.lang.String]] - .map(UTF8String.fromString) +// some underlying types are not String such as uuid, inet, cidr, etc. +array.asInstanceOf[Array[java.lang.Object]] + .map(obj => UTF8String.fromString( +if (obj == null) null else obj.toString)) --- End diff -- Thanks @cloud-fan !
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Merged build finished. Test FAILed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17899 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83191/ Test FAILed.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83191/testReport)** for PR 17899 at commit [`82d7390`](https://github.com/apache/spark/commit/82d7390e284b9324c75f4a0755f91fb3127f4d7f). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147579950 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -591,18 +591,40 @@ case class MapObjects private( case _ => inputData.dataType } -val (getLength, getLoopVar) = inputDataType match { +val (getLength, prepareLoop, getLoopVar) = inputDataType match { case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) => --- End diff -- you can use `seqObj.isInstanceOf[Seq[_]]` on an object, but not on a `Class`
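The distinction cloud-fan draws, checking an object versus checking a `Class` token, can be sketched in plain Java (the `List` example stands in for Scala's `Seq`; names here are illustrative, not from the PR):

```java
import java.util.Arrays;
import java.util.List;

public class TypeChecks {
    public static void main(String[] args) {
        Object obj = Arrays.asList(1, 2, 3);

        // With the object in hand, an instanceof test works
        // (Scala's obj.isInstanceOf[Seq[_]] is the analog).
        boolean viaInstance = obj instanceof List;

        // With only a Class token, as in MapObjects' ObjectType(cls)
        // pattern, isAssignableFrom is the equivalent check.
        Class<?> cls = obj.getClass();
        boolean viaClass = List.class.isAssignableFrom(cls);

        System.out.println(viaInstance + " " + viaClass);
    }
}
```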
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18251 **[Test build #83187 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83187/testReport)** for PR 18251 at commit [`f58173c`](https://github.com/apache/spark/commit/f58173c2497d5996633e5faff9d4a6bfbb70cbe1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18251 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83180/ Test FAILed.
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18251 Merged build finished. Test FAILed.
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18251 **[Test build #83180 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83180/testReport)** for PR 18251 at commit [`d4e3021`](https://github.com/apache/spark/commit/d4e30212552900a45913c7f1866dbfbe49d57317). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public class TooLargePageException extends RuntimeException `
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19602 Merged build finished. Test FAILed.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19602 **[Test build #83181 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83181/testReport)** for PR 19602 at commit [`26c7a7b`](https://github.com/apache/spark/commit/26c7a7be273265d8b25ed3fdf4d6fd3b55f225d8). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19602 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83181/ Test FAILed.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573035 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/OffHeapMemoryBlock.java --- @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import org.apache.spark.unsafe.Platform; + +public class OffHeapMemoryBlock extends MemoryBlock { + private Object directBuffer; --- End diff -- `directBuffer` is used to control the lifetime of `DirectByteBuffer`. Since I decided not to use `DirectByteBuffer` in this PR, I will remove this field `directBuffer`.
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15049 **[Test build #83184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83184/testReport)** for PR 15049 at commit [`86c863a`](https://github.com/apache/spark/commit/86c863aa3f0de6aed9d77a50660003a64c36f98f).
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83183 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83183/testReport)** for PR 19567 at commit [`f3d45d0`](https://github.com/apache/spark/commit/f3d45d01b2e837d04143ac2a038aa616e5ecbfa3).
[GitHub] spark pull request #15049: [SPARK-17310][SQL] Add an option to disable recor...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15049#discussion_r147573054 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -555,6 +572,32 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContex } } } + + test("Filters should be pushed down for Parquet readers at row group level") { --- End diff -- @jiangxb1987, here I added a test to make sure disabling 'spark.sql.parquet.recordFilter' still enables row group level filtering.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573040 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java --- @@ -73,6 +73,12 @@ public MemoryBlock allocate(long size) throws OutOfMemoryError { @Override public void free(MemoryBlock memory) { +assert(memory instanceof ByteArrayMemoryBlock || memory instanceof IntArrayMemoryBlock || --- End diff -- Good catch, you are right.
[GitHub] spark pull request #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit pr...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19563#discussion_r147573635 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala --- @@ -389,9 +389,10 @@ abstract class HashExpression[E] extends Expression { input: String, result: String, fields: Array[StructField]): String = { -fields.zipWithIndex.map { case (field, index) => nullSafeElementHash(input, index.toString, field.nullable, field.dataType, result, ctx) -}.mkString("\n") +val hashes = fields.zipWithIndex.map { case (field, index) => nullSafeElementHash(input, index.toString, field.nullable, field.dataType, result, ctx) +} +ctx.splitExpressions(hashes, "apply", ("InternalRow", input) :: Nil) --- End diff -- Good catch, done
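The fix quoted above routes the per-field hash snippets through `ctx.splitExpressions` instead of concatenating them into one method body, keeping each generated method under the JVM's 64KB bytecode-per-method limit. A minimal standalone sketch of the splitting idea (hypothetical names, not Spark's actual `CodegenContext`):

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Group generated statements into helper methods of bounded size and
    // return the code that invokes them in order, followed by the helpers.
    public static String split(List<String> statements, String prefix, int perMethod) {
        List<String> methods = new ArrayList<>();
        StringBuilder calls = new StringBuilder();
        for (int i = 0; i < statements.size(); i += perMethod) {
            String name = prefix + (i / perMethod);
            List<String> chunk =
                statements.subList(i, Math.min(i + perMethod, statements.size()));
            methods.add("private void " + name + "(InternalRow row) {\n  "
                + String.join("\n  ", chunk) + "\n}");
            calls.append(name).append("(row);\n");
        }
        return calls + "\n" + String.join("\n\n", methods);
    }

    public static void main(String[] args) {
        List<String> stmts = List.of(
            "result = hash(row.get(0), result);",
            "result = hash(row.get(1), result);",
            "result = hash(row.get(2), result);");
        System.out.println(split(stmts, "apply_", 2));
    }
}
```

Each helper stays small, so the compiler can handle structs with hundreds of fields that would otherwise blow past the per-method limit.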
[GitHub] spark pull request #19563: [SPARK-22284][SQL] Fix 64KB JVM bytecode limit pr...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19563#discussion_r147573624 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/HashExpressionsSuite.scala --- @@ -639,6 +639,63 @@ class HashExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { assert(hiveHashPlan(wideRow).getInt(0) == hiveHashEval) } + test("SPARK-22284: Compute hash for nested structs") { +val M = 80 +val N = 10 +val L = M * N +val O = 50 +val seed = 42 + +val wideRow1 = new GenericInternalRow(Seq.tabulate(O)(j => +new GenericInternalRow(Seq.tabulate(L)(i => + new GenericInternalRow(Array[Any]( +UTF8String.fromString((j * L + i).toString + .toArray[Any])).toArray[Any]) +var inner1 = new StructType() --- End diff -- I see, done
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19602 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83182/ Test PASSed.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19602 **[Test build #83182 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83182/testReport)** for PR 19602 at commit [`a3a978e`](https://github.com/apache/spark/commit/a3a978eb612cdc5944c05951e5a6eb9457ae6962). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19602 Merged build finished. Test PASSed.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19601 My feeling is that we should change the cache format of the array type to make it compatible with `ColumnVector`; then we don't need a conversion from cached data to a columnar batch.
[GitHub] spark pull request #19601: [SPARK-22383][SQL] Generate code to directly get ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19601#discussion_r147578474 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java --- @@ -367,9 +551,13 @@ public Object get(int ordinal, DataType dataType) { /** * Returns the array at rowid. */ - public final ColumnVector.Array getArray(int rowId) { -resultArray.length = getArrayLength(rowId); -resultArray.offset = getArrayOffset(rowId); + public final ArrayData getArray(int rowId) { --- End diff -- We should not change the return type. `ColumnVector` will be public eventually, and `ArrayData` is not a public type.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83189/testReport)** for PR 17899 at commit [`54d88fa`](https://github.com/apache/spark/commit/54d88fabf2076dbe17f22a72526ad7879a4c8eb0).
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19567 **[Test build #83190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83190/testReport)** for PR 19567 at commit [`588902d`](https://github.com/apache/spark/commit/588902d21fb12bf80169edc74097d7bda950668c).
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19601 I agree with you that we need to improve the write path. As you suggested before, it will be addressed in a follow-up PR, after improving the frequently-executed read path. For improving the read path, which approach is better: to add a new `ColumnVector.Array` or to add a new `WritableColumnVector`?
[GitHub] spark pull request #19567: [SPARK-22291][SQL] Conversion error when transfor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19567#discussion_r147581064 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala --- @@ -134,11 +149,28 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite { assert(schema(1).dataType == ShortType) } - test("SPARK-20557: column type TIMESTAMP with TIME ZONE and TIME with TIME ZONE should be recognized") { + test("SPARK-20557: column type TIMESTAMP with TIME ZONE and TIME with TIME ZONE " + +"should be recognized") { val dfRead = sqlContext.read.jdbc(jdbcUrl, "ts_with_timezone", new Properties) val rows = dfRead.collect() val types = rows(0).toSeq.map(x => x.getClass.toString) assert(types(1).equals("class java.sql.Timestamp")) assert(types(2).equals("class java.sql.Timestamp")) } + + test("SPARK-22291: PostgreSQL UUID[] to StringType: Conversion Error") { --- End diff -- Let's change this test title too.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147581519 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -591,18 +591,40 @@ case class MapObjects private( case _ => inputData.dataType } -val (getLength, getLoopVar) = inputDataType match { +val (getLength, prepareLoop, getLoopVar) = inputDataType match { case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) => -s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)" +val it = ctx.freshName("it") +( + s"${genInputData.value}.size()", + s"scala.collection.Iterator $it = ${genInputData.value}.toIterator();", --- End diff -- I think it is better to leave a comment for code readers.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147581528 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -591,18 +591,40 @@ case class MapObjects private( case _ => inputData.dataType } -val (getLength, getLoopVar) = inputDataType match { +val (getLength, prepareLoop, getLoopVar) = inputDataType match { case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) => --- End diff -- Oops, right, not reading this clearly.
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147581548 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -591,18 +591,40 @@ case class MapObjects private( case _ => inputData.dataType } -val (getLength, getLoopVar) = inputDataType match { +val (getLength, prepareLoop, getLoopVar) = inputDataType match { case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) => -s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)" +val it = ctx.freshName("it") --- End diff -- Agreed. Currently the branches are a bit complicated.
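The change discussed in this thread stops the generated loop from indexing the input with `seq.apply(i)`, because indexed access on a linked sequence is O(n) per element and makes the whole traversal quadratic; an iterator keeps it linear. A minimal sketch of the difference using plain Java collections (an illustration only, not Spark's generated code):

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class IterateDontIndex {
    // O(n^2) on a LinkedList: each get(i) walks the list from the head.
    public static long sumByIndex(List<Integer> xs) {
        long sum = 0;
        for (int i = 0; i < xs.size(); i++) sum += xs.get(i);
        return sum;
    }

    // O(n): the iterator advances one node per element.
    public static long sumByIterator(List<Integer> xs) {
        long sum = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) sum += it.next();
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> xs = new LinkedList<>(List.of(1, 2, 3, 4));
        System.out.println(sumByIndex(xs));    // 10
        System.out.println(sumByIterator(xs)); // 10
    }
}
```

Both variants compute the same result; only the access pattern differs, which is exactly why the generated code grows a `prepareLoop` slot to set up the iterator before the loop.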
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83188/testReport)** for PR 19603 at commit [`6cb4fca`](https://github.com/apache/spark/commit/6cb4fca89e83172407114037d3a447ae6d941f0a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83188/ Test PASSed.
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19603 Merged build finished. Test PASSed.
[GitHub] spark issue #19597: [SPARK-22375][TEST] Test script can fail if eggs are ins...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19597 @xynny, mind if I ask your JIRA id?
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user jmchung commented on the issue: https://github.com/apache/spark/pull/19567 gentle ping @cloud-fan and @viirya, there is some feedback about the behavior of converting obj to string. ``` scala case StringType => (array: Object) => array.asInstanceOf[Array[java.lang.Object]] .map(obj => UTF8String.fromString(obj.toString)) ``` As @HyukjinKwon mentioned before, it may be unsafe to convert the object to a string directly; we'll get a `java.lang.NullPointerException` in the test cases. `StringType` differs from the other types, which use the `nullSafeConvert` method to deal with null; here null is handled within `UTF8String.fromString`. `UTF8String.fromString` returns either null or a UTF8String, so we cannot use `String.valueOf(obj)` to avoid the NPE; it would raise another error (because `null != "null"`): ``` - Type mapping for various types *** FAILED *** Array("a", "null", "b") did not equal List("a", null, "b") (PostgresIntegrationSuite.scala:120) ``` IMHO, if we don't want to separate the match cases, we may keep returning null to meet the original design, e.g., ``` scala case StringType => (array: Object) => array.asInstanceOf[Array[java.lang.Object]] .map(obj => UTF8String.fromString( if (obj == null) null else obj.toString)) ``` An alternative is to keep separate String and Object cases. Would you mind giving some comments? I can keep working on this :)
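The `null != "null"` point above boils down to `String.valueOf(Object)` returning the four-character string "null" for a real null reference, which would silently replace SQL NULLs with a non-null string. A minimal sketch of the null-preserving branch (the helper name is hypothetical, not the actual PR code):

```java
public class NullSafeToString {
    // Mirrors the null-preserving branch discussed above: pass null through
    // unchanged instead of letting it become the string "null".
    public static String toStringOrNull(Object obj) {
        return obj == null ? null : obj.toString();
    }

    public static void main(String[] args) {
        // String.valueOf(Object) turns a real null into the string "null":
        System.out.println(String.valueOf((Object) null)); // prints "null"
        System.out.println(toStringOrNull(null) == null);  // prints "true"
        System.out.println(toStringOrNull(42));            // prints "42"
    }
}
```

With the null preserved, `UTF8String.fromString(null)` can return null and the round-trip test sees `List("a", null, "b")` as expected.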
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147573486 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java --- @@ -17,47 +17,168 @@ package org.apache.spark.unsafe.memory; -import javax.annotation.Nullable; - import org.apache.spark.unsafe.Platform; +import javax.annotation.Nullable; + /** - * A consecutive block of memory, starting at a {@link MemoryLocation} with a fixed size. + * A declaration of interfaces of MemoryBlock classes . */ -public class MemoryBlock extends MemoryLocation { +public abstract class MemoryBlock { + @Nullable + protected final Object obj; - private final long length; + protected final long offset; + + protected final long length; /** * Optional page number; used when this MemoryBlock represents a page allocated by a - * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager, - * which lives in a different package. + * TaskMemoryManager. This field can be updated using setPageNumber method so that + * this can be modified by the TaskMemoryManager, which lives in a different package. */ - public int pageNumber = -1; + private int pageNumber = -1; public MemoryBlock(@Nullable Object obj, long offset, long length) { -super(obj, offset); +this.obj = obj; +this.offset = offset; this.length = length; } + public MemoryBlock() { +this(null, 0, 0); + } + + public final Object getBaseObject() { +return obj; + } + + public final long getBaseOffset() { +return offset; + } + /** * Returns the size of the memory block. */ - public long size() { + public final long size() { return length; } - /** - * Creates a memory block pointing to the memory used by the long array. 
- */ - public static MemoryBlock fromLongArray(final long[] array) { -return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L); + public final void setPageNumber(int pageNum) { +pageNumber = pageNum; + } + + public final int getPageNumber() { +return pageNumber; } /** * Fills the memory block with the specified byte value. */ - public void fill(byte value) { + public final void fill(byte value) { Platform.setMemory(obj, offset, length, value); } + + /** + * Instantiate the same type of MemoryBlock with new offset and size + */ + public abstract MemoryBlock allocate(long offset, long size); + + + public abstract int getInt(long offset); + + public abstract void putInt(long offset, int value); + + public abstract boolean getBoolean(long offset); + + public abstract void putBoolean(long offset, boolean value); + + public abstract byte getByte(long offset); + + public abstract void putByte(long offset, byte value); + + public abstract short getShort(long offset); + + public abstract void putShort(long offset, short value); + + public abstract long getLong(long offset); + + public abstract void putLong(long offset, long value); + + public abstract float getFloat(long offset); + + public abstract void putFloat(long offset, float value); + + public abstract double getDouble(long offset); + + public abstract void putDouble(long offset, double value); + + public abstract Object getObjectVolatile(long offset); + + public abstract void putObjectVolatile(long offset, Object value); + + public static final void copyMemory( --- End diff -- `copyMemory` has two kinds of arguments: `src` and `dst`. If we make them non-static methods, it is not easy to understand whether `this` is `src` or `dst`. If we keep the method name `MemoryBlock.copy`, I think that it is good to keep the current form (it is common in `memcpy`). Or, can we prepare two methods `MemoryBlock.copyFrom(srcOffset, dst...)` and `MemoryBlock.copyTo(src..., dstOffset)`? @cloud-fan What do you think?
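The `copyFrom`/`copyTo` naming proposed in the comment above can be sketched as follows; the class and signatures here are hypothetical, meant only to show how paired instance methods make the copy direction explicit at the call site, where a static memcpy-style helper leaves "which side is `this`?" ambiguous:

```java
public class MemoryBlockSketch {
    private final byte[] data;

    public MemoryBlockSketch(int size) { this.data = new byte[size]; }

    // `this` is the destination of the copy.
    public void copyFrom(MemoryBlockSketch src, int srcOffset, int dstOffset, int length) {
        System.arraycopy(src.data, srcOffset, this.data, dstOffset, length);
    }

    // `this` is the source of the copy.
    public void copyTo(MemoryBlockSketch dst, int srcOffset, int dstOffset, int length) {
        System.arraycopy(this.data, srcOffset, dst.data, dstOffset, length);
    }

    public byte get(int i) { return data[i]; }
    public void put(int i, byte v) { data[i] = v; }

    public static void main(String[] args) {
        MemoryBlockSketch a = new MemoryBlockSketch(4);
        MemoryBlockSketch b = new MemoryBlockSketch(4);
        a.put(0, (byte) 7);
        a.copyTo(b, 0, 0, 4); // direction is readable: a -> b
        System.out.println(b.get(0)); // 7
    }
}
```

Either method alone would suffice; offering both lets the call site pick whichever reads naturally.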
[GitHub] spark issue #19600: Added more information to Imputer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19600 **[Test build #3963 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3963/testReport)** for PR 19600 at commit [`2350fc1`](https://github.com/apache/spark/commit/2350fc1de3eaa0a3aafd7aba11c5d99b39d4c606). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19553: [SPARK-22330][CORE] Linear containsKey operation ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19553#discussion_r147576141 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala --- @@ -43,10 +43,15 @@ private[spark] object JavaUtils { override def size: Int = underlying.size -override def get(key: AnyRef): B = try { - underlying.getOrElse(key.asInstanceOf[A], null.asInstanceOf[B]) -} catch { - case ex: ClassCastException => null.asInstanceOf[B] +// Delegate to implementation because AbstractMap implementation iterates over whole key set +override def containsKey(key: AnyRef): Boolean = key match { + case key: A => underlying.contains(key) --- End diff -- +1
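For context on the `containsKey` change quoted above: `java.util.AbstractMap` implements `containsKey` by scanning `entrySet()`, which is O(n), so a wrapper that only overrides `entrySet` inherits linear lookups. Delegating `containsKey` to the backing map restores O(1). A simplified stand-in for the Scala wrapper (not Spark's actual `JavaUtils` code):

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DelegatingMap<K, V> extends AbstractMap<K, V> {
    private final Map<K, V> underlying;

    public DelegatingMap(Map<K, V> underlying) { this.underlying = underlying; }

    @Override
    public Set<Map.Entry<K, V>> entrySet() { return underlying.entrySet(); }

    // Without this override, AbstractMap.containsKey would iterate the
    // whole entry set on every call.
    @Override
    public boolean containsKey(Object key) { return underlying.containsKey(key); }

    public static void main(String[] args) {
        Map<String, Integer> backing = new HashMap<>();
        backing.put("a", 1);
        DelegatingMap<String, Integer> m = new DelegatingMap<>(backing);
        System.out.println(m.containsKey("a")); // true
        System.out.println(m.containsKey("b")); // false
    }
}
```

The same reasoning applies to `get`: any operation AbstractMap implements by iteration is worth delegating when the backing map offers constant-time access.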
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18251 **[Test build #83187 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83187/testReport)** for PR 18251 at commit [`f58173c`](https://github.com/apache/spark/commit/f58173c2497d5996633e5faff9d4a6bfbb70cbe1).
[GitHub] spark issue #19603: [SPARK-22385][SQL] MapObjects should not access list ele...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19603 **[Test build #83188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83188/testReport)** for PR 19603 at commit [`6cb4fca`](https://github.com/apache/spark/commit/6cb4fca89e83172407114037d3a447ae6d941f0a).
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15049 Merged build finished. Test PASSed.
[GitHub] spark pull request #19601: [SPARK-22383][SQL] Generate code to directly get ...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19601#discussion_r147578600 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java --- @@ -367,9 +551,13 @@ public Object get(int ordinal, DataType dataType) { /** * Returns the array at rowid. */ - public final ColumnVector.Array getArray(int rowId) { -resultArray.length = getArrayLength(rowId); -resultArray.offset = getArrayOffset(rowId); + public final ArrayData getArray(int rowId) { --- End diff -- I see. One question: `ColumnVector.Array` has some public fields such as `length`. I think it would be good to use an accessor `length` or `getLength`. What do you think?
[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15049 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83186/ Test PASSed.
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19601 So for primitive types, we encode and compress them into binary. When reading cached data, they are decoded into a primitive array and can be put in `OnHeapColumnVector` directly. For a primitive-type array, we treat it as binary, so when decoding it we get a byte[] and need more effort to convert it to a primitive type and put it in `OnHeapColumnVector`. Can we change how we encode the array type, as Arrow does?
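An Arrow-style encoding, as suggested above, stores an array column as an offsets buffer plus one flat values buffer, so a columnar reader can hand out each row's element range without decoding an opaque byte[]. A simplified sketch of the layout (an assumption for illustration, not Spark's actual cache format or Arrow's full specification):

```java
import java.util.Arrays;

public class ArrowStyleArrayColumn {
    // offsets[i]..offsets[i+1] delimit row i's elements in `values`.
    public final int[] offsets;
    // All rows' elements, concatenated into one flat buffer.
    public final int[] values;

    public ArrowStyleArrayColumn(int[][] rows) {
        offsets = new int[rows.length + 1];
        for (int i = 0; i < rows.length; i++) {
            offsets[i + 1] = offsets[i] + rows[i].length;
        }
        values = new int[offsets[rows.length]];
        int pos = 0;
        for (int[] row : rows) {
            System.arraycopy(row, 0, values, pos, row.length);
            pos += row.length;
        }
    }

    // Element access is just an offset lookup; no per-row decoding needed.
    public int[] getArray(int rowId) {
        return Arrays.copyOfRange(values, offsets[rowId], offsets[rowId + 1]);
    }

    public static void main(String[] args) {
        ArrowStyleArrayColumn col =
            new ArrowStyleArrayColumn(new int[][] {{1, 2}, {}, {3}});
        System.out.println(Arrays.toString(col.getArray(0))); // [1, 2]
        System.out.println(Arrays.toString(col.getArray(2))); // [3]
    }
}
```

Because the values buffer is already a primitive array, it could be handed to an on-heap column vector directly, which is the point of the suggestion.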
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147580091 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -591,18 +591,40 @@ case class MapObjects private( case _ => inputData.dataType } -val (getLength, getLoopVar) = inputDataType match { +val (getLength, prepareLoop, getLoopVar) = inputDataType match { case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) => -s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)" +val it = ctx.freshName("it") +( + s"${genInputData.value}.size()", + s"scala.collection.Iterator $it = ${genInputData.value}.toIterator();", + s"$it.next()" +) case ObjectType(cls) if cls.isArray => -s"${genInputData.value}.length" -> s"${genInputData.value}[$loopIndex]" +( + s"${genInputData.value}.length", + "", + s"${genInputData.value}[$loopIndex]" +) case ObjectType(cls) if classOf[java.util.List[_]].isAssignableFrom(cls) => --- End diff -- The legal external types for a catalyst array are Scala `Seq` and Java `List`, so we don't need to match a more general type here.
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83193/testReport)** for PR 17899 at commit [`f840c69`](https://github.com/apache/spark/commit/f840c69231025b53d057ec7e1032750abfa856b8).
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19567 LGTM too
[GitHub] spark pull request #19567: [SPARK-22291][SQL] Conversion error when transfor...
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/19567#discussion_r147581349 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala --- @@ -134,11 +149,28 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite { assert(schema(1).dataType == ShortType) } - test("SPARK-20557: column type TIMESTAMP with TIME ZONE and TIME with TIME ZONE should be recognized") { + test("SPARK-20557: column type TIMESTAMP with TIME ZONE and TIME with TIME ZONE " + +"should be recognized") { val dfRead = sqlContext.read.jdbc(jdbcUrl, "ts_with_timezone", new Properties) val rows = dfRead.collect() val types = rows(0).toSeq.map(x => x.getClass.toString) assert(types(1).equals("class java.sql.Timestamp")) assert(types(2).equals("class java.sql.Timestamp")) } + + test("SPARK-22291: PostgreSQL UUID[] to StringType: Conversion Error") { --- End diff -- Oops, I forgot it, thanks!!
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83187/ Test PASSed.
[GitHub] spark issue #18251: [SPARK-17788][SPARK-21033][SQL] fix the potential OOM in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18251 Merged build finished. Test PASSed.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602

@gatorsmile Thanks a lot for your help :)

> Can we just evaluate the right side CAST(2017 as STRING), since it is foldable?

Do you mean to add a new rule that casts the literal when the partition key and the literal have different types? In the current code, `StringType` is promoted (e.g. cast to `IntegerType`) in `PromoteStrings` when the expression is a BinaryComparison (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L366).
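The trade-off being discussed can be sketched in plain Java, outside of Catalyst. This is a hypothetical illustration (the class name and the hard-coded partition values are made up, not Spark code): promoting the string-typed partition key forces a per-partition cast, while folding the constant literal to a string keeps the comparison on the partition-name side.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PartitionPruningSketch {
    public static void main(String[] args) {
        // Hypothetical partition values, stored as strings in the metastore.
        List<String> partitions = Arrays.asList("2016", "2017", "2018");

        // PromoteStrings-style behavior: promote the string key to int,
        // which evaluates a cast on every partition value.
        List<String> byPromotion = partitions.stream()
                .filter(p -> Integer.parseInt(p) == 2017)
                .collect(Collectors.toList());

        // Folding the literal instead: CAST(2017 AS STRING) is constant,
        // so the comparison stays on the string side and can match
        // partition names directly, without touching each value's type.
        String foldedLiteral = String.valueOf(2017);
        List<String> byFoldedLiteral = partitions.stream()
                .filter(p -> p.equals(foldedLiteral))
                .collect(Collectors.toList());

        // Both strategies select the same partitions.
        if (!byPromotion.equals(byFoldedLiteral)) {
            throw new AssertionError("pruning results differ");
        }
        System.out.println(byFoldedLiteral); // prints [2017]
    }
}
```

The second form is what makes metastore-side pruning possible: the folded literal is an ordinary string that can be pushed down as a partition-name predicate.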
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19602 **[Test build #83182 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83182/testReport)** for PR 19602 at commit [`a3a978e`](https://github.com/apache/spark/commit/a3a978eb612cdc5944c05951e5a6eb9457ae6962).
[GitHub] spark issue #19597: [SPARK-22375][TEST] Test script can fail if eggs are ins...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19597 Merged to master.
[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19602 **[Test build #83181 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83181/testReport)** for PR 19602 at commit [`26c7a7b`](https://github.com/apache/spark/commit/26c7a7be273265d8b25ed3fdf4d6fd3b55f225d8).
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r147578071

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java ---
@@ -19,28 +19,48 @@
 import org.apache.spark.unsafe.Platform;

+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.nio.ByteBuffer;
+
+import sun.nio.ch.DirectBuffer;
+
 /**
  * A simple {@link MemoryAllocator} that uses {@code Unsafe} to allocate off-heap memory.
  */
 public class UnsafeMemoryAllocator implements MemoryAllocator {

   @Override
-  public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    long address = Platform.allocateMemory(size);
-    MemoryBlock memory = new MemoryBlock(null, address, size);
-    if (MemoryAllocator.MEMORY_DEBUG_FILL_ENABLED) {
-      memory.fill(MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
+  public OffHeapMemoryBlock allocate(long size) throws OutOfMemoryError {
+    Object buffer = ByteBuffer.allocateDirect((int)size);
--- End diff --

This makes it easy to manage the life cycle of the native memory without calling `free`. This code was derived from #11494.
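The distinction between the two allocation styles can be shown with standard JDK APIs alone. This is a minimal sketch (the class name is made up), demonstrating only the `ByteBuffer.allocateDirect` side: the returned buffer is a regular Java object whose off-heap region is released by a GC-run cleaner, so no explicit `free` is needed.

```java
import java.nio.ByteBuffer;

public class DirectBufferSketch {
    public static void main(String[] args) {
        // ByteBuffer.allocateDirect returns a Java object wrapping a
        // native (off-heap) memory region. When the object becomes
        // unreachable, the GC triggers a cleaner that frees the region.
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        if (!direct.isDirect() || direct.capacity() != 16) {
            throw new AssertionError("unexpected buffer state");
        }

        // The region is usable like any buffer: absolute put/get by offset.
        direct.putLong(0, 42L);
        if (direct.getLong(0) != 42L) {
            throw new AssertionError("readback failed");
        }

        // By contrast, Unsafe.allocateMemory (the old code path) returns
        // a raw address that leaks unless freeMemory is called explicitly.
    }
}
```

One caveat worth noting with this approach: `allocateDirect` takes an `int`, so the `(int) size` cast in the diff silently truncates allocations above 2 GB.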
[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19601 Can we use `OffHeapColumnVector` for cached data?
[GitHub] spark pull request #19603: [SPARK-22385][SQL] MapObjects should not access l...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19603#discussion_r147580056

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -591,18 +591,40 @@ case class MapObjects private(
       case _ => inputData.dataType
     }

-    val (getLength, getLoopVar) = inputDataType match {
+    val (getLength, prepareLoop, getLoopVar) = inputDataType match {
       case ObjectType(cls) if classOf[Seq[_]].isAssignableFrom(cls) =>
-        s"${genInputData.value}.size()" -> s"${genInputData.value}.apply($loopIndex)"
+        val it = ctx.freshName("it")
--- End diff --

That's not a big deal: we are traversing the collection from start to end, so using an Iterator or accessing by index won't make much difference. It may not be worth creating two branches here.
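The two traversal styles under discussion can be sketched with a JDK linked list standing in for a Scala `Seq` (a hypothetical illustration, not the generated Catalyst code): both produce the same result for a start-to-end scan, which is the reviewer's point, though indexed access degrades to O(n) per element on linked structures, which motivated the iterator path in the diff.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class TraversalSketch {
    public static void main(String[] args) {
        List<Integer> data = new LinkedList<>(Arrays.asList(1, 2, 3, 4));

        // Indexed access: get(i) walks from the head on a linked list,
        // so a full scan is O(n^2) in the worst case.
        long sumByIndex = 0;
        for (int i = 0; i < data.size(); i++) {
            sumByIndex += data.get(i);
        }

        // Iterator: O(1) per step for any sequential collection.
        long sumByIterator = 0;
        Iterator<Integer> it = data.iterator();
        while (it.hasNext()) {
            sumByIterator += it.next();
        }

        // Both traversals visit the same elements in the same order.
        if (sumByIndex != sumByIterator) {
            throw new AssertionError("traversals disagree");
        }
        System.out.println(sumByIterator); // prints 10
    }
}
```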
[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to transpose adj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17899 **[Test build #83194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83194/testReport)** for PR 17899 at commit [`e9f6928`](https://github.com/apache/spark/commit/e9f6928bb60e7c4de25324e5572f105a30d16cd5).
[GitHub] spark issue #19567: [SPARK-22291][SQL] Conversion error when transforming ar...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19567 LGTM