Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22237#discussion_r224044435
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/FailureSafeParser.scala ---
@@ -15,50 +15,57 @@
* limitations under the License.
*/
-package org.apache.spark.sql.execution.datasources
+package org.apache.spark.sql.catalyst.util
import org.apache.spark.SparkException
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
-import org.apache.spark.sql.catalyst.util._
-import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.types.{DataType, StructType}
import org.apache.spark.unsafe.types.UTF8String
class FailureSafeParser[IN](
--- End diff ---
Frankly speaking, I don't fully understand the idea. Let's look at an
example. We should parse JSON arrays (one array per row) like:
```
[1, 2, 3]
[4, 5]
```
and a user provided the schema `ArrayType(IntegerType, true)`. So, you
propose to wrap the array in `StructType(Seq(StructField(..., ArrayType(IntegerType,
...))))`, right? And to use the code inside `JacksonParser` that we have
disabled via `allowArrayAsStructs` for now?
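If that is the proposal, I imagine the wrapping would look roughly like the sketch below. This is only my illustration of the idea, not code from the PR; the `wrapSchema` helper and the field name `"value"` are assumptions made for the example:
```scala
import org.apache.spark.sql.types._

// Hypothetical sketch of the proposed wrapping: a non-struct user schema
// is nested as the single field of a StructType so that struct-oriented
// parsing code (e.g. JacksonParser) could handle it. The field name
// "value" is an arbitrary choice for this illustration.
def wrapSchema(schema: DataType): StructType = schema match {
  case st: StructType => st // already a struct, nothing to wrap
  case other => StructType(Seq(StructField("value", other, nullable = true)))
}

val userSchema = ArrayType(IntegerType, containsNull = true)
val wrapped = wrapSchema(userSchema)
// wrapped.simpleString == "struct<value:array<int>>"
```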