Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21370#discussion_r194784664
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3209,6 +3222,19 @@ class Dataset[T] private[sql](
}
}
+  private[sql] def getRowsToPython(
+      _numRows: Int,
+      truncate: Int,
+      vertical: Boolean): Array[Any] = {
+    EvaluatePython.registerPicklers()
+    val numRows = _numRows.max(0).min(Int.MaxValue - 1)
+    val rows = getRows(numRows, truncate, vertical).map(_.toArray).toArray
+    val toJava: (Any) => Any = EvaluatePython.toJava(_, ArrayType(ArrayType(StringType)))
+    val iter: Iterator[Array[Byte]] = new SerDeUtil.AutoBatchedPickler(
+      rows.iterator.map(toJava))
+    PythonRDD.serveIterator(iter, "serve-GetRows")
--- End diff --
Same answer as @HyukjinKwon about the return type: the exact type we serve here is
`Array[Array[String]]`, which is what the `ArrayType(ArrayType(StringType))` passed to the `toJava` func encodes.
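For illustration only (a minimal plain-Scala sketch, not Spark's actual implementation): the payload being pickled is logically an `Array[Array[String]]`, i.e. a header row followed by rendered data rows, and `_numRows` is clamped the same way the diff does. The sample data and the `clampNumRows` helper name are hypothetical.

```scala
object GetRowsShapeSketch {
  // Hypothetical sample of what getRows produces before pickling:
  // each inner Array[String] is one rendered row; the first is the header.
  val rows: Array[Array[String]] = Array(
    Array("id", "name"), // header row
    Array("1", "alice"),
    Array("2", "bob"))

  // Clamp the requested row count as in the diff: never negative,
  // and capped below Int.MaxValue to leave headroom downstream.
  def clampNumRows(n: Int): Int = n.max(0).min(Int.MaxValue - 1)

  def main(args: Array[String]): Unit = {
    println(clampNumRows(-5)) // prints 0
    println(rows.map(_.mkString("|")).mkString("\n"))
  }
}
```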
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]