Github user Dooyoung-Hwang commented on the issue:

    https://github.com/apache/spark/pull/22347

@kiszk It is impossible to count decoded rows without modifying SparkPlan, because there is no way to observe how many rows the iterator has produced. Instead, I can simulate this patch in a Scala worksheet with the code below.

```scala
import scala.collection.mutable.ArrayBuffer

var decodeCount = 0

def decoding(buf: Array[Int]): Iterator[String] = {
  new Iterator[String] {
    var remain = buf.sum
    var index = 0
    override def hasNext: Boolean = remain > 0
    override def next(): String = {
      while (buf(index) == 0) index += 1
      buf(index) -= 1
      remain -= 1
      decodeCount += 1 // increase decodeCount
      f"[decode Result:$remain]"
    }
  }
}

// reset decodeCount
decodeCount = 0

// Before Patch : decode without scala view
val buf = new ArrayBuffer[String]
val inputIter = Array(Array(2, 2, 2), Array(2), Array(2)).iterator
while (inputIter.hasNext) buf ++= Array(inputIter.next()).flatMap(decoding)
val result1 = buf.take(3).toArray

// ensure decode count is 10
assert(decodeCount == 10)

// reset decodeCount
decodeCount = 0

// After Patch : decode with scala view
val result2 = ArrayBuffer(Array(2, 2, 2), Array(2), Array(2)).toArray.view
  .flatMap(decoding).take(3).force

// ensure decode count is 3
assert(decodeCount == 3)

// assert same elements
assert(result1 sameElements result2)
```
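As a side note (not part of the patch), the laziness that makes the `decodeCount == 3` case work can be shown with a minimal sketch: a strict `map` evaluates every element before `take(n)` is applied, while a `view` evaluates only the elements that `take(n)` demands. `.toList` is used here to force the view, which behaves the same in Scala 2.12 and 2.13 (the patch's `.force` is the 2.12 equivalent).

```scala
// Counter incremented by the mapped function, so we can observe
// how many elements were actually evaluated.
var evaluated = 0

// Strict: map runs over all 5 elements before take(2) is applied.
val strict = Seq(1, 2, 3, 4, 5).map { x => evaluated += 1; x * 2 }.take(2)
assert(evaluated == 5)

evaluated = 0

// Lazy: the view evaluates only the 2 elements that take(2) demands.
val viewed = Seq(1, 2, 3, 4, 5).view.map { x => evaluated += 1; x * 2 }.take(2).toList
assert(evaluated == 2)

// Both paths still produce the same elements.
assert(strict == viewed)
```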