Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23120#discussion_r237046616
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/FailureSafeParser.scala ---
@@ -33,26 +33,21 @@ class FailureSafeParser[IN](
   private val corruptFieldIndex = schema.getFieldIndex(columnNameOfCorruptRecord)
   private val actualSchema = StructType(schema.filterNot(_.name == columnNameOfCorruptRecord))
   private val resultRow = new GenericInternalRow(schema.length)
-  private val nullResult = new GenericInternalRow(schema.length)
   // This function takes 2 parameters: an optional partial result, and the bad record. If the given
   // schema doesn't contain a field for corrupted record, we just return the partial result or a
   // row with all fields null. If the given schema contains a field for corrupted record, we will
   // set the bad record to this field, and set other fields according to the partial result or null.
   private val toResultRow: (Option[InternalRow], () => UTF8String) => InternalRow = {
-    if (corruptFieldIndex.isDefined) {
-      (row, badRecord) => {
-        var i = 0
-        while (i < actualSchema.length) {
-          val from = actualSchema(i)
-          resultRow(schema.fieldIndex(from.name)) = row.map(_.get(i, from.dataType)).orNull
-          i += 1
-        }
-        resultRow(corruptFieldIndex.get) = badRecord()
-        resultRow
+    (row, badRecord) => {
--- End diff ---
Without this change in `FailureSafeParser`, does JSON support returning partial results?
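For context, the behavior described in the diff's comment can be sketched in isolation. This is a simplified stand-in, not the actual Spark code: it replaces Spark's `InternalRow`/`StructType` with plain Scala collections and `String`, and the object name `ToResultRowSketch` is hypothetical.

```scala
// Hedged sketch of the toResultRow idea: given an optional partial result and
// a thunk producing the raw bad record, build a full-width row. If the schema
// has a corrupt-record column, the bad record goes there and the remaining
// fields come from the partial result (or null); otherwise the row is just
// the partial result padded with nulls.
object ToResultRowSketch {
  def toResultRow(
      schema: Seq[String],            // field names, stand-in for StructType
      corruptField: Option[String]    // name of the corrupt-record column, if any
  ): (Option[Map[String, Any]], () => String) => Seq[Any] = {
    corruptField.flatMap(f => Some(schema.indexOf(f)).filter(_ >= 0)) match {
      case Some(corruptIdx) =>
        // Corrupt-record column present: fill it with the bad record and
        // take every other field from the partial result, or null.
        (partial, badRecord) =>
          schema.zipWithIndex.map {
            case (_, i) if i == corruptIdx => badRecord()
            case (name, _)                 => partial.flatMap(_.get(name)).orNull
          }
      case None =>
        // No corrupt-record column: return the partial result or all nulls.
        (partial, _) =>
          schema.map(name => partial.flatMap(_.get(name)).orNull)
    }
  }

  def main(args: Array[String]): Unit = {
    val f = toResultRow(Seq("a", "_corrupt", "b"), Some("_corrupt"))
    println(f(Some(Map("a" -> 1)), () => "{bad json"))
    // List(1, {bad json, null)
  }
}
```

The point of the question above is that this "fill from the partial result" step is what lets a row keep its successfully parsed fields even when the record as a whole failed to parse.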
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]