viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-766530579
This is an automated message from the Apache Git Service.
To respond to the message, please log on to
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-766568619
Hmm, I'm fine if you think we should always require a custom function to
produce the output.
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-766530579
> Is it too hard a requirement to explain the actual use case, especially
you've said you have an internal customer claiming this feature? I don't think my
request requires
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-766291831
> I understand the functionality is lacking on SS. There's a workaround like
foreachBatch -> toRDD -> pipe but streaming operations can't be added after
calling pipe. So
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-765879991
> Please just create an executable which prints out stdin (serialized data)
and passes to the pipe API... I think it's the easiest way to realize.
Ok, ok. I didn't
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-765878887
> It is an issue because encoder only specifies how an object would map to
the internal physical structure of the row, and by exposing this pipe API, we
are exposing the
viirya edited a comment on pull request #31296:
URL: https://github.com/apache/spark/pull/31296#issuecomment-765877154
> Yes the question is also applied to RDD.pipe as well, but the
serialization is done via `OutputStreamWriter.println` which is relatively
"known" - `String.valueOf(T)`