Jatin Sharma created SPARK-39722:
------------------------------------
Summary: Make Dataset.showString() public
Key: SPARK-39722
URL: https://issues.apache.org/jira/browse/SPARK-39722
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.3.0, 2.4.8
Reporter: Jatin Sharma
Currently, the {{.show}} APIs on a Dataset print directly to stdout.
However, there are many cases where we need a String representation of the
show output instead. For example:
* We have a logging framework to which we need to push the representation of a
df
* We have to send the string over a REST call from the driver
* We want to send the string to stderr instead of stdout
For such cases, the only option today is a hack: temporarily redirect
Console.out to a ByteArrayOutputStream (or similar), call {{.show}}, and then
extract the string from the stream.
Printing only to stdout seems like a limiting choice.
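
For reference, the workaround described above can be sketched roughly as follows. This is only an illustration: {{CaptureShow}} and {{captureStdout}} are hypothetical names, not part of Spark, and the sketch assumes that {{.show}} prints through Scala's {{Console.out}}.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object CaptureShow {
  // Hypothetical helper (not a Spark API): runs `body` with Console.out
  // redirected to an in-memory buffer and returns whatever was printed.
  def captureStdout(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    val stream = new PrintStream(buffer, true, "UTF-8")
    // Console.withOut restores the previous Console.out after `body` returns.
    Console.withOut(stream) { body }
    stream.flush()
    buffer.toString("UTF-8")
  }
}

// Assumed usage, where `df` is an existing Dataset:
//   val text: String = CaptureShow.captureStdout { df.show() }
```

Note that this only captures output written through Scala's {{Console}}; anything written directly to {{System.out}} would bypass it, which is part of why the approach feels fragile.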
Solution:
We expose APIs that return the String representation. We already have the
{{.showString}} method internally.
We could mirror each of the current {{.show}} APIs with a corresponding
{{.showString}} (and rename the internal private function if required).