dkranchii commented on issue #55632:
URL: https://github.com/apache/spark/issues/55632#issuecomment-4356609827
Thanks for the suggestion. Spark already has a function that does both of
the things you're asking for, named columns and real types: it's `from_json`
(with `schema_of_json` for inference), not `json_tuple`.
### Use `from_json` for named, typed columns
```python
from pyspark.sql.functions import from_json, schema_of_json, lit

# Option 1: explicit schema (named columns, real types)
df.select(from_json("jstring", "f1 STRING, f2 INT").alias("j")) \
    .select("j.f1", "j.f2")

# Option 2: schema inferred from a sample row (also named, also typed)
sample = df.select("jstring").first()[0]  # one JSON string, fetched to the driver
df.select(from_json("jstring", schema_of_json(lit(sample))).alias("j")) \
    .select("j.*")  # expands to f1, f2, ... with inferred types
```