Re: [I] Feature request: infer field names in json_tuple [spark]

via GitHub Thu, 30 Apr 2026 15:33:39 -0700


dkranchii commented on issue #55632:
URL: https://github.com/apache/spark/issues/55632#issuecomment-4356609827


   Thanks for the suggestion. Spark already has a function that does both of the 
things you're asking for: `from_json` (with `schema_of_json` for inference), 
rather than `json_tuple`. 
   
   ### Use `from_json` for named, typed columns
   
   ```python
   from pyspark.sql.functions import from_json, schema_of_json, col, lit
   
   # Option 1: explicit schema; columns are named and typed
   df.select(from_json("jstring", "f1 STRING, f2 INT").alias("j")) \
     .select("j.f1", "j.f2")
   
   # Option 2: schema inferred from one sample row (assumed non-null); also named and typed
   sample = df.select("jstring").first()[0]
   df.select(from_json("jstring", schema_of_json(lit(sample))).alias("j")) \
     .select("j.*")    # expands to f1, f2, ... with inferred types
   ```
