Hi All,

How can I use the value of one column (a schema identifier) to parse another column and create a flattened dataset using Spark Streaming 2.2.0?
I have the following *source data frame* that I create from reading messages from Kafka

col1: string
col2: json string

col1       | col2
---------------------------------------------------------------------------
schemaUri1 | "{"name": "foo", "zipcode": 11111}"
schemaUri2 | "{"name": "bar", "zipcode": 11112, "id": 1234}"
schemaUri1 | "{"name": "foobar", "zipcode": 11113}"
schemaUri2 | "{"name": "barfoo", "zipcode": 11114, "id": 1235, "interest": "reading"}"

*My target data frame*

name   | zipcode | id   | interest
--------------------------------
foo    | 11111   | null | null
bar    | 11112   | 1234 | null
foobar | 11113   | null | null
barfoo | 11114   | 1235 | reading

*Assume you have the following function*

// This function returns a StructType that represents a schema for a given schemaUri
public StructType getSchema(String schemaUri)
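One possible direction, offered only as a sketch: from_json(Column, StructType) applies a single schema to a whole column, so a different schema per row cannot be passed in one call. If the set of schemaUri values is known when the query is defined (an assumption), the frame could be filtered once per schemaUri, col2 parsed with getSchema(schemaUri) into a struct column, and the per-schema results combined with union. Because union matches columns by position, every branch has to select the same columns in the same order, with typed nulls for fields its schema lacks. The plain-Java helper below works out those select expressions. The class name FlattenPlan, the struct alias "parsed", and the field types (BIGINT for zipcode and id) are hypothetical, not taken from the data above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: plans the aligned column list for each schemaUri.
// Each schema is modelled as an ordered map of field name -> Spark SQL type name.
final class FlattenPlan {

    // Union of all fields across all schemas, in first-seen order.
    static Map<String, String> unifiedColumns(Map<String, Map<String, String>> schemas) {
        Map<String, String> all = new LinkedHashMap<>();
        for (Map<String, String> fields : schemas.values()) {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                all.putIfAbsent(field.getKey(), field.getValue());
            }
        }
        return all;
    }

    // Expressions suitable for Dataset.selectExpr(...): fields present in this
    // schema are read from the assumed struct column "parsed"; missing fields
    // become typed nulls so every branch has the same columns in the same order.
    static List<String> selectExprsFor(Map<String, String> own, Map<String, String> unified) {
        List<String> exprs = new ArrayList<>();
        for (Map.Entry<String, String> column : unified.entrySet()) {
            String name = column.getKey();
            if (own.containsKey(name)) {
                exprs.add("parsed." + name + " AS " + name);
            } else {
                exprs.add("CAST(NULL AS " + column.getValue() + ") AS " + name);
            }
        }
        return exprs;
    }
}
```

For each schemaUri, the resulting list could then be used roughly as df.filter(col("col1").equalTo(schemaUri)).select(from_json(col("col2"), getSchema(schemaUri)).as("parsed")).selectExpr(exprs.toArray(new String[0])), and the branches combined with union. I have not verified this against a streaming query on 2.2.0.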