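Have you tried the "." (dot) accessor for nested fields? Note that "element" in the printSchema() output is only a label for the array's items, not a field name, so it does not belong in the path. Also, the select should go on ds2 (the parsed data), not ds1 (the raw Kafka rows).

e.g.:
ds2.select($"name", $"addresses.city")

That gives every city for a person as an array. If you only want the city of the first address, selectExpr("name", "addresses[0].city") should do it.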
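
In case it helps, here is a minimal batch sketch of nested-field access on an array of structs. It is only an illustration under assumptions: the Address and Person case classes and the NestedCitySketch object are hypothetical stand-ins for your protobuf-generated types, and it uses a static Dataset rather than the Kafka stream.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical stand-ins for the protobuf-generated Person type in the thread.
case class Address(street: String, city: String)
case class Person(name: String, addresses: Seq[Address])

object NestedCitySketch {
  // Dot access on an array of structs yields an array of that field's values.
  def allCities(people: Dataset[Person]): DataFrame =
    people.select(col("name"), col("addresses.city"))

  // Index into the array first, then take the struct field,
  // to get only the first address's city.
  def firstCity(people: Dataset[Person]): DataFrame =
    people.select(col("name"), col("addresses").getItem(0).getField("city").as("city"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("nested field sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data, made up for illustration only.
    val people = Seq(
      Person("Ann", Seq(Address("1 Main St", "Austin"), Address("2 Oak Ave", "Boston")))
    ).toDS()

    allCities(people).show(false)
    firstCity(people).show(false)

    spark.stop()
  }
}
```

The same two select expressions should carry over to the streaming Dataset, since they are ordinary column projections.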

On Sun, Nov 20, 2016 at 9:59 AM, shyla deshpande <deshpandesh...@gmail.com>
wrote:

> The following is my dataframe schema:
>
>     root
>      |-- name: string (nullable = true)
>      |-- addresses: array (nullable = true)
>      |    |-- element: struct (containsNull = true)
>      |    |    |-- street: string (nullable = true)
>      |    |    |-- city: string (nullable = true)
>
> I want to output name and city. The following is my spark streaming app
> which outputs name and addresses, but I want name and cities in the output.
>
>     object PersonConsumer {
>       import org.apache.spark.sql.{SQLContext, SparkSession}
>       import com.example.protos.demo._
>
>       def main(args : Array[String]) {
>
>         val spark = SparkSession.builder.
>           master("local")
>           .appName("spark session example")
>           .getOrCreate()
>
>         import spark.implicits._
>
>         val ds1 = spark.readStream.format("kafka").
>           option("kafka.bootstrap.servers","localhost:9092").
>           option("subscribe","person").load()
>
>         val ds2 = ds1.map(row => row.getAs[Array[Byte]]("value"))
>           .map(Person.parseFrom(_)).select($"name", $"addresses")
>
>         ds2.printSchema()
>
>         val query = ds2.writeStream
>           .outputMode("append")
>           .format("console")
>           .start()
>
>         query.awaitTermination()
>       }
>     }
>
> Appreciate your help. Thanks.
>



-- 
Thanks,
Pandeeswaran
