Hi All:

I am running a simple test with Spark SQL 1.0.1. The loaded data (JSON
format), registered as the table "people", is:

{"name":"Michael",
"schools":[{"name":"ABC","time":1994},{"name":"EFG","time":2000}]}
{"name":"Andy", "age":30,"scores":{"eng":98,"phy":89}}
{"name":"Justin", "age":19}

The "schools" field holds repeated values of the form
{"name":"XXX","time":X}. How should I write the SQL to select the people who
have a school named "ABC"? I tried
"SELECT name FROM people WHERE schools.name = 'ABC'", but it fails with:

[error] (run-main-0)
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved
attributes: 'name, tree:
[error] Project ['name]
[error]  Filter ('schools.name = ABC)
[error]   Subquery people
[error]    ParquetRelation people.parquet, Some(Configuration:
core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved
attributes: 'name, tree:

Project ['name]
 Filter ('schools.name = ABC)
  Subquery people
   ParquetRelation people.parquet, Some(Configuration: core-default.xml,
core-site.xml, mapred-default.xml, mapred-site.xml)

        at
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:71)
...

Could anybody show me the correct SQL for searching repeated data items in
Spark SQL? Thank you!
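
To make the expected result concrete, here is a minimal plain-Python sketch
of the filter I am after. It is not Spark SQL and uses no Spark API; it only
shows which rows the query should return for the sample data above. The
helper name people_with_school is made up for illustration.

```python
import json

# The three sample records from this message, one JSON object per line.
RAW = '''{"name":"Michael","schools":[{"name":"ABC","time":1994},{"name":"EFG","time":2000}]}
{"name":"Andy", "age":30,"scores":{"eng":98,"phy":89}}
{"name":"Justin", "age":19}'''


def people_with_school(lines, school_name):
    """Return the names of people whose "schools" array has an entry with the given name.

    Hypothetical helper: it only illustrates the intended query semantics.
    """
    matches = []
    for line in lines:
        person = json.loads(line)
        # "schools" is missing for some people, so fall back to an empty list.
        schools = person.get("schools", [])
        # A person matches if any element of the repeated field has the name.
        if any(school.get("name") == school_name for school in schools):
            matches.append(person["name"])
    return matches


result = people_with_school(RAW.splitlines(), "ABC")
print(result)  # prints ['Michael']
```

So the SQL I am looking for should return only "Michael" for this data.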





