Yes, but if both tagCollection and selectedVideos have a column named "id", Spark SQL cannot tell which one you are referring to in the where clause. Aliasing each side of the join removes the ambiguity. Here's an example with aliases:
val x = testData2.as('x)
val y = testData2.as('y)
val join = x.join(y, Inner, Some("x.a".attr === "y.a".attr))

On Wed, Jul 16, 2014 at 2:47 AM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:

> My query is just a simple query that use the spark sql dsl :
>
> tagCollection.join(selectedVideos).where('videoId === 'id)
>
> On Tue, Jul 15, 2014 at 6:03 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>
>> Hi Jao,
>>
>> Seems the SQL analyzer cannot resolve the references in the Join
>> condition. What is your query? Did you use the Hive Parser (your query was
>> submitted through hql(...)) or the basic SQL Parser (your query was
>> submitted through sql(...)).
>>
>> Thanks,
>>
>> Yin
>>
>> On Tue, Jul 15, 2014 at 8:52 AM, Jaonary Rabarisoa <jaon...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> When running a join operation with Spark SQL I got the following error :
>>>
>>> Exception in thread "main"
>>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Ambiguous
>>> references to id: (id#303,List()),(id#0,List()), tree:
>>> Filter ('videoId = 'id)
>>>  Join Inner, None
>>>   ParquetRelation data/tags.parquet
>>>   Filter (name#1 = P1/cam1)
>>>    ParquetRelation data/videos.parquet
>>>
>>> What does it mean ?
>>>
>>> Cheers,
>>>
>>> jao
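
Applied to the query quoted above, the alias approach might look roughly like the sketch below. It is only an illustration under several assumptions: the Parquet paths and the name filter are copied from the plan in the error message, the aliases 't and 'v and the object name AliasedJoin are made up for the example, videoId is assumed to live in the tags table, and the Spark 1.0-era SchemaRDD DSL is assumed to be available through `import sqlContext._`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.catalyst.plans.Inner

// Hypothetical driver object; the name is illustrative only.
object AliasedJoin {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("AliasedJoin").setMaster("local"))
    val sqlContext = new SQLContext(sc)
    // Assumption: this import exposes the DSL implicits ('symbol, "name".attr, ===).
    import sqlContext._

    // Paths and the name filter are taken from the plan in the error message.
    val tagCollection = parquetFile("data/tags.parquet")
    val selectedVideos =
      parquetFile("data/videos.parquet").where('name === "P1/cam1")

    // Alias each side so that both "id" columns can be told apart.
    val t = tagCollection.as('t)
    val v = selectedVideos.as('v)

    // Qualified names make the join condition unambiguous.
    // Assumption: videoId belongs to the tags table.
    val joined = t.join(v, Inner, Some("t.videoId".attr === "v.id".attr))

    joined.collect().foreach(println)
    sc.stop()
  }
}
```

Passing the condition to join, rather than chaining where after an unconditioned join, also gives the planner an explicit join condition instead of a filter over a Cartesian product (the "Join Inner, None" in the trace).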