The join condition with && is throwing an exception ("Join condition is missing or trivial. Use the CROSS JOIN syntax"):

    val df = baseDF
      .join(mccDF,
        mccDF("medical_claim_id") <=> baseDF("medical_claim_id")
          && mccDF("medical_claim_detail_id") <=> baseDF("medical_claim_detail_id"),
        "left")
      .join(revCdDF, revCdDF("revenue_code_padded_str") <=> mccDF("mcc_code"), "left")
      .select(
        baseDF("medical_claim_id"), baseDF("medical_claim_detail_id"),
        baseDF("revenue_code"), baseDF("rev_code_distinct_count"),
        baseDF("rtos_1_1_count"), baseDF("rtos_1_0_count"),
        baseDF("er_visit_flag"), baseDF("observation_stay_flag"),
        revCdDF("rtos_2_code"), revCdDF("rtos_2_hierarchy"))
      .where(revCdDF("rtos_2_code").between(8, 27).isNotNull)
      .groupBy(
        baseDF("medical_claim_id"),
        baseDF("medical_claim_detail_id"))
      .agg(
        min(revCdDF("rtos_2_code").alias("min_rtos_2_8_thru_27")),
        min(revCdDF("rtos_2_hierarchy").alias("min_rtos_2_8_thru_27_hier")))


This query runs fine:

    val df = baseDF
      .join(mccDF, mccDF("medical_claim_id") <=> baseDF("medical_claim_id"), "left")
      .join(mccDF, mccDF("medical_claim_detail_id") <=> baseDF("medical_claim_detail_id"), "left")
      .join(revCdDF, revCdDF("revenue_code_padded_str") <=> mccDF("mcc_code"), "left")
      .select(
        baseDF("medical_claim_id"), baseDF("medical_claim_detail_id"),
        baseDF("revenue_code"), baseDF("rev_code_distinct_count"),
        baseDF("rtos_1_1_count"), baseDF("rtos_1_0_count"),
        baseDF("er_visit_flag"), baseDF("observation_stay_flag"),
        revCdDF("rtos_2_code"), revCdDF("rtos_2_hierarchy"))
      .where(revCdDF("rtos_2_code").between(8, 27).isNotNull)
      .groupBy(
        baseDF("medical_claim_id"),
        baseDF("medical_claim_detail_id"))
      .agg(
        min(revCdDF("rtos_2_code").alias("min_rtos_2_8_thru_27")),
        min(revCdDF("rtos_2_hierarchy").alias("min_rtos_2_8_thru_27_hier")))

If I split the multi-column join condition into a separate join for each
column, the exception goes away.  Is there a better way to join on multiple
columns?
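
For reference, here is a minimal sketch of two common ways to express a multi-column left join in the DataFrame API. The object name, the helper names, and the toy data are hypothetical; only the column names come from the queries above. It is not established here what triggers the exception, so this may or may not avoid it, depending on how baseDF and mccDF were built. Note also that the two-join query above joins mccDF twice, so it is not equivalent to a single join on both keys.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical sketch: object, helper names, and sample rows are illustrative only.
object MultiColumnJoinSketch {

  // Variant 1: one null-safe condition per key, each wrapped in parentheses,
  // with both sides aliased so every column reference is unambiguous.
  def joinOnBothKeys(base: DataFrame, mcc: DataFrame): DataFrame = {
    val cond =
      (col("m.medical_claim_id") <=> col("b.medical_claim_id")) &&
        (col("m.medical_claim_detail_id") <=> col("b.medical_claim_detail_id"))
    base.as("b")
      .join(mcc.as("m"), cond, "left")
      .select(col("b.medical_claim_id"), col("b.medical_claim_detail_id"), col("m.mcc_code"))
  }

  // Variant 2: join on a list of column names shared by both sides.
  // This uses plain equality, so rows with null keys do not match (unlike <=>).
  def joinUsingColumns(base: DataFrame, mcc: DataFrame): DataFrame =
    base.join(mcc, Seq("medical_claim_id", "medical_claim_detail_id"), "left")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("multi-column-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-ins for baseDF and mccDF.
    val baseDF = Seq((1L, 10L, "0450"), (2L, 20L, "0762"))
      .toDF("medical_claim_id", "medical_claim_detail_id", "revenue_code")
    val mccDF = Seq((1L, 10L, "A"), (2L, 99L, "B"))
      .toDF("medical_claim_id", "medical_claim_detail_id", "mcc_code")

    joinOnBothKeys(baseDF, mccDF).show()
    joinUsingColumns(baseDF, mccDF).show()

    spark.stop()
  }
}
```

The name-based form keeps each join key once in the output, which avoids having to disambiguate baseDF("medical_claim_id") from mccDF("medical_claim_id") in later select calls. The aliased form preserves the null-safe semantics of <=> used in the original queries.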




