Blajda opened a new issue, #7790:
URL: https://github.com/apache/arrow-datafusion/issues/7790

   ### Describe the bug
   
   I'm using the DataFrame API to perform a join. I can build the join without issue; however, attempting to add an additional column results in a failure. This is the logical plan:
   
   ```
   DataFrame {
       session_state: SessionState {
           session_id: "56e65554-2665-46a7-8f3f-6839b25e542c",
       },
       plan: Full Join:  Filter: target.id = source.id
     Projection: source.id, source.value, source.modified, Boolean(true) AS __delta_rs_source
           TableScan: source
     Projection: target.id, target.value, target.modified, Boolean(true) AS __delta_rs_target
           TableScan: target,
   }
   ```
   
   The following error is returned:
   ```
   called `Result::unwrap()` on an `Err` value: Generic("Schema error: Ambiguous reference to unqualified field id")
   ```
   
   ### To Reproduce
   
   The original code that caused this issue is here: https://github.com/Blajda/delta-rs/blob/merge-logical/rust/src/operations/merge.rs#L649
   
   Code that reproduces the issue:
   
   ```rust
   use std::sync::Arc;

   use arrow::datatypes::{DataType, Field, Schema as ArrowSchema};
   use arrow::record_batch::RecordBatch;
   use datafusion::datasource::provider_as_source;
   use datafusion::prelude::*;
   use datafusion_common::TableReference;
   use datafusion_expr::LogicalPlanBuilder;

   let schema = Arc::new(ArrowSchema::new(vec![
       Field::new("id", DataType::Utf8, true),
       Field::new("value", DataType::Int32, true),
       Field::new("modified", DataType::Utf8, true),
   ]));
   
   let ctx = SessionContext::new();
   let batch = RecordBatch::try_new(
       Arc::clone(&schema),
       vec![
           Arc::new(arrow::array::StringArray::from(vec!["B", "C", "X"])),
           Arc::new(arrow::array::Int32Array::from(vec![10, 20, 30])),
           Arc::new(arrow::array::StringArray::from(vec![
               "2021-02-02",
               "2023-07-04",
               "2023-07-04",
           ])),
       ],
   )
   .unwrap();
   let source = ctx.read_batch(batch).unwrap();
   
   let batch = RecordBatch::try_new(
       Arc::clone(&schema),
       vec![
           Arc::new(arrow::array::StringArray::from(vec!["B", "D", "X"])),
           Arc::new(arrow::array::Int32Array::from(vec![10, 20, 30])),
           Arc::new(arrow::array::StringArray::from(vec![
               "2021-02-02",
               "2023-07-04",
               "2023-07-04",
           ])),
       ],
   )
   .unwrap();
   let target = ctx.read_batch(batch).unwrap();
   
   let source_name = TableReference::bare("source");
   let source =
       LogicalPlanBuilder::scan(source_name, provider_as_source(source.into_view()), None)
           .unwrap()
           .build()
           .unwrap();
   let source = DataFrame::new(ctx.state(), source);
   
   let target_name = TableReference::bare("target");
   let target =
       LogicalPlanBuilder::scan(target_name, provider_as_source(target.into_view()), None)
           .unwrap()
           .build()
           .unwrap();
   let target = DataFrame::new(ctx.state(), target);
   
   let join = source
       .join(
           target,
           datafusion_common::JoinType::Full,
           &[],
           &[],
           Some(col("source.id").eq(col("target.id"))),
       )
       .unwrap();
   let proj = join.with_column("test123", lit(true)).unwrap();
   proj.show().await.unwrap();
   ```
   
   ### Expected behavior
   
   I should be able to add a new, uniquely named column to this DataFrame.
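   
   As a possible workaround until `with_column` handles joined schemas, projecting every column with an explicit qualifier via `select` seems to sidestep the unqualified field lookup. The sketch below is a simplified, self-contained version of the repro above (only `id` and `value` columns, tables registered with `register_batch` so the scans get proper qualifiers); it is an assumption that this avoids the error, not a confirmed fix:
   
   ```rust
   use std::sync::Arc;
   
   use arrow::array::{Int32Array, StringArray};
   use arrow::datatypes::{DataType, Field, Schema as ArrowSchema};
   use arrow::record_batch::RecordBatch;
   use datafusion::common::JoinType;
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   #[tokio::main]
   async fn main() -> Result<()> {
       let schema = Arc::new(ArrowSchema::new(vec![
           Field::new("id", DataType::Utf8, true),
           Field::new("value", DataType::Int32, true),
       ]));
       // Helper to build a small two-column batch for either side of the join.
       let make_batch = |ids: Vec<&str>| {
           RecordBatch::try_new(
               Arc::clone(&schema),
               vec![
                   Arc::new(StringArray::from(ids)),
                   Arc::new(Int32Array::from(vec![10, 20, 30])),
               ],
           )
       };
   
       let ctx = SessionContext::new();
       // Registering the batches gives each table scan a proper qualifier.
       ctx.register_batch("source", make_batch(vec!["B", "C", "X"])?)?;
       ctx.register_batch("target", make_batch(vec!["B", "D", "X"])?)?;
   
       let join = ctx.table("source").await?.join(
           ctx.table("target").await?,
           JoinType::Full,
           &[],
           &[],
           Some(col("source.id").eq(col("target.id"))),
       )?;
   
       // Workaround: instead of join.with_column("test123", lit(true)),
       // project each column with an explicit qualifier and append the literal.
       let proj = join.select(vec![
           col("source.id"),
           col("source.value"),
           col("target.id"),
           col("target.value"),
           lit(true).alias("test123"),
       ])?;
   
       let batches = proj.collect().await?;
       let total_rows: usize = batches.iter().map(|b| b.num_rows()).sum();
       // Full join: B and X match; C (source) and D (target) are unmatched.
       println!("{total_rows} rows");
       Ok(())
   }
   ```
   
   The qualified `col("source.id")` expressions resolve against the join schema unambiguously, which is exactly what the unqualified lookup inside `with_column` appears unable to do.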
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
