Jefffrey commented on issue #9122:
URL: 
https://github.com/apache/arrow-datafusion/issues/9122#issuecomment-1925906027

   I guess this is an unfortunate side effect of having `JSON_TABLE` be a 
keyword for a MySQL function.
   
   You can remedy this by escaping it in the SQL via backticks to indicate it 
shouldn't be a keyword.
   
   From your reproduction:
   
   ```rust
   pub fn set_up_data_test() -> Result<SessionContext> {
   
       // define a schema.
       let schema = Arc::new(Schema::new(vec![
           Field::new("index", DataType::UInt8, false),
           Field::new("json_data", DataType::Utf8, true),
       ]));
   
       // define data.
       let batch = RecordBatch::try_new(
           schema,
           vec![
               Arc::new(UInt8Array::from_iter_values([1, 2, 3, 4, 5, 6, 7, 8])),
               Arc::new(StringArray::from(vec![
                   Some(r#" { "this" : "is", "a": [ "test" ] } "#),
                   Some(r#"{"a":[2,3.5,true,false,null,"x"]}"#),
                   Some(r#"[ "one", "two" ]"#),
                   Some(r#"123"#),
                   Some(r#"12.3"#),
                   Some(r#"true"#),
                   Some(r#"false"#),
                   None,
               ])),
           ],
       )?;
   
       // declare a new context
       let ctx = SessionContext::new();
       // declare a table in memory.
       ctx.register_batch("json_table", batch)?;
   
       // define a schema.
       let schema = Arc::new(Schema::new(vec![
           Field::new("index", DataType::UInt8, false),
           Field::new("json_data", DataType::Utf8, true),
       ]));
   
       let batch = RecordBatch::try_new(
           schema,
           vec![
               Arc::new(UInt8Array::from_iter_values([1, 2, 3, 4, 5, 6, 7, 8])),
               Arc::new(StringArray::from(vec![
                   Some(r#" { "this" : "is", "a": [ "test" ] } "#),
                   Some(r#"{"a":[2,3.5,true,false,null,"x"]}"#),
                   Some(r#"[ "one", "two" ]"#),
                   Some(r#"123"#),
                   Some(r#"12.3"#),
                   Some(r#"true"#),
                   Some(r#"false"#),
                   None,
               ])),
           ],
       )?;
   
       // declare a new context
       let ctx = SessionContext::new(); // <-------------- HERE: ctx redefined 
loses above registration of json_table
       // declare a table in memory.
       ctx.register_batch("json_value_table", batch)?;
   
       Ok(ctx)
   }
   
   #[tokio::main]
   async fn main() -> Result<()> {
   
       let ctx = set_up_data_test()?;
   
       // Fails with Error: SQL(ParserError("Expected (, found: EOF"), None)
       let df = ctx
           .sql(
               r#"SELECT * FROM `json_table`"#, // <------- HERE: escape with 
backticks
           )
           .await?;
   
       df.clone().show().await?;
   
   
       Ok(())
   }
   ```
   
   - Technically double quotes could be used as well: `r#"SELECT * FROM 
"json_table""#`
   
   I also noticed something strange in your reproduction code, where you 
redefine the `ctx` in `set_up_data_test()` meaning you lose the registration of 
the `json_table` table anyway.
   
   - Taking as of 
https://github.com/dadepo/df-repro/commit/05a0a2ec1fb05e341a00fc79ea02e7ff11c2ba42
   - Unless this was an intention redefinition of `ctx` for testing purposes
   
   We could probably update the docs, like here for example 
https://arrow.apache.org/datafusion/user-guide/sql/select.html
   
   To indicate quoting isn't only for respecting capitalization, but also for 
using keywords as table/column names


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to