tuky191 commented on issue #546:
URL: 
https://github.com/apache/pulsar-client-go/issues/546#issuecomment-1242780920

   I ran into this problem as well while working on a project. I compiled 
Pulsar (2.7.2) from source so that I could debug the sql-worker. 
   
   When the JSON schema is used, the Go client marshals the Pulsar message value 
into JSON (`[]byte`) and sends it to Pulsar. The column names are decided at 
this point: they come from the `json` struct tags if present, or from the names 
of the struct's fields if not. 
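
To illustrate the naming rule, here is a minimal sketch that uses only the standard `encoding/json` package, not the Pulsar client itself. The `tagged` and `untagged` types and the `marshalKeys` helper are hypothetical names introduced for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tagged mirrors the struct from this comment: the json tags set the key names.
type tagged struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// untagged has no json tags, so the Go field names become the key names.
type untagged struct {
	ID   int
	Name string
}

// marshalKeys (hypothetical helper) returns the JSON encoding of v as a string.
func marshalKeys(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshalKeys(tagged{ID: 120, Name: "pulsar"}))   // {"id":120,"name":"pulsar"}
	fmt.Println(marshalKeys(untagged{ID: 120, Name: "pulsar"})) // {"ID":120,"Name":"pulsar"}
}
```

The keys in the marshalled output are the names that the schema has to use.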
   
   Your schema has to reflect this. In other words, if a column is named 'id' 
after marshalling, it has to be named 'id' in the Avro schema as well. If it is 
'ID' instead, as in 
[schema_test](https://github.com/apache/pulsar-client-go/blob/master/pulsar/schema_test.go), 
the discrepancy breaks the Presto SQL queries: the columns come back as NULL.
   
   
   ### Not working
   ```go
   type testJSON struct {
        ID   int    `json:"id"`
        Name string `json:"name"`
   }

   // The schema field names ("ID", "Name") do not match the marshalled JSON keys ("id", "name").
   var exampleSchemaDef = "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
        "\"fields\":[{\"name\":\"ID\",\"type\":\"int\"},{\"name\":\"Name\",\"type\":\"string\"}]}"
   ```
   
   ```sh
   presto> select * from pulsar."public/default".goJson;
     id  | name | __partition__ |     __event_time__      |    __publish_time__     | __message_id__ | __sequence_id__ | __producer_name__ | __key_>
   ------+------+---------------+-------------------------+-------------------------+----------------+-----------------+-------------------+------->
    NULL | NULL |            -1 | 2339-03-21 22:18:14.838 | 2022-09-10 17:42:06.748 | (10,0,0)       |               0 | standalone-0-0    | NULL  >
   (1 row)
   
   Query 20220910_174314_00000_zqxmh, FINISHED, 1 node
   Splits: 18 total, 18 done (100.00%)
   0:02 [1 rows, 90B] [0 rows/s, 46B/s]
   ```
   ### Working 
   
   ```go
   type testJSON struct {
        ID   int    `json:"id"`
        Name string `json:"name"`
   }

   // The schema field names ("id", "name") match the marshalled JSON keys.
   var exampleSchemaDef = "{\"type\":\"record\",\"name\":\"Example\",\"namespace\":\"test\"," +
        "\"fields\":[{\"name\":\"id\",\"type\":\"int\"},{\"name\":\"name\",\"type\":\"string\"}]}"
   ```
   
   ```sh
   presto> select * from pulsar."public/default".go_json;
    id  |  name  | __partition__ |     __event_time__      |    __publish_time__     | __message_id__ | __sequence_id__ | __producer_name__ | __key>
   -----+--------+---------------+-------------------------+-------------------------+----------------+-----------------+-------------------+------>
    120 | pulsar |            -1 | 2339-03-21 22:18:14.838 | 2022-09-10 17:45:37.752 | (13,0,0)       |               0 | standalone-0-3    | NULL >
    120 | pulsar |            -1 | 2339-03-21 22:18:14.838 | 2022-09-10 17:45:40.231 | (13,1,0)       |               0 | standalone-0-4    | NULL >
   (2 rows)
   
   Query 20220910_174546_00002_zqxmh, FINISHED, 1 node
   Splits: 18 total, 18 done (100.00%)
   0:00 [2 rows, 270B] [5 rows/s, 707B/s]
   ```
   
   I verified that this also works on pulsar:latest (2.10.1). 

