narrowizard opened a new issue, #8423:
URL: https://github.com/apache/incubator-devlake/issues/8423

   ### Search before asking
   
   - [x] I have searched the 
[issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   When synchronizing data from TAPD, the DevLake pipeline fails while 
unmarshaling a JSON number into a Go struct field. Specifically, the value `-1` 
cannot be stored in the `TapdStoryCategory.Category.id` field, which is of type 
`uint64`.
   
   The error log shows:
   ```
   json: cannot unmarshal number -1 into Go struct field 
TapdStoryCategory.Category.id of type uint64
   ```
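
A minimal sketch that reproduces this class of failure with the standard `encoding/json` package. The struct names and the `,string` tag option are assumptions made for illustration (the raw payload quotes the ID); they are not copied from the DevLake models, and the ID `42` is a made-up value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storyCategory is a hypothetical, trimmed-down stand-in for the real model.
// The `,string` option is an assumption, chosen because the raw payload quotes the ID.
type storyCategory struct {
	ID uint64 `json:"id,string"`
}

// categoryEnvelope mirrors the {"Category": {...}} wrapper seen in the raw table.
type categoryEnvelope struct {
	Category storyCategory `json:"Category"`
}

// decodeCategory unmarshals one raw record and returns the decoding error, if any.
func decodeCategory(raw string) (categoryEnvelope, error) {
	var env categoryEnvelope
	err := json.Unmarshal([]byte(raw), &env)
	return env, err
}

func main() {
	// An unsigned field cannot hold -1, so decoding this record returns an error.
	_, err := decodeCategory(`{"Category":{"id":"-1","workspace_id":"39091999"}}`)
	fmt.Println("uncategorized record:", err)

	// A non-negative ID decodes without error.
	env, err := decodeCategory(`{"Category":{"id":"42","workspace_id":"39091999"}}`)
	fmt.Println("regular record:", env.Category.ID, err)
}
```

The exact wording of the error may differ between Go versions, but the unsigned field rejects the negative value in every case.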
   
   A query against the raw data table `_raw_tapd_api_story_categories` confirms 
that the TAPD API is indeed returning `"-1"` for the `id` field within the 
`Category` object for some records.
   
   **Raw Data Confirmation (from DevLake raw table):**
   
![Image](https://github.com/user-attachments/assets/ce4e0719-f655-4064-b7bb-72f35128c3cb)
   A query like `select convert_from(data, 'UTF8') from 
_raw_tapd_api_story_categories where (convert_from(data, 
'UTF8')::json->'Category'->>'id')::bigint < 0 limit 10;` shows records where 
`Category.id` is `"-1"`. The raw JSON data for such an entry would look similar 
to:
   `{"Category":{"id":"-1", "workspace_id":"39091999", ...}}`
   
   ### What do you expect to happen
   
   The DevLake TAPD connector should be able to handle `Category.id` values of 
`"-1"` without crashing the synchronization pipeline. This could involve:
   1.  Changing the data type of `TapdStoryCategory.Category.id` in the Go 
struct to a type that can accommodate negative numbers (e.g., `int64`), 
allowing `-1` to be stored directly if it's treated as a valid, albeit special, 
ID.
   2.  Alternatively, given that the raw data shows `category_id = -1` 
corresponds to `name = "未分类"` (Uncategorized), the connector should handle this 
specific case. For instance, it could transform this `id` to `null` in the 
domain layer if 'Uncategorized' implies the absence of a specific category, or 
map it to a predefined constant/sentinel value representing 'Uncategorized' if 
that's more suitable for the DevLake domain model. This would prevent the 
unmarshaling error while preserving the meaning that the item is not assigned 
to a standard category.
   
   The synchronization should complete successfully, and data containing these 
`Category.id = -1` (Uncategorized) values should be processed and stored 
appropriately in the domain layer tables.
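
Both options can be sketched together: decode the ID as `int64` (option 1), then map the `-1` sentinel to "no category" before it reaches the domain layer (option 2). All type and function names here (`rawCategory`, `domainCategoryID`, and so on) are hypothetical and only illustrate the idea; the actual change would have to fit the existing TAPD models and extractor.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// uncategorizedID is the value the raw data uses for "未分类" (Uncategorized).
const uncategorizedID int64 = -1

// rawCategory is a hypothetical tool-layer struct; int64 lets -1 decode cleanly.
// The `,string` option is an assumption based on the quoted ID in the raw JSON.
type rawCategory struct {
	ID   int64  `json:"id,string"`
	Name string `json:"name"`
}

type rawEnvelope struct {
	Category rawCategory `json:"Category"`
}

// parseCategory decodes one raw record into the signed representation.
func parseCategory(raw string) (rawCategory, error) {
	var env rawEnvelope
	err := json.Unmarshal([]byte(raw), &env)
	return env.Category, err
}

// domainCategoryID returns nil for the Uncategorized sentinel, a pointer to the
// unsigned ID for regular categories, and an error for any other negative value.
func domainCategoryID(id int64) (*uint64, error) {
	if id == uncategorizedID {
		return nil, nil
	}
	if id < 0 {
		return nil, fmt.Errorf("unexpected negative category id %d", id)
	}
	v := uint64(id)
	return &v, nil
}

func main() {
	cat, err := parseCategory(`{"Category":{"id":"-1","name":"未分类"}}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	id, err := domainCategoryID(cat.ID)
	fmt.Println("name:", cat.Name, "has category:", id != nil, "err:", err)
}
```

Returning `nil` keeps "no category" distinct from any real ID. A fixed sentinel constant would work as well if the domain model prefers a non-null column.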
   
   ### How to reproduce
   
   1.  Configure a TAPD connection in DevLake.
   2.  Ensure the TAPD project being synchronized contains story categories 
where the API returns `Category.id = "-1"`. (This appears to be a valid 
response from the TAPD API as shown by the raw data).
   3.  Trigger a data synchronization for this TAPD connection, specifically 
including the collection of "story_categories" or entities that depend on them.
   4.  Observe the pipeline execution logs for the unmarshaling error related 
to `TapdStoryCategory.Category.id`.
   
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   main
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@devlake.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
