ResilientSZUer opened a new issue, #873: URL: https://github.com/apache/incubator-seata-go/issues/873
### ✅ Verification checklist

- [x] 🔍 I have searched the [existing issues](https://github.com/apache/incubator-seata-go/issues) and am confident this is not a duplicate

### 🚀 Go version

go1.22.3 windows/amd64

### 📦 Seata-go version

v2.0.0 (v1.2.0 has the same bug)

### 💾 Operating system

🪟 Windows

### 📝 Bug description

In AT mode, when a transaction rollback involves a table containing columns of certain types (such as VARCHAR, TEXT, JSON, or DECIMAL), the rollback is aborted by a `rows.Scan()` error about a mismatched number of arguments.

The root cause is the `GetScanSlice` function in `pkg/datasource/sql/datasource/utils.go`.

```golang
// pkg/datasource/sql/datasource/utils.go
// ...
type nullTime = sql.NullTime

var (
	ScanTypeFloat32   = reflect.TypeOf(float32(0))
	ScanTypeFloat64   = reflect.TypeOf(float64(0))
	ScanTypeInt8      = reflect.TypeOf(int8(0))
	ScanTypeInt16     = reflect.TypeOf(int16(0))
	ScanTypeInt32     = reflect.TypeOf(int32(0))
	ScanTypeInt64     = reflect.TypeOf(int64(0))
	ScanTypeNullFloat = reflect.TypeOf(sql.NullFloat64{})
	ScanTypeNullInt   = reflect.TypeOf(sql.NullInt64{})
	ScanTypeNullTime  = reflect.TypeOf(nullTime{})
	ScanTypeUint8     = reflect.TypeOf(uint8(0))
	ScanTypeUint16    = reflect.TypeOf(uint16(0))
	ScanTypeUint32    = reflect.TypeOf(uint32(0))
	ScanTypeUint64    = reflect.TypeOf(uint64(0))
	ScanTypeRawBytes  = reflect.TypeOf(sql.RawBytes{})
	ScanTypeUnknown   = reflect.TypeOf(new(interface{}))
)

func GetScanSlice(types []*sql.ColumnType) []interface{} {
	scanSlice := make([]interface{}, 0, len(types))
	for _, tpy := range types {
		switch tpy.ScanType() {
		case ScanTypeFloat32:
			scanVal := float32(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeFloat64:
			scanVal := float64(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeInt8:
			scanVal := int8(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeInt16:
			scanVal := int16(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeInt32:
			scanVal := int32(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeInt64:
			scanVal := int64(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeNullFloat:
			scanVal := sql.NullFloat64{}
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeNullInt:
			scanVal := sql.NullInt64{}
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeNullTime:
			scanVal := sql.NullTime{}
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUint8:
			scanVal := uint8(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUint16:
			scanVal := uint16(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUint32:
			scanVal := uint32(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUint64:
			scanVal := uint64(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeRawBytes:
			scanVal := ""
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUnknown:
			scanVal := new(interface{})
			scanSlice = append(scanSlice, &scanVal)
		}
	}
	return scanSlice
}

// ...
```

The original implementation misunderstands Go's type system. The `case ScanTypeUnknown:` line in `GetScanSlice` is clearly meant as a catch-all for every type not matched by the earlier cases. However, `ScanTypeUnknown` is `reflect.TypeOf(new(interface{}))`, which in Go's type system is the type `*interface{}`, a pointer to the empty interface `interface{}`. **This is a single, distinct type in its own right**, and it cannot act as a wildcard.

The original code most likely conflates two entirely different concepts in Go: assignment and type comparison. Any value can be assigned to an `interface{}` variable, because whether a value of a concrete type can be assigned to an interface variable depends only on whether that type implements all of the interface's methods, and `interface{}` has no methods. The code wrongly treats this assignability rule as if it meant that the `reflect.Type` of any type == the `reflect.Type` of `*interface{}`.

```golang
package main

import (
	"database/sql"
	"fmt"
	"reflect"
	"time"
)

func main() {
	// ScanTypeUnknown is really the *interface{} type, which is a distinct
	// type of its own and cannot act as a wildcard.
	scanTypeUnknown := reflect.TypeOf(new(interface{}))
	fmt.Printf("Type of ScanTypeUnknown: %s\n", scanTypeUnknown)
	fmt.Printf("string == ScanTypeUnknown: %t\n", reflect.TypeOf("") == scanTypeUnknown)
	fmt.Printf("sql.NullString == ScanTypeUnknown: %t\n", reflect.TypeOf(sql.NullString{}) == scanTypeUnknown)
	fmt.Printf("int == ScanTypeUnknown: %t\n", reflect.TypeOf(0) == scanTypeUnknown)
	fmt.Printf("int32 == ScanTypeUnknown: %t\n", reflect.TypeOf(int32(0)) == scanTypeUnknown)
	fmt.Printf("sql.NullInt64 == ScanTypeUnknown: %t\n", reflect.TypeOf(sql.NullInt64{}) == scanTypeUnknown)
	fmt.Printf("float64 == ScanTypeUnknown: %t\n", reflect.TypeOf(0.0) == scanTypeUnknown)
	fmt.Printf("sql.NullFloat64 == ScanTypeUnknown: %t\n", reflect.TypeOf(sql.NullFloat64{}) == scanTypeUnknown)
	fmt.Printf("bool == ScanTypeUnknown: %t\n", reflect.TypeOf(false) == scanTypeUnknown)
	fmt.Printf("sql.NullBool == ScanTypeUnknown: %t\n", reflect.TypeOf(sql.NullBool{}) == scanTypeUnknown)
	fmt.Printf("[]byte == ScanTypeUnknown: %t\n", reflect.TypeOf([]byte{}) == scanTypeUnknown)
	fmt.Printf("time.Time == ScanTypeUnknown: %t\n", reflect.TypeOf(time.Time{}) == scanTypeUnknown)
}
```

Output:

<img width="1818" height="1080" alt="Image" src="https://github.com/user-attachments/assets/fd99c255-3362-4996-adcc-dc5b51a361b8" />

As shown, every comparison is false, so `ScanTypeUnknown` cannot serve as a catch-all.

### 🔄 Steps to reproduce

1. Prepare the database schema. Create the `products` and `orders` tables in the database.

   ```sql
   CREATE TABLE `products` (
     `id` bigint NOT NULL,
     `seller_id` bigint DEFAULT NULL,
     `name` varchar(255) DEFAULT NULL,
     `description` text,
     `price` decimal(10,2) DEFAULT NULL,
     `stock` int DEFAULT NULL,
     `status` tinyint DEFAULT NULL,
     `image_urls_json` json,
     `video_info` json,
     `created_at` datetime(3) DEFAULT NULL,
     `updated_at` datetime(3) DEFAULT NULL,
     `deleted_at` datetime(3) DEFAULT NULL,
     PRIMARY KEY (`id`)
   );

   CREATE TABLE `orders` (
     `id` bigint NOT NULL,
     `user_id` bigint NOT NULL,
     `seller_id` bigint NOT NULL,
     `total_amount` decimal(10,2) NOT NULL,
     `payment_amount` decimal(10,2) NOT NULL,
     `status` tinyint NOT NULL,
     `shipping_address` varchar(512) NOT NULL,
     `tracking_number` varchar(100) DEFAULT NULL,
     `remark` varchar(255) DEFAULT NULL,
     `created_at` datetime(3) DEFAULT NULL,
     `updated_at` datetime(3) DEFAULT NULL,
     `paid_at` datetime(3) DEFAULT NULL,
     `shipped_at` datetime(3) DEFAULT NULL,
     `completed_at` datetime(3) DEFAULT NULL,
     `canceled_at` datetime(3) DEFAULT NULL,
     `cancel_reason` varchar(255) DEFAULT NULL,
     `deleted_at` datetime(3) DEFAULT NULL,
     PRIMARY KEY (`id`)
   );
   ```

2. Write a distributed "create order" transaction. Within a single Seata global transaction, it processes an order containing two products: the first has sufficient stock and the second does not.
3. Trigger the rollback. After completing the first operation, the business code attempts the second, finds the stock insufficient, and returns an error. This error triggers Seata's rollback mechanism, and Seata attempts to roll back the first stock deduction, which had already succeeded.

### ✅ Expected behavior

Seata AT mode should successfully roll back the first stock deduction. The `stock` field of the first product in the `products` table should be restored to its value before the transaction began, and the corresponding record in the `undo_log` table should be cleaned up.

### ❌ Actual behavior

The program reports an error when attempting to roll back the first branch transaction:

2025-08-14 00:28:19.157 ERROR base/undo.go:391 execute on fail, err: sql: expected 12 destination arguments in Scan, not 7

github.com/seata/seata-go/pkg/datasource/sql/undo/base.(*BaseUndoLogManager).Undo

This error occurs when AT mode performs its dirty-data check, executes `SELECT * FROM products ...`, and tries to read the current data with `rows.Scan()`. Because of the bug in `GetScanSlice`, the number of destination variables returned (7) does not match the actual number of columns in the `products` table (12).

As a result, the rollback fails, the `undo_log` record is not deleted, and the first product's stock remains incorrectly deducted, leaving the data inconsistent.

### 💡 Possible solution

The most direct fix with the smallest change is to add a `default` branch to the `switch` statement in `GetScanSlice`, so that a column of any type is handled and the returned slice always has the same length as the number of columns.

The following code reproduces the bug in full and demonstrates the fix with the added `default` branch:

```golang
package main

import (
	"database/sql"
	"fmt"
	"reflect"
)

var (
	ScanTypeInt64   = reflect.TypeOf(int64(0))
	ScanTypeUnknown = reflect.TypeOf(new(interface{}))
)

var realWorldColumnTypes = []struct {
	SQLType string
	GoType  reflect.Type
}{
	{"VARCHAR(255)", reflect.TypeOf("")},
	{"TEXT", reflect.TypeOf(sql.NullString{})},
	{"DECIMAL(10,2)", reflect.TypeOf("")},
	{"BIGINT", ScanTypeInt64},
}

func buggyGetScanSlice(types []reflect.Type) []interface{} {
	scanSlice := make([]interface{}, 0, len(types))
	for _, tpy := range types {
		switch tpy {
		case ScanTypeInt64:
			scanVal := int64(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUnknown:
			scanVal := new(interface{})
			scanSlice = append(scanSlice, scanVal)
		}
	}
	return scanSlice
}

func fixedGetScanSlice(types []reflect.Type) []interface{} {
	scanSlice := make([]interface{}, 0, len(types))
	for _, tpy := range types {
		switch tpy {
		case ScanTypeInt64:
			scanVal := int64(0)
			scanSlice = append(scanSlice, &scanVal)
		case ScanTypeUnknown:
			scanVal := new(interface{})
			scanSlice = append(scanSlice, scanVal)
		default:
			// Generic default branch for all other types.
			scanVal := new(interface{})
			scanSlice = append(scanSlice, scanVal)
		}
	}
	return scanSlice
}

func main() {
	columnTypes := make([]reflect.Type, 0, len(realWorldColumnTypes))
	for _, col := range realWorldColumnTypes {
		columnTypes = append(columnTypes, col.GoType)
	}

	fmt.Println("Bug demonstration:")
	buggySlice := buggyGetScanSlice(columnTypes)
	fmt.Printf("Number of input columns: %d\n", len(columnTypes))
	fmt.Printf("Length of slice returned by the buggy function: %d\n", len(buggySlice))
	if len(columnTypes) != len(buggySlice) {
		fmt.Println("Lengths differ; this will cause rows.Scan() to fail.")
	}

	fmt.Println("\nFix demonstration:")
	fixedSlice := fixedGetScanSlice(columnTypes)
	fmt.Printf("Number of input columns: %d\n", len(columnTypes))
	fmt.Printf("Length of slice returned by the fixed function: %d\n", len(fixedSlice))
	if len(columnTypes) == len(fixedSlice) {
		fmt.Println("Lengths match; rows.Scan() will succeed.")
	}
}
```

Output:

<img width="1817" height="1080" alt="Image" src="https://github.com/user-attachments/assets/932dd91d-0fe8-4760-95ae-9d95dd761b15" />