zeroshade commented on PR #1590: URL: https://github.com/apache/arrow-adbc/pull/1590#issuecomment-2002553036
@joellubi I dug into this a bit and found something. If you follow into `PrepareDriverInfo` and go down to line 579, we grab the value using `array.String.Value`. In the debugger, `v` gets the correct value and we pass it to `RegisterInfoCode`. After the next call to `rdr.Next`, the record we pulled the value from is released. Stepping through with gdb, the memory location that held the string gets filled with garbage on that release.

My theory is that ASAN is blowing it up. When we create the string from the underlying buffer, we do an unsafe cast, `a.values = *(*string)(unsafe.Pointer(&b))`, and the `Value` method then takes a slice of that string, so `v` still points into the buffer's memory rather than owning a copy. When we release the array, which releases the buffer, I'm guessing ASAN fills the bytes with garbage since it isn't tracking things properly through our unsafe cast.

If we're willing to drop Go 1.19 support in Arrow (which is safe now that Go 1.22 is out), we could switch to `unsafe.String` instead, which might be safer and solve this issue. Alternatively, without needing a change in Arrow, we could make `PrepareDriverInfo` or `RegisterInfoCode` force a copy of the string bytes and store the copy in the map rather than referring to the same bytes. Since the string values for these info codes should be pretty small, the copy shouldn't be a problem. That should potentially fix the issue, though I haven't had time to test it yet. Figured I'd report what I found, though.
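
A minimal sketch of the aliasing described in the comment and of the copy-based workaround; this is not the actual Arrow or ADBC code. `castString` mirrors the quoted cast, `viewString` is the `unsafe.String` form (Go 1.20+), and `infoRegistry` with its `register` method is a hypothetical stand-in for the map behind `RegisterInfoCode`. Overwriting the slice stands in for the buffer being released and reused; ASAN itself is not involved here.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// castString reinterprets the slice header as a string header, mirroring the
// cast quoted in the comment. No bytes are copied: the string shares b's memory.
func castString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// viewString is the unsafe.String form (Go 1.20+). It also does not copy, so
// the bytes must stay alive and unmodified for as long as the string is used.
func viewString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// infoRegistry is a hypothetical stand-in for the driver's info-code map.
type infoRegistry map[uint32]string

// register stores an owned copy of v, detached from whatever buffer backs it.
func (r infoRegistry) register(code uint32, v string) {
	r[code] = strings.Clone(v)
}

// demo stores the same value with and without a copy, then overwrites the
// backing bytes to simulate the buffer being released and reused.
func demo() (view, owned string) {
	buf := []byte("ADBC Flight SQL Driver")
	v := castString(buf)[:4] // like Value(): a substring of the shared bytes

	views := infoRegistry{0: v} // keeps pointing into buf
	copies := infoRegistry{}
	copies.register(0, v) // owns its own bytes

	for i := range buf {
		buf[i] = 'X'
	}
	// Snapshot the view entry so nothing returned points into buf.
	return strings.Clone(views[0]), copies[0]
}

func main() {
	view, owned := demo()
	fmt.Printf("without copy: %q\n", view)
	fmt.Printf("with copy:    %q\n", owned)
}
```

The copied entry still reads "ADBC" after the overwrite. The uncopied entry is expected to show the overwritten bytes on a typical Go toolchain, although mutating memory behind a string is not behavior the language guarantees. Note that `viewString` shares memory in the same way as the cast, so under this sketch's assumptions the copy in `register` is what actually decouples the stored value from the record's lifetime.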
