caldempsey commented on issue #448: URL: https://github.com/apache/arrow-go/issues/448#issuecomment-3110786681
@zeroshade Thanks! The odd thing is that there is also a chunking bug involving Spark Connect: https://github.com/apache/spark-connect-go/issues/155. The reason I feel this might be related is that the number of chunks set is _always_ equal to the number of rows in the final DF from `CreateDataFrameFromArrow` (Spark Connect Go), and _only_ when we consume from JSON. `RecordFromJSON` becomes basically unusable because it only ever returns one row.

That can't be intended behaviour, so something is broken with respect to chunking, either on the Spark Connect Go side or on yours. This might be a symptom of a deeper issue in how those chunks are being organised: incremental reading into different chunks might be broken, as _only_ the first chunk seems to be read into the final DF. I feel like you're going chunk by chunk (1 row per chunk), but I don't have the expertise in the underlying Arrow format to really say.

If you feel this is unrelated to this bug, may I file a new issue here so that someone more familiar with Arrow can help me work out whether this is an `arrow-go` issue or a `spark-connect-go` issue?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org