BigQuery): Modify ReadChunk behavior [arrow-adbc]

via GitHub Thu, 21 Aug 2025 07:58:40 -0700


CurtHagenlocher commented on code in PR #3323:
URL: https://github.com/apache/arrow-adbc/pull/3323#discussion_r2291320120



##########
csharp/src/Drivers/BigQuery/BigQueryStatement.cs:
##########
@@ -604,12 +613,17 @@ sealed class ReadRowsStream : Stream
 
             public ReadRowsStream(IAsyncEnumerator<ReadRowsResponse> response)
             {
-                if (!response.MoveNextAsync().Result) { }
-
-                if (response.Current != null)
+                if (response.MoveNextAsync().Result)

Review Comment:
   The original code here looks super dodgy. Is it possible that this is the 
only thing that needs to be changed to resolve the issue? After all, if 
`MoveNext` had returned false then I'd expect accessing `Current` would throw 
exactly the message being reported.
   
   Orthogonally, this implementation can be simplified by merging the `if`s:
   ```
                   if (response.MoveNextAsync().Result && response.Current != 
null)
                   {
                       this.currentBuffer = 
response.Current.ArrowSchema.SerializedSchema.Memory;
                       this.hasRows = true;
                   }
                   else
                   {
                       this.hasRows = false;
                   }
   ```



##########
csharp/src/Drivers/BigQuery/BigQueryStatement.cs:
##########
@@ -342,22 +342,31 @@ private IArrowType GetType(TableFieldSchema field, 
IArrowType type)
 
         private static IArrowReader? ReadChunk(BigQueryReadClient client, 
string streamName, Activity? activity)
         {
-            // Ideally we wouldn't need to indirect through a stream, but the 
necessary APIs in Arrow
-            // are internal. (TODO: consider changing Arrow).
-            activity?.AddConditionalBigQueryTag("read_stream", streamName, 
BigQueryUtils.IsSafeToTrace());
-            BigQueryReadClient.ReadRowsStream readRowsStream = 
client.ReadRows(new ReadRowsRequest { ReadStream = streamName });
-            IAsyncEnumerator<ReadRowsResponse> enumerator = 
readRowsStream.GetResponseStream().GetAsyncEnumerator();
+            try
+            {
+                // Ideally we wouldn't need to indirect through a stream, but 
the necessary APIs in Arrow
+                // are internal. (TODO: consider changing Arrow).
+                activity?.AddConditionalBigQueryTag("read_stream", streamName, 
BigQueryUtils.IsSafeToTrace());
+                BigQueryReadClient.ReadRowsStream readRowsStream = 
client.ReadRows(new ReadRowsRequest { ReadStream = streamName });
+                IAsyncEnumerator<ReadRowsResponse> enumerator = 
readRowsStream.GetResponseStream().GetAsyncEnumerator();
 
-            ReadRowsStream stream = new ReadRowsStream(enumerator);
-            activity?.AddBigQueryTag("read_stream.has_rows", stream.HasRows);
+                ReadRowsStream stream = new ReadRowsStream(enumerator);
+                activity?.AddBigQueryTag("read_stream.has_rows", 
stream.HasRows);
 
-            if (stream.HasRows)
-            {
-                return new ArrowStreamReader(stream);
+                if (stream.HasRows)
+                {
+                    return new ArrowStreamReader(stream);
+                }
+                else
+                {
+                    return null;
+                }
             }
-            else
+            catch (Exception ex)
             {
-                return null;
+                activity?.AddException(ex);
+
+                return null; // If there is an error reading the stream, 
return null to indicate no data.

Review Comment:
   If there's e.g. a network error in the middle of reading the response, won't 
this simply truncate the response instead of returning an error? That seems 
strictly worse to me than failing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] fix(csharp/src/Drivers/BigQuery): Modify ReadChunk behavior [arrow-adbc]

Reply via email to