Copilot commented on code in PR #1740:
URL: https://github.com/apache/auron/pull/1740#discussion_r2715383057


##########
native-engine/datafusion-ext-plans/src/limit_exec.rs:
##########
@@ -226,4 +274,31 @@ mod test {
         assert_eq!(row_count, Precision::Exact(2));
         Ok(())
     }
+
+    #[tokio::test]
+    async fn test_limit_with_offset() -> Result<()> {
+        let input = build_table(
+            ("a", &vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
+            ("b", &vec![9, 8, 7, 6, 5, 4, 3, 2, 1, 0]),
+            ("c", &vec![5, 6, 7, 8, 9, 0, 1, 2, 3, 4]),
+        )?;
+        let limit_exec = LimitExec::new(input, 7, 5);
+        let session_ctx = SessionContext::new();
+        let task_ctx = session_ctx.task_ctx();
+        let output = limit_exec.execute(0, task_ctx)?;
+        let batches = common::collect(output).await?;
+        let row_count: usize = batches.iter().map(|batch| batch.num_rows()).sum();
+
+        let expected = vec![
+            "+---+---+---+",
+            "| a | b | c |",
+            "+---+---+---+",
+            "| 5 | 4 | 0 |",
+            "| 6 | 3 | 1 |",
+            "+---+---+---+",
+        ];
+        assert_batches_eq!(expected, &batches);
+        assert_eq!(row_count, 2);
+        Ok(())

Review Comment:
   The test_limit_with_offset test has incorrect expectations. With LIMIT 7 OFFSET 5 on a 10-row dataset, the expected result is 5 rows (skip the first 5, then return min(7, 5 remaining) = 5 rows), not 2 rows.
   
   The input has rows [0,1,2,3,4,5,6,7,8,9]. After skipping the first 5 rows (0-4), the next 7 rows should be returned if available. Since only 5 rows remain (5-9), all 5 should be returned.
   
   The expected output should therefore show rows with a=5,6,7,8,9 (5 rows total), not just a=5,6 (2 rows). As written, the test validates incorrect behavior and would not catch the bug on line 167.



##########
spark-extension/src/main/scala/org/apache/spark/sql/execution/auron/plan/NativeCollectLimitBase.scala:
##########
@@ -51,7 +51,8 @@ abstract class NativeCollectLimitBase(limit: Int, override val child: SparkPlan)
       val row = it.next().copy()
       buf += row
     }
-    buf.toArray
+    val rows = buf.toArray
+    if (offset > 0) rows.drop(offset) else rows

Review Comment:
   The while loop condition on line 50 checks `buf.size < limit`, so it collects only `limit` rows. However, since the offset is applied on line 55, the loop should collect `limit + offset` rows to ensure enough rows remain after dropping the offset.
   
   For example, with LIMIT 5 OFFSET 3 the current code collects only 5 rows and then drops 3, leaving 2 rows. The correct behavior is to collect 8 rows (5 + 3) and then drop 3, leaving 5 rows.
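   The off-by-offset effect can be sketched in plain Rust (a model of the collect-then-drop logic, not the Scala code itself), contrasting the current and corrected collection bounds:

```rust
fn main() {
    // 10 input rows, matching the shape of the review's example
    let rows: Vec<i32> = (0..10).collect();
    let (limit, offset) = (5usize, 3usize);

    // Current (buggy) behavior: collect only `limit` rows, then drop `offset`
    let buggy: Vec<i32> = rows.iter().take(limit).skip(offset).copied().collect();
    assert_eq!(buggy.len(), 2); // LIMIT 5 OFFSET 3 wrongly yields only 2 rows

    // Fixed behavior: collect `limit + offset` rows, then drop `offset`
    let fixed: Vec<i32> = rows.iter().take(limit + offset).skip(offset).copied().collect();
    assert_eq!(fixed.len(), 5); // full LIMIT 5 result: rows 3 through 7
    println!("{} {}", buggy.len(), fixed.len()); // prints "2 5"
}
```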



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
