tustvold opened a new issue, #3364:
URL: https://github.com/apache/arrow-rs/issues/3364

   **Describe the bug**
   
   The following table illustrates the current behaviour:
   
   |range|has_header|num_rows|
   |---|---|---|
   |0..4|false|4|
   |0..4|true|3|
   |1..4|false|3|
   |1..4|true|2|
   
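   The first two could be justified if the start offset included the header, but this is inconsistent with the two subsequent results: with a start offset of 1 the header would already be skipped, so `has_header` should make no difference, yet it still removes a row.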
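
To make the inconsistency concrete, here is a minimal sketch of what a purely line-based reading would predict, compared against the observed values from the table. `rows_if_bounds_count_header` is a hypothetical helper written for illustration only; it is not part of arrow-rs and does not reflect how the reader is implemented.

```rust
/// Hypothetical model, not arrow-rs code: bounds index physical lines of the
/// file, and the header (if any) occupies line 0.
fn rows_if_bounds_count_header(start: usize, end: usize, has_header: bool) -> usize {
    let lines = end.saturating_sub(start);
    // The header only falls inside the range when the range starts at line 0.
    if has_header && start == 0 && end > 0 {
        lines - 1
    } else {
        lines
    }
}

fn main() {
    // (start, end, has_header, num_rows observed in the table above)
    let observed = [
        (0, 4, false, 4),
        (0, 4, true, 3),
        (1, 4, false, 3),
        (1, 4, true, 2),
    ];
    for (start, end, has_header, seen) in observed {
        let predicted = rows_if_bounds_count_header(start, end, has_header);
        println!("{start}..{end} has_header={has_header}: predicted {predicted}, observed {seen}");
    }
}
```

Under this reading only the last row disagrees: `1..4` with a header would be predicted to return 3 rows, while the reader currently returns 2.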
   
   **To Reproduce**
   
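   ```
   // Imports assumed for running this outside the arrow crate's own test module.
   use std::io::Cursor;

   use arrow::csv::ReaderBuilder;

   #[test]
   fn test_header_bounds() {
       // Five identical lines; with `has_header`, the first one is the header.
       let csv = "a,b\na,b\na,b\na,b\na,b\n";
       // (bounds, has_header, expected num_rows)
       let tests = [
           (None, false, 5),
           (None, true, 4),
           (Some((0, 4)), false, 4),
           (Some((1, 4)), false, 3),
           (Some((0, 4)), true, 4), // currently yields 3
           (Some((1, 4)), true, 3), // currently yields 2
       ];

       for (bounds, has_header, expected) in tests {
           let mut reader = ReaderBuilder::new().has_header(has_header);
           if let Some((start, end)) = bounds {
               reader = reader.with_bounds(start, end);
           }
           // Read the first batch produced for this configuration.
           let b = reader
               .build(Cursor::new(csv.as_bytes()))
               .unwrap()
               .next()
               .unwrap()
               .unwrap();
           assert_eq!(b.num_rows(), expected);
       }
   }
   ```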
   
   **Expected behavior**
   
   The bounds shouldn't include the header, i.e. a start offset of 1 should begin at the 2nd row of data, irrespective of whether a header is present.
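
The expected semantics can be sketched as a small model that strips the header first and only then applies the bounds to data rows. `expected_rows` is a hypothetical helper for illustration, not arrow-rs code, and clamping `end` to the number of available data rows is an assumption of this sketch. The cases are the same ones used in the reproduction test, where the CSV has 5 lines.

```rust
/// Hypothetical model of the proposed semantics, not arrow-rs code.
/// `total_lines` is the number of lines in the file, including any header.
fn expected_rows(total_lines: usize, has_header: bool, bounds: Option<(usize, usize)>) -> usize {
    // Remove the header first, so that offsets index data rows only.
    let data_rows = total_lines.saturating_sub(has_header as usize);
    match bounds {
        None => data_rows,
        Some((start, end)) => {
            // Assumption: an `end` past the last data row is clamped.
            let end = end.min(data_rows);
            end.saturating_sub(start)
        }
    }
}

fn main() {
    let total_lines = 5;
    // (bounds, has_header, expected num_rows), as in the reproduction test
    let cases = [
        (None, false, 5),
        (None, true, 4),
        (Some((0, 4)), false, 4),
        (Some((1, 4)), false, 3),
        (Some((0, 4)), true, 4),
        (Some((1, 4)), true, 3),
    ];
    for (bounds, has_header, expected) in cases {
        assert_eq!(expected_rows(total_lines, has_header, bounds), expected);
    }
    println!("all cases match the proposed semantics");
}
```

With this model, a given range returns the same number of rows whether or not a header is present, as long as enough data rows remain.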
   

