David Li created ARROW-12598:
--------------------------------

             Summary: [C++][Dataset] Implement row-count for CSV or allow selecting 0 columns from CSV
                 Key: ARROW-12598
                 URL: https://issues.apache.org/jira/browse/ARROW-12598
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: David Li


ARROW-9697 lets file formats implement a fast path for counting the rows in a 
fragment, but the CSV format does not implement one yet. We could either do the 
equivalent of {{wc -l}} for CSV (using the lexing boundary finder where needed, 
for example when quoted values may contain newlines) and adjust the row count 
based on the header options, or change the CSV reader options to allow 
selecting no columns (right now, passing no columns to the reader means all 
columns are read). The former is likely faster, but the latter would be more 
robust and less work.
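
As a rough illustration of the first option, here is a minimal standalone 
sketch of a {{wc -l}}-style counter that adjusts for the header. It is not 
Arrow code: {{CountOptions}}, its fields, and {{CountCsvRows}} are hypothetical 
names chosen for this example, and the quote tracking is only a simplified 
stand-in for what the lexing boundary finder would do. It ignores escape 
characters and counts blank lines as rows.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical options for this sketch; not Arrow's actual CSV option structs.
struct CountOptions {
  int64_t skip_rows = 0;            // lines discarded before the header
  bool has_header = true;           // first remaining line holds column names
  bool newlines_in_values = false;  // if true, ignore newlines inside quotes
  char quote_char = '"';
};

// Counts data rows in an in-memory CSV buffer.
// Fast case: count '\n' bytes, like `wc -l`.
// Quoted case: track quote state so embedded newlines do not end a row.
int64_t CountCsvRows(std::string_view data, const CountOptions& opts = {}) {
  int64_t lines = 0;
  bool in_quotes = false;
  bool line_has_data = false;
  for (char c : data) {
    if (opts.newlines_in_values && c == opts.quote_char) {
      // A doubled quote ("") toggles twice, leaving the state unchanged.
      in_quotes = !in_quotes;
      line_has_data = true;
    } else if (c == '\n' && !in_quotes) {
      ++lines;
      line_has_data = false;
    } else {
      line_has_data = true;
    }
  }
  // A final line without a trailing newline still counts as a line.
  if (line_has_data) ++lines;

  // Subtract the lines that are not data rows (skipped lines and header).
  const int64_t non_data = opts.skip_rows + (opts.has_header ? 1 : 0);
  return lines > non_data ? lines - non_data : 0;
}
```

With the default options, a buffer holding a header line and two data lines 
yields 2, whether or not the last line ends in a newline. The quote-aware 
branch is what makes this slower than a plain byte count, which is why it is 
only enabled on request here.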



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
