Prathamesh9284 commented on PR #692: URL: https://github.com/apache/wayang/pull/692#issuecomment-3943791797
Hi @zkaoudi @mspruc,

I've refactored `validateHeaderLine` to be `static` and moved it into `streamLines()`, so the file is only opened once. The header is consumed via the iterator before the data rows are streamed.

Here are the error messages for each case:

### 1. Empty CSV file

```
CSV file 'customers.csv' is empty. Expected a header row (e.g., 'id:int,name:string').
```

### 2. Header missing types (e.g., `NAMEA,NAMEB,NAMEC`)

```
CSV file 'customers.csv': header column 'NAMEA' missing required type. Expected 'name:type' format (e.g., 'id:int'). Header: 'NAMEA,NAMEB,NAMEC'.
```

### 3. Header uses wrong separator (e.g., `id:int;name:string;email:string;country:string`)

```
CSV file 'customers.csv': column count mismatch. Expected 4 comma-separated 'name:type' columns but found 1. Header: 'id:int;name:string;email:string;country:string'.
```

### 4. Data row has wrong number of columns (e.g., `test1;1` in a 3-column table)

```
CSV file 'customers.csv': data row has 2 columns but expected 3 (separator ';').
Line: 'test1;1'.
```
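
For reference, the single-pass flow described above could be sketched roughly as follows. This is a standalone illustration, not the code in the PR: the class name `CsvHeaderSketch`, the method signatures, the `validateRow` helper, the order of the checks, and the use of `IllegalArgumentException` are all assumptions. Only the names `validateHeaderLine` and `streamLines` and the message texts are taken from this comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Pattern;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch; not the actual Wayang implementation.
class CsvHeaderSketch {

    // Checks that the header has the expected number of comma-separated
    // 'name:type' columns. The exception type is an assumption.
    static void validateHeaderLine(String fileName, String header, int expectedColumns) {
        String[] columns = header.split(",", -1);
        if (columns.length != expectedColumns) {
            throw new IllegalArgumentException(String.format(
                "CSV file '%s': column count mismatch. Expected %d comma-separated "
                + "'name:type' columns but found %d. Header: '%s'.",
                fileName, expectedColumns, columns.length, header));
        }
        for (String column : columns) {
            int colon = column.indexOf(':');
            // A column needs a non-empty name and a non-empty type.
            if (colon <= 0 || colon == column.length() - 1) {
                throw new IllegalArgumentException(String.format(
                    "CSV file '%s': header column '%s' missing required type. Expected "
                    + "'name:type' format (e.g., 'id:int'). Header: '%s'.",
                    fileName, column, header));
            }
        }
    }

    // Hypothetical helper: checks the column count of one data row.
    static String validateRow(String fileName, String row, int expectedColumns, char separator) {
        int found = row.split(Pattern.quote(String.valueOf(separator)), -1).length;
        if (found != expectedColumns) {
            throw new IllegalArgumentException(String.format(
                "CSV file '%s': data row has %d columns but expected %d (separator '%s'). "
                + "Line: '%s'.", fileName, found, expectedColumns, separator, row));
        }
        return row;
    }

    // Opens the file once, consumes the header through the iterator,
    // then streams the remaining lines as data rows.
    static Stream<String> streamLines(Path file, int expectedColumns, char separator)
            throws IOException {
        String fileName = file.getFileName().toString();
        Stream<String> lines = Files.lines(file);
        try {
            Iterator<String> it = lines.iterator();
            if (!it.hasNext()) {
                throw new IllegalArgumentException(String.format(
                    "CSV file '%s' is empty. Expected a header row "
                    + "(e.g., 'id:int,name:string').", fileName));
            }
            validateHeaderLine(fileName, it.next(), expectedColumns);
            // Wrap the same iterator so the header is not read a second time.
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false)
                .map(row -> validateRow(fileName, row, expectedColumns, separator))
                .onClose(lines::close);
        } catch (RuntimeException e) {
            lines.close(); // do not leak the file handle when validation fails
            throw e;
        }
    }
}
```

In this sketch the returned stream still holds the open file, so callers would use it in a try-with-resources block. Row validation is lazy: a malformed row is only reported once the stream reaches it.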
