thisisnic commented on pull request #12083:
URL: https://github.com/apache/arrow/pull/12083#issuecomment-1006618779


   Just pasting here the conversation from JIRA:
   
   > I did, however, run into trouble. Say, for example, the user has set 
the `skip_rows` option like this:
   `read_options = arrow::CsvReadOptions$create(skip_rows = 1)`
   I imagine we'd like to keep whatever options the user has set when we 
re-create the `CsvReadOptions` object with column names from the schema. The 
problem is that I cannot access `skip_rows` in the object after it's created, 
so I cannot use that information to create another instance of `CsvReadOptions` 
that has both the `column_names` and `skip_rows` set (plus any other options).
   
   > Any thoughts? Is there a way to access `skip_rows` and other attributes 
that I'm unaware of? Of course, one solution is to change the class declaration 
of `CsvReadOptions` so that these attributes are accessible.
   
   Thanks for opening this draft PR! After checking out a copy of your branch, 
I now have a much better understanding of what's going on here and why it's 
not working.
   
   Currently, the approaches that come to mind for me are:
   1. as you say, update the `CsvReadOptions` class so we can access and modify its contents after creation
   2. update the signature of `CsvFileFormat$create()`, setting `read_options` 
to a default value of `NULL` and then calling `csv_file_format_read_opts()` 
later in the function, passing in both the schema and any user-set values.
   
   I haven't fully fleshed out the second option or tested to see if it'll 
work, but if it does I'd be in favour of doing it that way so we can make the 
change we need without having to modify the structure of the existing classes.
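
A rough sketch of the second option, to make the idea concrete. It uses plain lists in place of the real classes so it runs without arrow, and every function name here (`make_read_options()`, `build_read_options()`, `create_csv_format()`) is a hypothetical stand-in, not the actual arrow API. It also assumes the user-set values arrive as a named list rather than as an already-built `CsvReadOptions` object, which is the part I haven't verified against the real code:

```r
# Hypothetical stand-in for the options class: a plain list, so the sketch
# runs without the arrow package.
make_read_options <- function(skip_rows = 0L, column_names = character(0)) {
  list(skip_rows = skip_rows, column_names = column_names)
}

# Hypothetical helper in the spirit of csv_file_format_read_opts(): build the
# options once, from the schema's column names plus the user-set values.
build_read_options <- function(schema_names = NULL, user_opts = list()) {
  args <- user_opts
  # Only fill in column names from the schema if the user did not set them.
  if (!is.null(schema_names) && is.null(args[["column_names"]])) {
    args[["column_names"]] <- schema_names
  }
  do.call(make_read_options, args)
}

# Hypothetical create(): read_options defaults to NULL, so nothing has to be
# read back out of an already-constructed options object.
create_csv_format <- function(schema_names = NULL, read_options = NULL) {
  if (is.null(read_options)) {
    read_options <- list()
  }
  list(read_options = build_read_options(schema_names, read_options))
}

fmt <- create_csv_format(
  schema_names = c("id", "value"),
  read_options = list(skip_rows = 1L)
)
str(fmt$read_options)
```

The point is only that the options object is constructed in one place, after both the schema and the user's settings are known, so `skip_rows` never needs to be extracted from an existing object.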


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


