[GitHub] [arrow] HaykManukyanAvetiky opened a new issue #12413: Pyarrow write dataset ignores delimiter

GitBox Sun, 13 Feb 2022 02:11:35 -0800


HaykManukyanAvetiky opened a new issue #12413:
URL: https://github.com/apache/arrow/issues/12413



   Hi all,
   I tried to report this possible bug through Jira but could not, so I am writing here.
   I have a dataset, and when I try to write it as a TSV (tab-separated) file, pyarrow writes comma-separated CSV anyway.
   Here is my code:
   ```python
   import pyarrow.dataset as ds
   from pyarrow import csv

   # `table` is an existing pyarrow.Table that includes a 'month' column.
   ds.write_dataset(data=table, base_dir='adapter/tsv/',
                    basename_template='my-unique-name-{i}.tsv',
                    format=ds.CsvFileFormat(parse_options=csv.ParseOptions(delimiter="\t")),
                    partitioning=['month'],
                    existing_data_behavior='overwrite_or_ignore')
   ```
   Here is what I am getting:
   ```csv 
   "day","year"
   26,1958
   11,1912
   26,1942
   ```
   Here is what I expect to get:
   
   ```tsv
   day     year
   26      1958
   11      1912
   26      1942
   ```
   It feels like pyarrow is ignoring the format or the delimiter.
   Thanks in advance.

