pitrou commented on a change in pull request #10505:
URL: https://github.com/apache/arrow/pull/10505#discussion_r651931713



##########
File path: cpp/src/arrow/csv/options.cc
##########
@@ -17,11 +17,32 @@
 
 #include "arrow/csv/options.h"
 
+#include <iomanip>
+
 namespace arrow {
 namespace csv {
 
 ParseOptions ParseOptions::Defaults() { return ParseOptions(); }
 
+Status ParseOptions::Validate() const {
+  if (ARROW_PREDICT_FALSE((delimiter < ' ' && delimiter != '\t') || delimiter > '~')) {
+    return Status::Invalid(
+        "ParseOptions: delimiter must be a printable ascii char or '\\t': 0x",
+        std::setfill('0'), std::setw(2), std::hex, static_cast<uint16_t>(delimiter));

Review comment:
I don't think it's our duty to guard against dubious values. Perhaps some weird formats use a form-feed character as a delimiter, who knows?

The only thing that may be reasonable is to forbid `\n` and `\r`. Otherwise we should just let the user choose whatever they like.
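
For illustration, the narrower check suggested above might look roughly like the sketch below. It is standalone rather than written against the Arrow code base: `ParseOptionsSketch` and `ValidateDelimiter` are hypothetical names, and the error is returned as a plain string instead of an `arrow::Status` so that the snippet compiles on its own.

```cpp
#include <string>

// Hypothetical stand-in for arrow::csv::ParseOptions; only the field
// relevant to this discussion is modelled.
struct ParseOptionsSketch {
  char delimiter = ',';
};

// Returns an empty string when the delimiter is acceptable, otherwise an
// error message. Only line terminators are rejected; every other byte,
// including control characters such as form feed, is left to the user.
std::string ValidateDelimiter(const ParseOptionsSketch& options) {
  if (options.delimiter == '\n' || options.delimiter == '\r') {
    return "ParseOptions: delimiter cannot be \\r or \\n";
  }
  return "";
}
```

With this check a form-feed delimiter (`'\f'`) passes validation, while `'\n'` and `'\r'` are refused because they would be ambiguous with row terminators.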




