ghuls commented on PR #4434:
URL: https://github.com/apache/arrow-rs/pull/4434#issuecomment-1614623064

   > I am a little bit concerned that the flatbuffer table limit exists for a 
reason, e.g. to prevent a DOS vector. I don't feel confident that we should 
change the default settings, as I don't feel I have a good enough grasp as to 
why it is present. I therefore wonder if we can instead allow users to opt-in 
to looser validation behaviour?
   > 
   > As an aside I would not expect million column schemas to be a good idea in 
general, in the absence of extremely aggressive projection pushdown the 
performance will likely be poor. I would definitely encourage people with such 
schema to perhaps reconsider their schema design...
   
   As far as I understand, the pyarrow fix for this issue uses the size of the 
footer to calculate the maximum number of tables that could be encoded in it 
(https://issues.apache.org/jira/browse/ARROW-11559, on which the arrow2 
implementation is based).
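
   A minimal sketch of that idea, for illustration only: derive the verifier's 
table limit from the footer length instead of using a fixed constant. The 
function name and the `tables_per_byte` ratio are hypothetical and are not 
taken from the pyarrow or arrow2 patches; the real implementations may use a 
different bound.

```rust
/// Hypothetical helper (not part of arrow-rs or flatbuffers): bound the number
/// of flatbuffer tables a footer could contain by its size in bytes.
///
/// `tables_per_byte` is an assumed upper bound on how densely tables can be
/// packed; pick it conservatively so that valid footers are never rejected.
fn max_tables_for_footer(footer_len: usize, tables_per_byte: usize) -> usize {
    // Saturate instead of overflowing on pathologically large inputs.
    footer_len.saturating_mul(tables_per_byte)
}
```

   Because the limit scales with the footer size, a small malicious footer 
still cannot claim an unbounded number of tables, while a large legitimate 
footer is not rejected by a fixed cap.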
   
   I have quite a few (real) files that reach this limit of 1 million columns 
(the largest has fewer than 2.5 million).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
