[GitHub] [arrow] sonomatechDS commented on issue #15056: [R][R Shiny] Deployed/published Rshiny app using arrow to query an AWS-hosted dataset causes intermittent "stack imbalance", "segfault", "memory not mapped" errors.

GitBox Thu, 12 Jan 2023 10:03:31 -0800


sonomatechDS commented on issue #15056:
URL: https://github.com/apache/arrow/issues/15056#issuecomment-1380796248


   @tnederlof that's a great thought! I had previously considered this option, 
but our full dataset takes a while (~8-10 s) to load via arrow::open_dataset() 
(possibly because of its size/number of partitions?). It would decrease 
usability if a user had to wait each time an input was changed. Am I 
understanding correctly that this would be the case?
   
   However, this idea could be viable if we stop loading the full dataset 
each time and instead use the inputs to build a more specific URI, i.e. one 
pointing to a subset of the dataset (e.g. calling open_dataset() on 
's3://bucket/sitecode=xxxx/parameter=xxxx/...', or even just read_csv_arrow() 
on a URI pointing to the desired csv).
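
   A minimal sketch of that idea, assuming the dataset uses hive-style 
partition directories as in the example path above. The helper names 
build_subset_uri() and open_subset(), the "bucket" placeholder, and the 
format = "csv" argument are assumptions for illustration, not code from the 
app:

```r
# Hypothetical helper: build a URI for a single partition of the dataset.
# The partition keys (sitecode, parameter) follow the example path above;
# the real bucket layout may differ.
build_subset_uri <- function(bucket, sitecode, parameter) {
  sprintf("s3://%s/sitecode=%s/parameter=%s", bucket, sitecode, parameter)
}

# Hypothetical helper: open only the selected partition, not the full dataset.
# format = "csv" is an assumption based on the CSV files mentioned above.
open_subset <- function(bucket, sitecode, parameter) {
  uri <- build_subset_uri(bucket, sitecode, parameter)
  arrow::open_dataset(uri, format = "csv")
}

# The URI builder can be checked without touching S3.
example_uri <- build_subset_uri("bucket", "A123", "ozone")
print(example_uri)
```

   In a Shiny app, open_subset() would presumably be called inside a 
reactive that depends on the relevant inputs. One thing to verify when 
testing: partition keys that sit above the opened path will likely not show 
up as columns in the resulting dataset, so they may need to be added back 
from the inputs.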
   
   I'll do some testing on this today, and please let me know if you foresee 
any issues in the meantime.
   
   I appreciate your time and input!

