tnederlof commented on issue #15056: URL: https://github.com/apache/arrow/issues/15056#issuecomment-1381011491
That's kind of surprising to me that it takes that long to open the dataset; I suspect all the partitioning is causing issues. I was able to replicate the issue you faced using the same partitioning structure (I just faked 24x the data with different intervals). Then I tried saving all of the data in a single Parquet file (it's about 1 GB), and now it runs in <0.5 s instead of 8-9 s. Could you please try saving the data as non-Hive-partitioned Parquet file(s)?
