Re: How shall I configure the Spark executor memory size and the Alluxio worker memory size on a machine?

2019-04-04 Thread Bin Fan
Oops, sorry for the confusion. I meant "20% of the size of your input data set" allocated to Alluxio as the memory resource as a starting point. After that, you can check the cache hit ratio of the Alluxio space based on the metrics collected in the Alluxio web UI
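The cache hit ratio mentioned above can be derived from Alluxio's byte-count metrics. A minimal sketch, assuming you read the "bytes read from Alluxio" and "bytes read from the under store" counters off the web UI (the exact metric names vary by Alluxio version, and the byte counts here are hypothetical):

```python
# Hedged sketch: estimate cache hit ratio from Alluxio byte-count metrics.
# The two input values are hypothetical numbers you would read from the
# metrics page of the Alluxio web UI; metric names differ across versions.
bytes_read_from_alluxio = 80 * 2**30   # e.g., 80 GiB served from Alluxio space
bytes_read_from_ufs = 20 * 2**30       # e.g., 20 GiB fetched from the under store

# Hit ratio = cached reads / total reads.
hit_ratio = bytes_read_from_alluxio / (bytes_read_from_alluxio + bytes_read_from_ufs)
print(hit_ratio)  # 0.8
```

A low ratio suggests the Alluxio memory allocation is too small for the working set, which is the signal to grow it beyond the 20% starting point.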

Re: How shall I configure the Spark executor memory size and the Alluxio worker memory size on a machine?

2019-04-04 Thread Bin Fan
Hi Andy, It really depends on your workloads. I would suggest allocating 20% of the size of your input data set as a starting point and seeing how it works. Also, depending on the data source serving as the under store of Alluxio: if it is remote (e.g., cloud storage like S3 or GCS), you can perhaps use
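The 20% rule above translates into a per-worker memory size once you fix a dataset size and worker count. A minimal sizing sketch, assuming a hypothetical 1 TB input data set spread across the 10-node cluster from the original question (the dataset size and the property name in the comment are assumptions, not from the thread):

```python
# Hedged sizing sketch: Alluxio worker memory from the "20% of input data" rule.
dataset_size_gb = 1000   # hypothetical: 1 TB input data set
num_workers = 10         # one Alluxio worker per node, matching the 10-node cluster

alluxio_total_gb = 0.20 * dataset_size_gb        # 20% starting point, cluster-wide
per_worker_gb = alluxio_total_gb / num_workers   # split evenly across workers
print(per_worker_gb)  # 20.0
# e.g., set something like alluxio.worker.memory.size=20GB in alluxio-site.properties
# (property name varies by Alluxio version; check your version's docs)
```

From there, tune up or down based on the observed cache hit ratio rather than treating 20% as a fixed rule.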

How shall I configure the Spark executor memory size and the Alluxio worker memory size on a machine?

2019-03-21 Thread u9g
Hey, We have a cluster of 10 nodes, each of which has 128GB of memory. We are about to run Spark and Alluxio on the cluster. We wonder how we should allocate the memory between the Spark executor and the Alluxio worker on each machine. Are there any recommendations? Thanks! Best, Andy Li
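One way to frame the split being asked about is as a per-node memory budget: reserve headroom for the OS and daemons, carve out the Alluxio worker's share, and give the remainder to Spark while accounting for executor overhead. A minimal sketch with illustrative numbers (all figures here are assumptions for demonstration, not recommendations from the thread):

```python
# Hedged per-node budget sketch for a 128 GB machine (numbers are illustrative).
node_mem_gb = 128
os_and_daemons_gb = 8            # assumed headroom for OS and other daemons
alluxio_worker_gb = 20           # e.g., from a 20%-of-dataset starting point
spark_overhead_fraction = 0.10   # roughly matches Spark's default off-heap overhead

# Memory left for Spark after OS and Alluxio reservations.
spark_budget_gb = node_mem_gb - os_and_daemons_gb - alluxio_worker_gb

# Heap to split across executors, leaving room for per-executor overhead.
executor_heap_gb = spark_budget_gb / (1 + spark_overhead_fraction)
print(round(executor_heap_gb))  # 91
```

The point of the sketch is that executor memory and Alluxio worker memory come out of the same 128 GB pool, so the two settings have to be chosen together rather than independently.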