Hi! I think I have the maximally horrendous solution to this problem. If
you just want to know the total cores of a Standalone or Coarse Grained
scheduler, and are OK with going "off trail" of the public API, so to
speak, you can use something like the following (just beware that it's
liable to break in any future version of Spark):
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.spark.SparkContext;

class Evil {

    static Object invokeMethod(Object ref, Class<?> c, String method)
            throws Exception {
        Method m = c.getDeclaredMethod(method);
        m.setAccessible(true);
        return m.invoke(ref);
    }

    static Object getField(Object ref, Class<?> c, String field)
            throws Exception {
        Field f = c.getDeclaredField(field);
        f.setAccessible(true);
        return f.get(ref);
    }

    public static void main(String[] args) throws Exception {
        SparkContext sc = // your SparkContext here

        // Dig through the scheduler internals:
        // SparkContext -> DAGScheduler -> TaskScheduler -> backend,
        // whose superclass (CoarseGrainedSchedulerBackend) tracks the core count.
        Object dagScheduler = invokeMethod(sc, sc.getClass(), "dagScheduler");
        Object taskSched = getField(dagScheduler, dagScheduler.getClass(),
                "org$apache$spark$scheduler$DAGScheduler$$taskSched"); // how peculiar a name!
        Object backend = getField(taskSched, taskSched.getClass(), "backend");
        AtomicInteger count = (AtomicInteger) invokeMethod(backend,
                backend.getClass().getSuperclass(), "totalCoreCount");
        System.out.println("Number of cores = " + count);
    }
}
Not only is this approach highly discouraged, it may make the faint-hearted
break down in tears.
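For what it's worth, the two helpers above (invokeMethod/getField) are generic and work against any class. Here is a self-contained sketch of the same pattern applied to a plain JDK-only toy class; the Config class is a made-up stand-in for illustration, not a Spark type:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ReflectionDemo {
    // Hypothetical stand-in with a private field and a private method,
    // analogous to the scheduler internals being pried open above.
    static class Config {
        private int cores = 8;
        private int cores() { return cores; }
    }

    static Object invokeMethod(Object ref, Class<?> c, String method)
            throws Exception {
        Method m = c.getDeclaredMethod(method);
        m.setAccessible(true); // bypass the private access check
        return m.invoke(ref);
    }

    static Object getField(Object ref, Class<?> c, String field)
            throws Exception {
        Field f = c.getDeclaredField(field);
        f.setAccessible(true); // bypass the private access check
        return f.get(ref);
    }

    public static void main(String[] args) throws Exception {
        Config conf = new Config();
        // Read the private field and call the private method reflectively.
        System.out.println("via field:  " + getField(conf, Config.class, "cores"));
        System.out.println("via method: " + invokeMethod(conf, Config.class, "cores"));
    }
}
```

The same caveat applies: anything reached this way is an implementation detail, so a field or method name that exists today can vanish in the next release without warning.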
On Thu, Nov 14, 2013 at 2:24 PM, <[email protected]> wrote:
> I would like to get info like total cores and total memory available in
> the spark cluster via an spark java api, any suggestion?
>
> This will help me in setting the right partition when invoking parallelize
> for example
>
> Thanks,
> Hussam