Hi All,

I am passing Java static methods into the RDD transformations map and
mapValues. The first map goes from a simple string key K to a pair (K, V),
where V is a Java ArrayList of large text strings (about 50 KB each) read
from Cassandra. mapValues then processes these text blocks into very small
ArrayLists.
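For concreteness, here is a minimal sketch of what the two static methods might look like. All names and bodies are hypothetical, and the Cassandra read is stubbed out with dummy data; in the Spark job these would be handed to something like rdd.mapToPair(...) and pairs.mapValues(Transforms::summarize).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Map;

public class Transforms {
    // Hypothetical stand-in for the first map: key -> (key, list of large
    // text blocks). The real version would read the blocks from Cassandra.
    static Map.Entry<String, ArrayList<String>> fetchBlocks(String key) {
        ArrayList<String> blocks = new ArrayList<>();
        blocks.add("block-a for " + key);
        blocks.add("block-b for " + key);
        return new SimpleEntry<>(key, blocks);
    }

    // Hypothetical stand-in for the mapValues step: reduce each large text
    // block to something small (here, just its first whitespace token).
    static ArrayList<String> summarize(ArrayList<String> blocks) {
        ArrayList<String> out = new ArrayList<>();
        for (String block : blocks) {
            out.add(block.split("\\s+")[0]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map.Entry<String, ArrayList<String>> kv = fetchBlocks("k1");
        ArrayList<String> small = summarize(kv.getValue());
        System.out.println(small); // prints [block-a, block-b]
    }
}
```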

The code runs quite slowly compared to running the same work in parallel
on the same servers from plain Java.

I gave the executors the same heap as the plain Java run. Does Java run
slower under Spark, am I suffering from excess heap pressure, or am I
missing something?

Thank you for any insight,
Oleg
