xiefeng created SPARK-17327:
-------------------------------
Summary: Throughput limitation in Spark standalone for a simple task
without calculation.
Key: SPARK-17327
URL: https://issues.apache.org/jira/browse/SPARK-17327
Project: Spark
Issue Type: Question
Components: Java API, Windows
Affects Versions: 1.6.2
Environment: windows server 2008 R2 standard
Reporter: xiefeng
Fix For: 1.6.2
I installed a Spark standalone cluster (one master and one worker) on a
Windows Server 2008 machine with 16 cores and 24 GB of memory.
I ran a simple test: just create a string RDD and return its first element.
I used JMeter to measure throughput, but the highest rate is only around
35 requests/sec. Spark is powerful at distributed calculation, so why is
throughput so limited in a test scenario that contains only simple task
dispatch and no calculation?
1. In JMeter I tested with both 10 and 100 threads; the difference is small,
around 2-3 requests/sec.
2. I tested both with and without caching the RDD; the difference is small,
around 1-2 requests/sec.
3. During the test, CPU and memory usage stayed low.
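One plausible explanation (an assumption, not confirmed in this report) is that every call to rddData.first() launches a full Spark job through the driver's scheduler, and that per-job launch overhead, not CPU work, caps throughput. If each action costs roughly 30 ms of scheduling, sequential throughput cannot exceed about 33/sec no matter how many JMeter threads are used. A minimal plain-Java sketch of this ceiling (no Spark dependency; the Supplier stands in for the RDD action, and the 30 ms sleep is an assumed overhead figure):

```java
import java.util.function.Supplier;

// Micro-benchmark sketch: fixed per-request latency puts a hard ceiling
// on throughput. Each supplier call models one Spark action (e.g.
// rddData.first()), which pays job-scheduling overhead before any work runs.
public class ThroughputCeiling {

    // Run `calls` sequential requests and return the observed requests/sec.
    static double measure(Supplier<String> action, int calls) {
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            action.get();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return calls / seconds;
    }

    public static void main(String[] args) {
        // Simulate ~30 ms of per-action overhead (an assumed value).
        Supplier<String> fakeAction = () -> {
            try { Thread.sleep(30); } catch (InterruptedException e) { }
            return "value";
        };
        // With 30 ms per call, throughput is bounded near 33/sec.
        System.out.println("observed requests/sec: " + measure(fakeAction, 20));
    }
}
```

This matches the report's observations: adding JMeter threads or caching the RDD barely helps, because the bottleneck is per-job latency on the driver, not parallel capacity on the worker.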
Below is my test code:
@RestController
public class SimpleTest {
    @RequestMapping(value = "/SimpleTest", method = RequestMethod.GET)
    @ResponseBody
    public String testProcessTransaction() {
        return SparkShardTest.simpleRDDTest();
    }
}
final static Map<String, JavaRDD<String>> simpleRDDs = initSimpleRDDs();

public static Map<String, JavaRDD<String>> initSimpleRDDs()
{
    Map<String, JavaRDD<String>> result =
        new ConcurrentHashMap<String, JavaRDD<String>>();
    List<String> data = Arrays.asList("value1"); // placeholder test data
    JavaRDD<String> rddData = JavaSC.parallelize(data);
    rddData.cache().count(); // this cache improves throughput by only 1-2/sec
    result.put("MyRDD", rddData);
    return result;
}

public static String simpleRDDTest()
{
    JavaRDD<String> rddData = simpleRDDs.get("MyRDD");
    return rddData.first();
}
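Since the RDD contents never change between requests, one way to sidestep the per-request job cost (a suggestion of mine, not something the report tries) is to run the Spark action once and memoize its result on the driver, serving the cached string to every HTTP request afterwards. A minimal sketch, where the Supplier stands in for rddData.first():

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Driver-side memoization sketch: compute the Spark action once per key and
// serve the cached result thereafter, so only the first HTTP request pays
// the job-scheduling overhead.
public class DriverSideCache {
    private static final ConcurrentMap<String, String> cache =
        new ConcurrentHashMap<>();

    // Returns the cached value for `key`, computing it via `action`
    // (e.g. rddData.first()) only on the first call.
    static String getOrCompute(String key, Supplier<String> action) {
        return cache.computeIfAbsent(key, k -> action.get());
    }
}
```

With this in place, testProcessTransaction() would call getOrCompute("MyRDD", ...) instead of invoking first() directly, and throughput would be limited only by the web layer.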
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]