[
https://issues.apache.org/jira/browse/SPARK-17327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-17327.
-------------------------------
Resolution: Invalid
Fix Version/s: (was: 1.6.2)
Target Version/s: (was: 1.6.2)
Questions go to [email protected], but you'd need to narrow this down
further than "why does this take a certain amount of time?"
Read https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
first.
> Throughput limitation in spark standalone of simple task without calculation.
> -----------------------------------------------------------------------------
>
> Key: SPARK-17327
> URL: https://issues.apache.org/jira/browse/SPARK-17327
> Project: Spark
> Issue Type: Question
> Components: Java API, Windows
> Affects Versions: 1.6.2
> Environment: windows server 2008 R2 standard
> Reporter: xiefeng
> Labels: performance
>
> I installed Spark standalone and run the cluster (one master and one
> worker) on a Windows Server 2008 machine with 16 cores and 24 GB of memory.
> I ran a simple test: create a string RDD and simply return it. I used
> JMeter to measure throughput, but the highest I saw was around 35/sec. Spark
> is powerful at distributed computation, so why is throughput so limited in
> such a simple test, which involves only task dispatch and no calculation?
> 1. In JMeter I tested with both 10 threads and 100 threads; the difference
> was small, around 2-3/sec.
> 2. I tested with and without caching the RDD; the difference was small,
> around 1-2/sec.
> 3. During the test, CPU and memory usage stayed low.
> Below is my test code:
> @RestController
> public class SimpleTest {
>     // Each GET request runs one Spark job via simpleRDDTest().
>     @RequestMapping(value = "/SimpleTest", method = RequestMethod.GET)
>     @ResponseBody
>     public String testProcessTransaction() {
>         return SparkShardTest.simpleRDDTest();
>     }
> }
>
> // RDDs are created once at class load and shared across requests.
> final static Map<String, JavaRDD<String>> simpleRDDs = initSimpleRDDs();
>
> public static Map<String, JavaRDD<String>> initSimpleRDDs()
> {
>     Map<String, JavaRDD<String>> result =
>         new ConcurrentHashMap<String, JavaRDD<String>>();
>     JavaRDD<String> rddData = JavaSC.parallelize(data);
>     rddData.cache().count(); // caching improves throughput by 1-2/sec
>     result.put("MyRDD", rddData);
>     return result;
> }
>
> public static String simpleRDDTest()
> {
>     JavaRDD<String> rddData = simpleRDDs.get("MyRDD");
>     return rddData.first(); // first() is an action, so it submits a job
> }
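
One way to narrow the question down, as the resolution comment asks, might be to time the call directly, outside JMeter and the REST layer. The following is a minimal sketch using only the JDK. `LatencyProbe`, `Result`, `time` and `impliedMillisPerCall` are hypothetical names that do not appear in the report, and the stand-in supplier would need to be replaced with `() -> SparkShardTest.simpleRDDTest()` in the reporter's own project. The helper `impliedMillisPerCall` only restates the reported figure: if calls were effectively served one at a time (an assumption the report does not establish), 35/sec would correspond to roughly 1000 / 35 ms per call.

```java
import java.util.function.Supplier;

public class LatencyProbe {

    /** Timing summary for repeated calls made from a single thread. */
    public static final class Result {
        public final int calls;
        public final double meanMillis;
        public final double callsPerSecond;

        Result(int calls, long totalNanos) {
            this.calls = calls;
            this.meanMillis = totalNanos / 1e6 / calls;
            this.callsPerSecond = calls / (totalNanos / 1e9);
        }
    }

    /** Per-call time implied by a throughput figure, assuming calls run one at a time. */
    public static double impliedMillisPerCall(double callsPerSecond) {
        return 1000.0 / callsPerSecond;
    }

    /** Runs the call warmup times untimed, then measures the given number of timed calls. */
    public static Result time(Supplier<String> call, int warmup, int calls) {
        for (int i = 0; i < warmup; i++) {
            call.get();
        }
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            call.get();
        }
        // Guard against a zero reading on coarse timers.
        long elapsed = Math.max(1L, System.nanoTime() - start);
        return new Result(calls, elapsed);
    }

    public static void main(String[] args) {
        // Stand-in for the real call; hypothetical, replace with the Spark call under test.
        Supplier<String> standIn = () -> "hello";
        Result r = time(standIn, 100, 1000);
        System.out.printf("calls=%d mean=%.4f ms throughput=%.1f/sec%n",
                r.calls, r.meanMillis, r.callsPerSecond);
        System.out.printf("35/sec implies about %.1f ms per call if serialized%n",
                impliedMillisPerCall(35.0));
    }
}
```

Comparing the single-threaded mean time per call with the JMeter throughput would indicate whether the limit is the per-job latency of the call itself or something in the request path around it.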