Hi,

I can't quite understand your question. Do you want to increase parallelism? If so, you can set Spark's parallelism parameter.
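As an illustration, and assuming the `parallelism: 4` value in the log below reflects Spark's default parallelism, the relevant property would be `spark.default.parallelism`. A minimal sketch of an entry in `conf/spark-defaults.conf` follows; the value 16 is only an assumption chosen to match the 16 blocks in the log, not a tuned recommendation:

```properties
# conf/spark-defaults.conf
# Assumed value: 16, picked to match the 16 blocks/tasks reported in the log.
spark.default.parallelism  16
```

The same property can also be passed as `--conf spark.default.parallelism=16` to `spark-submit` when starting the Thrift server. The number of tasks that actually run at the same time still depends on the executor cores available.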
Regards
Liang

2017-06-20 11:41 GMT+08:00 suzzy <[email protected]>:

> Hi
> Running query 'select count(1) from sunzy.datatest'
> this job had 16 blocks and 16 tasks, but only 4 partitions
> how to add RDD partition?
> thanks
>
> CarbonData ThriftServer Log:
>
> INFO 16-06 16:14:34,039 -
> Identified no.of.blocks: 16,
> no.of.tasks: 16,
> no.of.nodes: 0,
> parallelism: 4
> INFO 16-06 16:14:34,059 - Starting job: run at AccessController.java:-2
> INFO 16-06 16:14:34,060 - Registering RDD 12 (run at AccessController.java:-2)
> INFO 16-06 16:14:34,061 - Got job 1 (run at AccessController.java:-2) with 1 output partitions
> INFO 16-06 16:14:34,061 - Final stage: ResultStage 3 (run at AccessController.java:-2)
> INFO 16-06 16:14:34,061 - Parents of final stage: List(ShuffleMapStage 2)
> INFO 16-06 16:14:34,061 - Missing parents: List(ShuffleMapStage 2)
> INFO 16-06 16:14:34,062 - Submitting ShuffleMapStage 2 (MapPartitionsRDD[12] at run at AccessController.java:-2), which has no missing parents
> INFO 16-06 16:14:34,065 - Block broadcast_2 stored as values in memory (estimated size 15.4 KB, free 62.2 KB)
> INFO 16-06 16:14:34,068 - Block broadcast_2_piece0 stored as bytes in memory (estimated size 7.6 KB, free 69.8 KB)
> INFO 16-06 16:14:34,069 - Added broadcast_2_piece0 in memory on 192.168.1.41:57617 (size: 7.6 KB, free: 71.7 GB)
> INFO 16-06 16:14:34,069 - Created broadcast 2 from broadcast at DAGScheduler.scala:1006
> INFO 16-06 16:14:34,070 - Submitting 16 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[12] at run at AccessController.java:-2)
> INFO 16-06 16:14:34,070 - Adding task set 2.0 with 16 tasks
> INFO 16-06 16:14:34,072 - Starting task 2.0 in stage 2.0 (TID 16, H4, partition 2,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,073 - Starting task 0.0 in stage 2.0 (TID 17, H3, partition 0,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,073 - Starting task 1.0 in stage 2.0 (TID 18, H1, partition 1,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,074 - Starting task 4.0 in stage 2.0 (TID 19, H2, partition 4,NODE_LOCAL, 2376 bytes)
> INFO 16-06 16:14:34,089 - Added broadcast_2_piece0 in memory on H1:57002 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,096 - Added broadcast_2_piece0 in memory on H4:33086 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,116 - Added broadcast_2_piece0 in memory on H2:45618 (size: 7.6 KB, free: 57.3 GB)
> INFO 16-06 16:14:34,117 - Added broadcast_2_piece0 in memory on H3:56719 (size: 7.6 KB, free: 57.3 GB)
>
> --
> View this message in context: http://apache-carbondata-user-mailing-list.3231.n8.nabble.com/how-to-add-RDD-partition-tp31.html
> Sent from the Apache CarbonData User Mailing List mailing list archive at Nabble.com.
