That will help a lot for new users. They won't need to worry about how to get the best efficiency :-)

On Thu, Dec 22, 2016 at 4:50 PM, Mike Carey <[email protected]> wrote:

> Exactly; I was just thinking of how to lazily set the parameters without
> counting disks or cores - i.e., of how to just ask the system to run
> everything with max parallelism.

On 12/22/16 2:51 PM, Yingyi Bu wrote:

> Yes, that would work.
> However, if the cluster only has 128 available cores, it will use 128-way
> (instead of 10000-way) parallelism for the computation.
>
> Best,
> Yingyi

On Thu, Dec 22, 2016 at 12:40 PM, Mike Carey <[email protected]> wrote:

> So to use max storage parallelism for scans but max core parallelism
> subsequently, one would just specify a large positive value >= the number
> of available cores? (E.g., 10000)

On 12/22/16 11:37 AM, Yingyi Bu wrote:

> No need to reload data anymore :-)
>
> Best,
> Yingyi

On Thu, Dec 22, 2016 at 11:36 AM, Yingyi Bu <[email protected]> wrote:

> Indeed, the change was merged yesterday.
> If you grab the latest master, the computation parallelism can be set by
> the parameter compiler.parallelism:
> -- 0, the default, means to use the storage parallelism as the computation
>    parallelism;
> -- a negative value means to use all available cores in the cluster;
> -- a positive value means to use that number of cores, but it falls back
>    to all available cores if the number is too large.
>
> Best,
> Yingyi

On Thu, Dec 22, 2016 at 10:59 AM, Mike Carey <[email protected]> wrote:

> It would definitely also be fun to see how the newest, more flexible
> parallelism stuff handles this! (Where you'd specify storage parallelism
> based on drives, and compute parallelism based on cores, both spread
> across all of the cluster's resources.)

On 12/22/16 10:57 AM, Yingyi Bu wrote:

> Mingda,
>
> There is one more thing that you can do without needing to reload your
> data:
>
> -- Enlarge the buffer cache memory budget to 8GB so that the datasets can
>    fit into the buffer cache:
>    "storage.buffercache.size": 536870912
>    ->
>    "storage.buffercache.size": 8589934592
>
> You don't need to reload data; you only need to restart the AsterixDB
> instance.
> Thanks!
>
> Best,
> Yingyi

On Wed, Dec 21, 2016 at 9:22 PM, Mike Carey <[email protected]> wrote:

> Nice!!

On Dec 21, 2016 8:43 PM, "Yingyi Bu" <[email protected]> wrote:

> Cool, thanks, Mingda!
> Looking forward to the new numbers!
>
> Best,
> Yingyi

On Wed, Dec 21, 2016 at 7:13 PM, mingda li <[email protected]> wrote:

> Dear Yingyi,
>
> Thanks for your suggestion. I have reset AsterixDB and re-run the tests in
> the new environment on 100G and 10G. The efficiency of all the tests (good
> and bad order) has improved to about twice the speed.
> I will finish all the tests and update the results later.
>
> Bests,
> Mingda
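
To put the knobs from the exchange above in one place, here is a minimal cc.conf sketch. The parameter names (compiler.parallelism, storage.buffercache.size) are the ones quoted in the thread; the section placement, the comment syntax, and the chosen values are illustrative assumptions, and compiler.parallelism requires a build of the then-current master:

    ; sketch only -- section layout and values are assumptions, not from the thread
    [cc]
    ; compiler.parallelism: 0 = use the storage parallelism (default),
    ; a negative value = use all available cores in the cluster,
    ; a positive value = use that many cores (falls back to all cores
    ; if the number is larger than what the cluster has)
    compiler.parallelism=-1

    [nc]
    ; assumed placement for the buffer cache budget discussed above:
    ; 536870912 (512MB) -> 8589934592 (8GB); needs an instance restart,
    ; but no data reload
    storage.buffercache.size=8589934592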
On Tue, Dec 20, 2016 at 10:20 PM, Yingyi Bu <[email protected]> wrote:

> Hi Mingda,
>
> I think that in your setting, a better configuration for AsterixDB might
> be to use 64 partitions, i.e., 4 cores * 16 machines.
>
> 1. To achieve that, you have to have 4 iodevices on each NC, e.g.:
>    iodevices=/home/clash/asterixStorage/asterixdb5/red16
>    -->
>    iodevices=/home/clash/asterixStorage/asterixdb5/red16-1,
>    iodevices=/home/clash/asterixStorage/asterixdb5/red16-2,
>    iodevices=/home/clash/asterixStorage/asterixdb5/red16-3,
>    iodevices=/home/clash/asterixStorage/asterixdb5/red16-4
>
> 2. Assuming you have 64 partitions, 31.01G / 64 ~= 0.5G. That means you'd
>    better have a 512MB memory budget for each joiner so as to make the
>    join memory-resident. To achieve that, in the cc section of cc.conf,
>    you could add:
>    compiler.joinmemory=536870912
>
> 3. For the JVM setting, 1024MB is too small for the NC. In the shared NC
>    section in cc.conf, you can add:
>    jvm.args=-Xmx16G
>
> 4. For Pig and Hive, you can set the maximum mapper/reducer numbers in
>    the MapReduce configuration, e.g., at most 4 mappers and at most 4
>    reducers per machine.
>
> 5. I'm not super-familiar with hyper-threading, but it might be worth
>    trying 8 partitions per machine, i.e., 128 partitions in total.
>
> To validate that the new settings work, you can go to the admin/cluster
> page to double-check.
>
> Pls keep us updated and let us know if you run into any issues.
> Thanks!
>
> Best,
> Yingyi
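
For concreteness, items 1-3 above translate into cc.conf roughly as follows. This is only a sketch: just one of the sixteen per-node sections is shown, and the [nc/red16] section name plus the single comma-separated iodevices line are my reading of item 1 rather than something stated verbatim in the thread:

    ; sketch of items 1-3 (one NC shown; the other fifteen follow the same pattern)
    [nc/red16]
    ; item 1: four iodevices per NC -> 4 * 16 = 64 storage partitions
    iodevices=/home/clash/asterixStorage/asterixdb5/red16-1,/home/clash/asterixStorage/asterixdb5/red16-2,/home/clash/asterixStorage/asterixdb5/red16-3,/home/clash/asterixStorage/asterixdb5/red16-4

    [nc]
    ; item 3: shared NC section -- raise the NC JVM heap from the default 1024MB
    jvm.args=-Xmx16G

    [cc]
    ; item 2: 31.01G / 64 partitions ~= 0.5G, so give each joiner 512MB
    compiler.joinmemory=536870912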
On Tue, Dec 20, 2016 at 9:26 PM, mingda li <[email protected]> wrote:

> Dear Yingyi,
>
> 1. Here is what http://<master node>:19002/admin/cluster returns:
>
> {
>   "cc": {
>     "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/config",
>     "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/stats",
>     "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/threaddump"
>   },
>   "config": {
>     "api.port": 19002,
>     "cc.java.opts": "-Xmx1024m",
>     "cluster.partitions": {
>       "0": "ID:0, Original Node: red16, IODevice: 0, Active Node: red16",
>       "1": "ID:1, Original Node: red15, IODevice: 0, Active Node: red15",
>       "2": "ID:2, Original Node: red14, IODevice: 0, Active Node: red14",
>       "3": "ID:3, Original Node: red13, IODevice: 0, Active Node: red13",
>       "4": "ID:4, Original Node: red12, IODevice: 0, Active Node: red12",
>       "5": "ID:5, Original Node: red11, IODevice: 0, Active Node: red11",
>       "6": "ID:6, Original Node: red10, IODevice: 0, Active Node: red10",
>       "7": "ID:7, Original Node: red9, IODevice: 0, Active Node: red9",
>       "8": "ID:8, Original Node: red8, IODevice: 0, Active Node: red8",
>       "9": "ID:9, Original Node: red7, IODevice: 0, Active Node: red7",
>       "10": "ID:10, Original Node: red6, IODevice: 0, Active Node: red6",
>       "11": "ID:11, Original Node: red5, IODevice: 0, Active Node: red5",
>       "12": "ID:12, Original Node: red4, IODevice: 0, Active Node: red4",
>       "13": "ID:13, Original Node: red3, IODevice: 0, Active Node: red3",
>       "14": "ID:14, Original Node: red2, IODevice: 0, Active Node: red2",
>       "15": "ID:15, Original Node: red, IODevice: 0, Active Node: red"
>     },
>     "compiler.framesize": 32768,
>     "compiler.groupmemory": 33554432,
>     "compiler.joinmemory": 33554432,
>     "compiler.pregelix.home": "~/pregelix",
>     "compiler.sortmemory": 33554432,
>     "core.dump.paths": {
>       "red": "/home/clash/asterixStorage/asterixdb5/red/coredump",
>       "red10": "/home/clash/asterixStorage/asterixdb5/red10/coredump",
>       "red11": "/home/clash/asterixStorage/asterixdb5/red11/coredump",
>       "red12": "/home/clash/asterixStorage/asterixdb5/red12/coredump",
>       "red13": "/home/clash/asterixStorage/asterixdb5/red13/coredump",
>       "red14": "/home/clash/asterixStorage/asterixdb5/red14/coredump",
>       "red15": "/home/clash/asterixStorage/asterixdb5/red15/coredump",
>       "red16": "/home/clash/asterixStorage/asterixdb5/red16/coredump",
>       "red2": "/home/clash/asterixStorage/asterixdb5/red2/coredump",
>       "red3": "/home/clash/asterixStorage/asterixdb5/red3/coredump",
>       "red4": "/home/clash/asterixStorage/asterixdb5/red4/coredump",
>       "red5": "/home/clash/asterixStorage/asterixdb5/red5/coredump",
>       "red6": "/home/clash/asterixStorage/asterixdb5/red6/coredump",
>       "red7": "/home/clash/asterixStorage/asterixdb5/red7/coredump",
>       "red8": "/home/clash/asterixStorage/asterixdb5/red8/coredump",
>       "red9": "/home/clash/asterixStorage/asterixdb5/red9/coredump"
>     },
>     "feed.central.manager.port": 4500,
>     "feed.max.threshold.period": 5,
>     "feed.memory.available.wait.timeout": 10,
>     "feed.memory.global.budget": 67108864,
>     "feed.pending.work.threshold": 50,
>     "feed.port": 19003,
>     "instance.name": "DEFAULT_INSTANCE",
>     "log.level": "WARNING",
>     "max.wait.active.cluster": 60,
>     "metadata.callback.port": 0,
>     "metadata.node": "red16",
>     "metadata.partition": "ID:0, Original Node: red16, IODevice: 0, Active Node: red16",
>     "metadata.port": 0,
>     "metadata.registration.timeout.secs": 60,
>     "nc.java.opts": "-Xmx1024m",
>     "node.partitions": {
>       "red": ["ID:15, Original Node: red, IODevice: 0, Active Node: red"],
>       "red10": ["ID:6, Original Node: red10, IODevice: 0, Active Node: red10"],
>       "red11": ["ID:5, Original Node: red11, IODevice: 0, Active Node: red11"],
>       "red12": ["ID:4, Original Node: red12, IODevice: 0, Active Node: red12"],
>       "red13": ["ID:3, Original Node: red13, IODevice: 0, Active Node: red13"],
>       "red14": ["ID:2, Original Node: red14, IODevice: 0, Active Node: red14"],
>       "red15": ["ID:1, Original Node: red15, IODevice: 0, Active Node: red15"],
>       "red16": ["ID:0, Original Node: red16, IODevice: 0, Active Node: red16"],
>       "red2": ["ID:14, Original Node: red2, IODevice: 0, Active Node: red2"],
>       "red3": ["ID:13, Original Node: red3, IODevice: 0, Active Node: red3"],
>       "red4": ["ID:12, Original Node: red4, IODevice: 0, Active Node: red4"],
>       "red5": ["ID:11, Original Node: red5, IODevice: 0, Active Node: red5"],
>       "red6": ["ID:10, Original Node: red6, IODevice: 0, Active Node: red6"],
>       "red7": ["ID:9, Original Node: red7, IODevice: 0, Active Node: red7"],
>       "red8": ["ID:8, Original Node: red8, IODevice: 0, Active Node: red8"],
>       "red9": ["ID:7, Original Node: red9, IODevice: 0, Active Node: red9"]
>     },
>     "node.stores": {
>       "red": ["/home/clash/asterixStorage/asterixdb5/red/storage"],
>       "red10": ["/home/clash/asterixStorage/asterixdb5/red10/storage"],
>       "red11": ["/home/clash/asterixStorage/asterixdb5/red11/storage"],
>       "red12": ["/home/clash/asterixStorage/asterixdb5/red12/storage"],
>       "red13": ["/home/clash/asterixStorage/asterixdb5/red13/storage"],
>       "red14": ["/home/clash/asterixStorage/asterixdb5/red14/storage"],
>       "red15": ["/home/clash/asterixStorage/asterixdb5/red15/storage"],
>       "red16": ["/home/clash/asterixStorage/asterixdb5/red16/storage"],
>       "red2": ["/home/clash/asterixStorage/asterixdb5/red2/storage"],
>       "red3": ["/home/clash/asterixStorage/asterixdb5/red3/storage"],
>       "red4": ["/home/clash/asterixStorage/asterixdb5/red4/storage"],
>       "red5": ["/home/clash/asterixStorage/asterixdb5/red5/storage"],
>       "red6": ["/home/clash/asterixStorage/asterixdb5/red6/storage"],
>       "red7": ["/home/clash/asterixStorage/asterixdb5/red7/storage"],
>       "red8": ["/home/clash/asterixStorage/asterixdb5/red8/storage"],
>       "red9": ["/home/clash/asterixStorage/asterixdb5/red9/storage"]
>     },
>     "plot.activate": false,
>     "replication.enabled": false,
>     "replication.factor": 2,
>     "replication.log.batchsize": 4096,
>     "replication.log.buffer.numpages": 8,
>     "replication.log.buffer.pagesize": 131072,
>     "replication.max.remote.recovery.attempts": 5,
>     "replication.timeout": 30,
>     "storage.buffercache.maxopenfiles": 2147483647,
>     "storage.buffercache.pagesize": 131072,
>     "storage.buffercache.size": 536870912,
>     "storage.lsm.bloomfilter.falsepositiverate": 0.01,
>     "storage.memorycomponent.globalbudget": 536870912,
>     "storage.memorycomponent.numcomponents": 2,
>     "storage.memorycomponent.numpages": 256,
>     "storage.memorycomponent.pagesize": 131072,
>     "storage.metadata.memorycomponent.numpages": 256,
>     "transaction.log.dirs": {
>       "red": "/home/clash/asterixStorage/asterixdb5/red/txnlog",
>       "red10": "/home/clash/asterixStorage/asterixdb5/red10/txnlog",
>       "red11": "/home/clash/asterixStorage/asterixdb5/red11/txnlog",
>       "red12": "/home/clash/asterixStorage/asterixdb5/red12/txnlog",
>       "red13": "/home/clash/asterixStorage/asterixdb5/red13/txnlog",
>       "red14": "/home/clash/asterixStorage/asterixdb5/red14/txnlog",
>       "red15": "/home/clash/asterixStorage/asterixdb5/red15/txnlog",
>       "red16": "/home/clash/asterixStorage/asterixdb5/red16/txnlog",
>       "red2": "/home/clash/asterixStorage/asterixdb5/red2/txnlog",
>       "red3": "/home/clash/asterixStorage/asterixdb5/red3/txnlog",
>       "red4": "/home/clash/asterixStorage/asterixdb5/red4/txnlog",
>       "red5": "/home/clash/asterixStorage/asterixdb5/red5/txnlog",
>       "red6": "/home/clash/asterixStorage/asterixdb5/red6/txnlog",
>       "red7": "/home/clash/asterixStorage/asterixdb5/red7/txnlog",
>       "red8": "/home/clash/asterixStorage/asterixdb5/red8/txnlog",
>       "red9": "/home/clash/asterixStorage/asterixdb5/red9/txnlog"
>     },
>     "txn.commitprofiler.reportinterval": 5,
>     "txn.job.recovery.memorysize": 67108864,
>     "txn.lock.escalationthreshold": 1000,
>     "txn.lock.shrinktimer": 5000,
>     "txn.lock.timeout.sweepthreshold": 10000,
>     "txn.lock.timeout.waitthreshold": 60000,
>     "txn.log.buffer.numpages": 8,
>     "txn.log.buffer.pagesize": 131072,
>     "txn.log.checkpoint.history": 0,
>     "txn.log.checkpoint.lsnthreshold": 67108864,
>     "txn.log.checkpoint.pollfrequency": 120,
>     "txn.log.partitionsize": 268435456,
>     "web.port": 19001,
>     "web.queryinterface.port": 19006,
>     "web.secondary.port": 19005
>   },
>   "fullShutdownUri": "http://scai01.cs.ucla.edu:19002/admin/shutdown?all=true",
>   "metadata_node": "red16",
>   "ncs": [
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/config", "node_id": "red15", "partitions": [{ "active": true, "partition_id": "partition_1" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/config", "node_id": "red14", "partitions": [{ "active": true, "partition_id": "partition_2" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/config", "node_id": "red16", "partitions": [{ "active": true, "partition_id": "partition_0" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/config", "node_id": "red11", "partitions": [{ "active": true, "partition_id": "partition_5" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/config", "node_id": "red10", "partitions": [{ "active": true, "partition_id": "partition_6" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/config", "node_id": "red13", "partitions": [{ "active": true, "partition_id": "partition_3" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/config", "node_id": "red12", "partitions": [{ "active": true, "partition_id": "partition_4" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/config", "node_id": "red6", "partitions": [{ "active": true, "partition_id": "partition_10" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/config", "node_id": "red", "partitions": [{ "active": true, "partition_id": "partition_15" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/config", "node_id": "red5", "partitions": [{ "active": true, "partition_id": "partition_11" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/config", "node_id": "red8", "partitions": [{ "active": true, "partition_id": "partition_8" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/config", "node_id": "red7", "partitions": [{ "active": true, "partition_id": "partition_9" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/config", "node_id": "red2", "partitions": [{ "active": true, "partition_id": "partition_14" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/config", "node_id": "red4", "partitions": [{ "active": true, "partition_id": "partition_12" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/config", "node_id": "red3", "partitions": [{ "active": true, "partition_id": "partition_13" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/threaddump" },
>     { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/config", "node_id": "red9", "partitions": [{ "active": true, "partition_id": "partition_7" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/threaddump" }
>   ],
>   "shutdownUri": "http://scai01.cs.ucla.edu:19002/admin/shutdown",
>   "state": "ACTIVE",
>   "versionUri": "http://scai01.cs.ucla.edu:19002/admin/version"
> }
>
> 2. The individual dataset sizes are: catalog_returns: 2.28G,
>    catalog_sales: 31.01G, inventory: 8.63G.
>
> 3. As for Pig and Hive, I always use the default configuration; I didn't
>    set any partitioning for them. For Spark, we use 200 partitions, which
>    may still be improvable but is not bad. For AsterixDB, I also set up
>    the cluster with the default partition and JVM values (I didn't
>    manually set these parameters).
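
For comparison with the AsterixDB partition counts discussed above, the "200 partitions" on the Spark side would typically be controlled by settings like the following in spark-defaults.conf. This is an assumption about how that number was configured, not something stated in the thread:

    # assumption: "200 partitions" refers to Spark's default / shuffle parallelism
    spark.default.parallelism     200
    spark.sql.shuffle.partitions  200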
On Tue, Dec 20, 2016 at 5:58 PM, Yingyi Bu <[email protected]> wrote:

> Mingda,
>
> 1. Can you paste the returned JSON of http://<master node>:19002/admin/cluster
>    at your side? (Pls replace <master node> with the actual master node
>    name or IP.)
>
> 2. Can you list the individual size of each dataset involved in the
>    query, e.g., catalog_returns, catalog_sales, and inventory? (I assume
>    100GB is the overall size?)
>
> 3. Do Spark/Hive/Pig saturate all CPUs on all machines, i.e., how many
>    partitions are running on each machine? (It seems that your AsterixDB
>    configuration wouldn't saturate all CPUs for queries --- in the
>    current AsterixDB master, the computation parallelism is set to be the
>    same as the storage parallelism, i.e., the number of iodevices on each
>    NC. I've submitted a new patch that allows flexible computation
>    parallelism, which should be merged into master very soon.)
>
> Thanks!
>
> Best,
> Yingyi

On Tue, Dec 20, 2016 at 5:44 PM, mingda li <[email protected]> wrote:

> Oh, sure. When we test the 100G multiple join, we find AsterixDB is
> slower than Spark (but still faster than Pig and Hive).
> I can share both plots with you: 1-10G.eps and 1-100G.eps. (We will only
> use 1-10G.eps in our paper.)
>
> And thanks for Ian's advice: *The dev list generally strips attachments.
> Maybe you can just put the config inline? Or link to a pastebin/gist?*
> Now I know why you couldn't see the attachments, so I have moved the
> plots and the two documents to my Dropbox.
>
> You can find:
> 1-10G.eps here: https://www.dropbox.com/s/rk3xg6gigsfcuyq/1-10G.eps?dl=0
> 1-100G.eps here: https://www.dropbox.com/s/tyxnmt6ehau2ski/1-100G.eps?dl=0
> cc_conf.pdf here: https://www.dropbox.com/s/y3of1s17qdstv5f/cc_conf.pdf?dl=0
> CompleteQuery.pdf here: https://www.dropbox.com/s/lml3fzxfjcmf2c1/CompleteQuery.pdf?dl=0
On Tue, Dec 20, 2016 at 4:40 PM, Tyson Condie <[email protected]> wrote:

> Mingda: Please also share the numbers for 100GB, which show AsterixDB not
> quite doing as well as Spark. These 100GB results will not be in our
> submission version, since they're not needed for the desired message:
> picking the right join order matters. Nevertheless, I'd like to get a
