Yes, that would work. However, if the cluster only has 128 available cores, it will use 128-way parallelism (rather than 10,000-way) for the computation.

Best,
Yingyi
On Thu, Dec 22, 2016 at 12:40 PM, Mike Carey <[email protected]> wrote:

So to use maxi-storage parallelism for scans but maxi-core parallelism subsequently, one would just specify a large positive value >= the number of available cores? (E.g., 10000)

On 12/22/16 11:37 AM, Yingyi Bu wrote:

No need to reload data anymore :-)

Best,
Yingyi

On Thu, Dec 22, 2016 at 11:36 AM, Yingyi Bu <[email protected]> wrote:

Indeed, the change was merged yesterday. If you grab the latest master, the computation parallelism can be set by the parameter compiler.parallelism:
-- 0, the default, means to use the storage parallelism as the computation parallelism;
-- a negative value means to use all available cores in the cluster;
-- a positive value means to use that number of cores, but it falls back to using all available cores if the number is too large.

Best,
Yingyi
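For illustration, a minimal sketch of how this could be set: the other compiler.* parameters discussed later in this thread go in the cc section of the ini-style cc.conf, so compiler.parallelism would presumably live there as well (the section name and placement are assumptions, not taken from the patch itself):

    [cc]
    ; Cap computation parallelism at 128 ways.
    ; 0 follows storage parallelism; a negative value uses all available cores.
    compiler.parallelism=128

With a value larger than the number of available cores (e.g., 10000 on a 128-core cluster), the engine falls back to all available cores, which is the 128-way behavior described above.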
On Thu, Dec 22, 2016 at 10:59 AM, Mike Carey <[email protected]> wrote:

It would definitely also be fun to see how the newest, more flexible parallelism stuff handles this! (Where you'd specify storage parallelism based on drives, and compute parallelism based on cores, both spread across all of the cluster's resources.)

On 12/22/16 10:57 AM, Yingyi Bu wrote:

Mingda,

There is one more thing that you can do without the need to reload your data:

-- Enlarge the buffer cache memory budget to 8GB so that the datasets can fit into the buffer cache:
   "storage.buffercache.size": 536870912
   ->
   "storage.buffercache.size": 8589934592

You don't need to reload data; you only need to restart the AsterixDB instance. Thanks!

Best,
Yingyi
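As a sketch of where that change would go (assuming the storage.* parameters belong in the shared NC section of the same ini-style cc.conf referenced below; the section name is an assumption):

    [nc]
    ; 8 GB buffer cache (8589934592 bytes) instead of the 512 MB shown in the config dump below.
    storage.buffercache.size=8589934592

A restart of the instance picks up the new budget; the data on disk is untouched.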
On Wed, Dec 21, 2016 at 9:22 PM, Mike Carey <[email protected]> wrote:

Nice!!

On Dec 21, 2016 8:43 PM, "Yingyi Bu" <[email protected]> wrote:

Cool, thanks, Mingda!
Looking forward to the new numbers!

Best,
Yingyi

On Wed, Dec 21, 2016 at 7:13 PM, mingda li <[email protected]> wrote:

Dear Yingyi,

Thanks for your suggestion. I have reset AsterixDB and retested it in the new environment on 100G and 10G. All the tests (good and bad join orders) now run about twice as fast. I will finish all the tests and update the results later.

Bests,
Mingda

On Tue, Dec 20, 2016 at 10:20 PM, Yingyi Bu <[email protected]> wrote:

Hi Mingda,

I think that in your setting, a better configuration for AsterixDB might be to use 64 partitions, i.e., 4 cores * 16 machines. (A combined sketch of these configuration changes follows this list.)

1. To achieve that, you have to have 4 iodevices on each NC, e.g.:
   iodevices=/home/clash/asterixStorage/asterixdb5/red16
   -->
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-1,
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-2,
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-3,
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-4

2. Assuming you have 64 partitions, 31.01G / 64 ~= 0.5G. That means you'd better have a 512MB memory budget for each joiner so as to make the join memory-resident. To achieve that, in the cc section of cc.conf, you could add:
   compiler.joinmemory=536870912

3. For the JVM setting, 1024MB is too small for the NC. In the shared NC section of cc.conf, you can add:
   jvm.args=-Xmx16G

4. For Pig and Hive, you can set the maximum mapper/reducer numbers in the MapReduce configuration, e.g., at most 4 mappers and at most 4 reducers per machine.

5. I'm not super-familiar with hyper-threading, but it might be worth trying 8 partitions per machine, i.e., 128 partitions in total.
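Put together, points 1-3 would look roughly like the cc.conf fragment below. This is a sketch only: it assumes the shared-NC / per-NC / cc section layout referred to above, shows a single node (red16) standing in for each of the 16 NCs, and writes the four iodevices as one comma-separated list (the exact list syntax may differ):

    [nc]
    ; Point 3: a larger JVM heap for every NC (shared NC section).
    jvm.args=-Xmx16G

    [nc/red16]
    ; Point 1: four iodevices per NC gives 4 storage partitions per machine, 64 in total.
    iodevices=/home/clash/asterixStorage/asterixdb5/red16-1,/home/clash/asterixStorage/asterixdb5/red16-2,/home/clash/asterixStorage/asterixdb5/red16-3,/home/clash/asterixStorage/asterixdb5/red16-4

    [cc]
    ; Point 2: 512 MB per joiner, so a ~0.5 GB partition of catalog_sales stays memory-resident.
    compiler.joinmemory=536870912

For point 4, on a classic MRv1-style Hadoop deployment the per-machine task caps are cluster-side properties in mapred-site.xml (property names assumed for MRv1; a YARN cluster bounds concurrent tasks per node indirectly through container memory instead):

    <!-- mapred-site.xml: at most 4 concurrent map and 4 reduce tasks per machine -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>4</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>4</value>
    </property>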
red11"], >>>>>>>> >>>>>>>>> "red12": ["ID:4, Original Node: red12, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red12"], >>>>>>>> >>>>>>>>> "red13": ["ID:3, Original Node: red13, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red13"], >>>>>>>> >>>>>>>>> "red14": ["ID:2, Original Node: red14, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red14"], >>>>>>>> >>>>>>>>> "red15": ["ID:1, Original Node: red15, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red15"], >>>>>>>> >>>>>>>>> "red16": ["ID:0, Original Node: red16, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red16"], >>>>>>>> >>>>>>>>> "red2": ["ID:14, Original Node: red2, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red2"], >>>>>>> >>>>>>>> "red3": ["ID:13, Original Node: red3, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red3"], >>>>>>> >>>>>>>> "red4": ["ID:12, Original Node: red4, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red4"], >>>>>>> >>>>>>>> "red5": ["ID:11, Original Node: red5, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red5"], >>>>>>> >>>>>>>> "red6": ["ID:10, Original Node: red6, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red6"], >>>>>>> >>>>>>>> "red7": ["ID:9, Original Node: red7, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red7"], >>>>>>> >>>>>>>> "red8": ["ID:8, Original Node: red8, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red8"], >>>>>>> >>>>>>>> "red9": ["ID:7, Original Node: red9, IODevice: 0, >>>>>>>>>> >>>>>>>>>> Active >>>>>>>>> >>>>>>>> Node: red9"] >>>>>>> >>>>>>>> }, >>>>>>>>>> "node.stores": { >>>>>>>>>> "red": ["/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red/storage"], >>>>>>>>> >>>>>>>>> "red10": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red10/storage"], >>>>>>>>>> "red11": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red11/storage"], >>>>>>>>>> "red12": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red12/storage"], >>>>>>>>>> "red13": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red13/storage"], >>>>>>>>>> "red14": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red14/storage"], >>>>>>>>>> "red15": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red15/storage"], >>>>>>>>>> "red16": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red16/storage"], >>>>>>>>>> "red2": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red2/storage"], >>>>>>>>>> "red3": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red3/storage"], >>>>>>>>>> "red4": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red4/storage"], >>>>>>>>>> "red5": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red5/storage"], >>>>>>>>>> "red6": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red6/storage"], >>>>>>>>>> "red7": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red7/storage"], >>>>>>>>>> "red8": ["/home/clash/asterixStorage/ >>>>>>>>>> asterixdb5/red8/storage"], >>>>>>>>>> "red9": ["/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red9/storage"] >>>>>>>>> >>>>>>>>> }, >>>>>>>>>> "plot.activate": false, >>>>>>>>>> "replication.enabled": false, >>>>>>>>>> "replication.factor": 2, >>>>>>>>>> "replication.log.batchsize": 4096, >>>>>>>>>> "replication.log.buffer.numpages": 8, >>>>>>>>>> "replication.log.buffer.pagesize": 131072, >>>>>>>>>> "replication.max.remote.recovery.attempts": 5, >>>>>>>>>> 
"replication.timeout": 30, >>>>>>>>>> "storage.buffercache.maxopenfiles": 2147483647, >>>>>>>>>> "storage.buffercache.pagesize": 131072, >>>>>>>>>> "storage.buffercache.size": 536870912, >>>>>>>>>> "storage.lsm.bloomfilter.falsepositiverate": 0.01, >>>>>>>>>> "storage.memorycomponent.globalbudget": 536870912, >>>>>>>>>> "storage.memorycomponent.numcomponents": 2, >>>>>>>>>> "storage.memorycomponent.numpages": 256, >>>>>>>>>> "storage.memorycomponent.pagesize": 131072, >>>>>>>>>> "storage.metadata.memorycomponent.numpages": 256, >>>>>>>>>> "transaction.log.dirs": { >>>>>>>>>> "red": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red/txnlog", >>>>>>>>> >>>>>>>> "red10": "/home/clash/asterixStorage/ >>>>>>>> >>>>>>>>> asterixdb5/red10/txnlog", >>>>>>>>> >>>>>>>>> "red11": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red11/txnlog", >>>>>>>>> >>>>>>>>> "red12": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red12/txnlog", >>>>>>>>> >>>>>>>>> "red13": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red13/txnlog", >>>>>>>>> >>>>>>>>> "red14": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red14/txnlog", >>>>>>>>> >>>>>>>>> "red15": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red15/txnlog", >>>>>>>>> >>>>>>>>> "red16": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red16/txnlog", >>>>>>>>> >>>>>>>>> "red2": "/home/clash/asterixStorage/ >>>>>>>>>> >>>>>>>>>> asterixdb5/red2/txnlog", >>>>>>>>> "red3": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red3/txnlog", >>>>>>>>> "red4": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red4/txnlog", >>>>>>>>> "red5": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red5/txnlog", >>>>>>>>> "red6": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red6/txnlog", >>>>>>>>> "red7": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red7/txnlog", >>>>>>>>> "red8": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red8/txnlog", >>>>>>>>> "red9": "/home/clash/asterixStorage/ >>>>>>>>> asterixdb5/red9/txnlog" >>>>>>>>> }, >>>>>>>>> >>>>>>>>>> "txn.commitprofiler.reportinterval": 5, >>>>>>>>>> "txn.job.recovery.memorysize": 67108864, >>>>>>>>>> "txn.lock.escalationthreshold": 1000, >>>>>>>>>> "txn.lock.shrinktimer": 5000, >>>>>>>>>> "txn.lock.timeout.sweepthreshold": 10000, >>>>>>>>>> "txn.lock.timeout.waitthreshold": 60000, >>>>>>>>>> "txn.log.buffer.numpages": 8, >>>>>>>>>> "txn.log.buffer.pagesize": 131072, >>>>>>>>>> "txn.log.checkpoint.history": 0, >>>>>>>>>> "txn.log.checkpoint.lsnthreshold": 67108864, >>>>>>>>>> "txn.log.checkpoint.pollfrequency": 120, >>>>>>>>>> "txn.log.partitionsize": 268435456, >>>>>>>>>> "web.port": 19001, >>>>>>>>>> "web.queryinterface.port": 19006, >>>>>>>>>> "web.secondary.port": 19005 >>>>>>>>>> }, >>>>>>>>>> "fullShutdownUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/shutdown?all=true", >>>>>>>>>> "metadata_node": "red16", >>>>>>>>>> "ncs": [ >>>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red15", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_1" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red15/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> 
>>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red14", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_2" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red14/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red16", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_0" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red16/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red11", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_5" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red11/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red10", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_6" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red10/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red13", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_3" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red13/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/config >>>>>>>>>> ", >>>>>>>>>> "node_id": "red12", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_4" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/ >>>>>>>>>> >>>>>>>>>> red12/threaddump >>>>>>>>> >>>>>>>> " >>>>>>> >>>>>>> }, >>>>>>>> >>>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> 
"http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/config", >>>>>>>>>> "node_id": "red6", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_10" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/config", >>>>>>>>>> "node_id": "red", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_15" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/thre >>>>>>>>>> addump >>>>>>>>>> >>>>>>>>>> " >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/config", >>>>>>>>>> "node_id": "red5", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_11" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/config", >>>>>>>>>> "node_id": "red8", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_8" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/config", >>>>>>>>>> "node_id": "red7", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_9" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/config", >>>>>>>>>> "node_id": "red2", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_14" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/config", >>>>>>>>>> "node_id": "red4", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_12" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> 
"http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/config", >>>>>>>>>> "node_id": "red3", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_13" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> }, >>>>>>> >>>>>>>> { >>>>>>>>>> "configUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/config", >>>>>>>>>> "node_id": "red9", >>>>>>>>>> "partitions": [{ >>>>>>>>>> "active": true, >>>>>>>>>> "partition_id": "partition_7" >>>>>>>>>> }], >>>>>>>>>> "state": "ACTIVE", >>>>>>>>>> "statsUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/stats", >>>>>>>>>> "threadDumpUri": >>>>>>>>>> "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/ >>>>>>>>>> >>>>>>>>>> threaddump" >>>>>>>>> >>>>>>>> } >>>>>>> >>>>>>>> ], >>>>>>>>>> "shutdownUri": "http://scai01.cs.ucla.edu:190 >>>>>>>>>> 02/admin/shutdown >>>>>>>>>> >>>>>>>>>> ", >>>>>>>>> >>>>>>>> "state": "ACTIVE", >>>>>>> >>>>>>>> "versionUri": "http://scai01.cs.ucla.edu:19002/admin/version" >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> 2.Catalog_return:2.28G >>>>>>>>>> >>>>>>>>>> catalog_sales:31.01G >>>>>>>>>> >>>>>>>>>> inventory:8.63G >>>>>>>>>> >>>>>>>>>> 3.As for Pig and Hive, I always use the default configuration. I >>>>>>>>>> >>>>>>>>>> didn't >>>>>>>>> >>>>>>>> set >>>>>>>> >>>>>>>>> the partition things for them. And for Spark, we use 200 >>>>>>>>>> >>>>>>>>>> partitions, >>>>>>>>> >>>>>>>> which >>>>>>> >>>>>>>> may be improved and just not bad. For AsterixDB, I also set the >>>>>>>>>> >>>>>>>>>> cluster >>>>>>>>> >>>>>>>> using default value of partition and JVM things (I didn't manually >>>>>>>> >>>>>>>>> set >>>>>>>>> >>>>>>>> these parameters). >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Dec 20, 2016 at 5:58 PM, Yingyi Bu <[email protected]> >>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>> >>>>>>>> Mingda, >>>>>>>> >>>>>>>>> 1. Can you paste the returned JSON of http://<master >>>>>>>>>>> node>:19002/admin/cluster at your side? (Pls replace <master >>>>>>>>>>> >>>>>>>>>>> node> >>>>>>>>>> >>>>>>>>> with >>>>>>> >>>>>>>> the >>>>>>>>> >>>>>>>>>> actual master node name or IP) >>>>>>>>>>> 2. Can you list the individual size of each dataset >>>>>>>>>>> involved >>>>>>>>>>> >>>>>>>>>>> in >>>>>>>>>> >>>>>>>>> the >>>>>>>> >>>>>>>>> query, e.g., catalog_returns, catalog_sales, and inventory? (I >>>>>>>>>> assume >>>>>>>>>> >>>>>>>>> 100GB is the overall size?) >>>>>>>>> >>>>>>>>>> 3. Do Spark/Hive/Pig saturate all CPUs on all machines, >>>>>>>>>>> >>>>>>>>>>> i.e., >>>>>>>>>> >>>>>>>>> how >>>>>>> >>>>>>>> many >>>>>>>>> >>>>>>>>>> partitions are running on each machine? 
On Tue, Dec 20, 2016 at 5:44 PM, mingda li <[email protected]> wrote:

Oh, sure. When we test the 100G multi-way join, we find AsterixDB is slower than Spark (but still faster than Pig and Hive). I can share both plots with you: 1-10G.eps and 1-100G.eps. (We will only use 1-10G.eps in our paper.)

And thanks for Ian's advice: "The dev list generally strips attachments. Maybe you can just put the config inline? Or link to a pastebin/gist?" Now I know why you can't see the attachments, so I moved the plots and the two documents to my Dropbox. You can find:
1-10G.eps here: https://www.dropbox.com/s/rk3xg6gigsfcuyq/1-10G.eps?dl=0
1-100G.eps here: https://www.dropbox.com/s/tyxnmt6ehau2ski/1-100G.eps?dl=0
cc_conf.pdf here: https://www.dropbox.com/s/y3of1s17qdstv5f/cc_conf.pdf?dl=0
CompleteQuery.pdf here: https://www.dropbox.com/s/lml3fzxfjcmf2c1/CompleteQuery.pdf?dl=0

On Tue, Dec 20, 2016 at 4:40 PM, Tyson Condie <[email protected]> wrote:

Mingda: Please also share the numbers for 100GB, which show AsterixDB not quite doing as well as Spark. These 100GB results will not be in our submission version, since they're not needed for the desired message: picking the right join order matters. Nevertheless, I'd like to get a ...
