On 12/22/16 2:51 PM, Yingyi Bu wrote:
Yes, that would work. However, if the cluster only has 128 available cores, it will use 128-way (rather than 10000-way) parallelism for the computation.

Best,
Yingyi

On Thu, Dec 22, 2016 at 12:40 PM, Mike Carey <[email protected]> wrote:

So to use maxi-storage parallelism for scans but maxi-core parallelism subsequently, one would just specify a large positive value >= the number of available cores? (E.g., 10000)

On 12/22/16 11:37 AM, Yingyi Bu wrote:

No need to reload data anymore :-)

Best,
Yingyi

On Thu, Dec 22, 2016 at 11:36 AM, Yingyi Bu <[email protected]> wrote:

Indeed, the change was merged yesterday. If you grab the latest master, the computation parallelism can be set by the parameter compiler.parallelism:
-- 0, the default, means use the storage parallelism as the computation parallelism;
-- a negative value means use all available cores in the cluster;
-- a positive value means use that number of cores, but it will fall back to using all available cores if the number is too large.

Best,
Yingyi

On Thu, Dec 22, 2016 at 10:59 AM, Mike Carey <[email protected]> wrote:

It would definitely also be fun to see how the newest, more flexible parallelism stuff handles this! (Where you'd specify storage parallelism based on drives, and compute parallelism based on cores, both spread across all of the cluster's resources.)

On 12/22/16 10:57 AM, Yingyi Bu wrote:

Mingda,

There is one more thing that you can do without needing to reload your data:
-- Enlarge the buffer cache memory budget to 8GB so that the datasets can fit into the buffer cache:
   "storage.buffercache.size": 536870912 -> "storage.buffercache.size": 8589934592
You don't need to reload data; you only need to restart the AsterixDB instance.

Thanks!

Best,
Yingyi

On Wed, Dec 21, 2016 at 9:22 PM, Mike Carey <[email protected]> wrote:

Nice!!

On Dec 21, 2016 8:43 PM, "Yingyi Bu" <[email protected]> wrote:

Cool, thanks, Mingda! Look forward to the new numbers!

Best,
Yingyi

On Wed, Dec 21, 2016 at 7:13 PM, mingda li <[email protected]> wrote:

Dear Yingyi,

Thanks for your suggestion. I have reset AsterixDB and retested it in the new environment on 100G and 10G. All of the tests (good and bad join orders) now run about twice as fast. I will finish all the tests and update the results later.

Bests,
Mingda

On Tue, Dec 20, 2016 at 10:20 PM, Yingyi Bu <[email protected]> wrote:

Hi Mingda,

I think that in your setting, a better configuration for AsterixDB might be to use 64 partitions, i.e., 4 cores * 16 machines.

1. To achieve that, you have to have 4 iodevices on each NC, e.g.:
   iodevices=/home/clash/asterixStorage/asterixdb5/red16
   -->
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-1, iodevices=/home/clash/asterixStorage/asterixdb5/red16-2, iodevices=/home/clash/asterixStorage/asterixdb5/red16-3, iodevices=/home/clash/asterixStorage/asterixdb5/red16-4

2. Assuming you have 64 partitions, 31.01G / 64 ~= 0.5G. That means you'd better have a 512MB memory budget for each joiner so as to make the join memory-resident. To achieve that, in the cc section of cc.conf, you could add:
   compiler.joinmemory=536870912

3. For the JVM setting, 1024MB is too small for the NC. In the shared NC section of cc.conf, you can add:
   jvm.args=-Xmx16G

4. For Pig and Hive, you can set the maximum mapper/reducer numbers in the MapReduce configuration, e.g., at most 4 mappers per machine and at most 4 reducers per machine.

5. I'm not super-familiar with hyper-threading, but it might be worth trying 8 partitions per machine, i.e., 128 partitions in total.

To validate whether the new settings work, you can go to the admin/cluster page to double-check.

Please keep us updated and let us know if you run into any issue.

Thanks!

Best,
Yingyi
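(To put the suggestions in this thread side by side, here is a rough sketch of the corresponding cc.conf entries. This is only a sketch: the exact section headers ([cc], [nc], and a per-node [nc/red16]), the comma-separated iodevices form, and the placement of storage.buffercache.size are assumptions rather than something confirmed above, so check them against the cc.conf you already have; compiler.parallelism also only exists once the patch mentioned earlier is in the build you are running.

   [cc]
   compiler.joinmemory=536870912
   compiler.parallelism=-1

   [nc]
   jvm.args=-Xmx16G
   storage.buffercache.size=8589934592

   [nc/red16]
   iodevices=/home/clash/asterixStorage/asterixdb5/red16-1,/home/clash/asterixStorage/asterixdb5/red16-2,/home/clash/asterixStorage/asterixdb5/red16-3,/home/clash/asterixStorage/asterixdb5/red16-4

Here 536870912 is the 512MB joiner budget from point 2, -1 asks for all available cores per the compiler.parallelism rules above, 8589934592 is the 8GB buffer cache from the 12/22 suggestion, and each of the 16 machines would get its own [nc/<node>] block listing its four iodevices.)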
On Tue, Dec 20, 2016 at 9:26 PM, mingda li <[email protected]> wrote:

Dear Yingyi,

1. Here is what http://<master node>:19002/admin/cluster returns:

{
  "cc": {
    "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/config",
    "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/stats",
    "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/cc/threaddump"
  },
  "config": {
    "api.port": 19002,
    "cc.java.opts": "-Xmx1024m",
    "cluster.partitions": {
      "0": "ID:0, Original Node: red16, IODevice: 0, Active Node: red16",
      "1": "ID:1, Original Node: red15, IODevice: 0, Active Node: red15",
      "2": "ID:2, Original Node: red14, IODevice: 0, Active Node: red14",
      "3": "ID:3, Original Node: red13, IODevice: 0, Active Node: red13",
      "4": "ID:4, Original Node: red12, IODevice: 0, Active Node: red12",
      "5": "ID:5, Original Node: red11, IODevice: 0, Active Node: red11",
      "6": "ID:6, Original Node: red10, IODevice: 0, Active Node: red10",
      "7": "ID:7, Original Node: red9, IODevice: 0, Active Node: red9",
      "8": "ID:8, Original Node: red8, IODevice: 0, Active Node: red8",
      "9": "ID:9, Original Node: red7, IODevice: 0, Active Node: red7",
      "10": "ID:10, Original Node: red6, IODevice: 0, Active Node: red6",
      "11": "ID:11, Original Node: red5, IODevice: 0, Active Node: red5",
      "12": "ID:12, Original Node: red4, IODevice: 0, Active Node: red4",
      "13": "ID:13, Original Node: red3, IODevice: 0, Active Node: red3",
      "14": "ID:14, Original Node: red2, IODevice: 0, Active Node: red2",
      "15": "ID:15, Original Node: red, IODevice: 0, Active Node: red"
    },
    "compiler.framesize": 32768,
    "compiler.groupmemory": 33554432,
    "compiler.joinmemory": 33554432,
    "compiler.pregelix.home": "~/pregelix",
    "compiler.sortmemory": 33554432,
    "core.dump.paths": {
      "red": "/home/clash/asterixStorage/asterixdb5/red/coredump",
      "red10": "/home/clash/asterixStorage/asterixdb5/red10/coredump",
      "red11": "/home/clash/asterixStorage/asterixdb5/red11/coredump",
      "red12": "/home/clash/asterixStorage/asterixdb5/red12/coredump",
      "red13": "/home/clash/asterixStorage/asterixdb5/red13/coredump",
      "red14": "/home/clash/asterixStorage/asterixdb5/red14/coredump",
      "red15": "/home/clash/asterixStorage/asterixdb5/red15/coredump",
      "red16": "/home/clash/asterixStorage/asterixdb5/red16/coredump",
      "red2": "/home/clash/asterixStorage/asterixdb5/red2/coredump",
      "red3": "/home/clash/asterixStorage/asterixdb5/red3/coredump",
      "red4": "/home/clash/asterixStorage/asterixdb5/red4/coredump",
      "red5": "/home/clash/asterixStorage/asterixdb5/red5/coredump",
      "red6": "/home/clash/asterixStorage/asterixdb5/red6/coredump",
      "red7": "/home/clash/asterixStorage/asterixdb5/red7/coredump",
      "red8": "/home/clash/asterixStorage/asterixdb5/red8/coredump",
      "red9": "/home/clash/asterixStorage/asterixdb5/red9/coredump"
    },
    "feed.central.manager.port": 4500,
    "feed.max.threshold.period": 5,
    "feed.memory.available.wait.timeout": 10,
    "feed.memory.global.budget": 67108864,
    "feed.pending.work.threshold": 50,
    "feed.port": 19003,
    "instance.name": "DEFAULT_INSTANCE",
    "log.level": "WARNING",
    "max.wait.active.cluster": 60,
    "metadata.callback.port": 0,
    "metadata.node": "red16",
    "metadata.partition": "ID:0, Original Node: red16, IODevice: 0, Active Node: red16",
    "metadata.port": 0,
    "metadata.registration.timeout.secs": 60,
    "nc.java.opts": "-Xmx1024m",
    "node.partitions": {
      "red": ["ID:15, Original Node: red, IODevice: 0, Active Node: red"],
      "red10": ["ID:6, Original Node: red10, IODevice: 0, Active Node: red10"],
      "red11": ["ID:5, Original Node: red11, IODevice: 0, Active Node: red11"],
      "red12": ["ID:4, Original Node: red12, IODevice: 0, Active Node: red12"],
      "red13": ["ID:3, Original Node: red13, IODevice: 0, Active Node: red13"],
      "red14": ["ID:2, Original Node: red14, IODevice: 0, Active Node: red14"],
      "red15": ["ID:1, Original Node: red15, IODevice: 0, Active Node: red15"],
      "red16": ["ID:0, Original Node: red16, IODevice: 0, Active Node: red16"],
      "red2": ["ID:14, Original Node: red2, IODevice: 0, Active Node: red2"],
      "red3": ["ID:13, Original Node: red3, IODevice: 0, Active Node: red3"],
      "red4": ["ID:12, Original Node: red4, IODevice: 0, Active Node: red4"],
      "red5": ["ID:11, Original Node: red5, IODevice: 0, Active Node: red5"],
      "red6": ["ID:10, Original Node: red6, IODevice: 0, Active Node: red6"],
      "red7": ["ID:9, Original Node: red7, IODevice: 0, Active Node: red7"],
      "red8": ["ID:8, Original Node: red8, IODevice: 0, Active Node: red8"],
      "red9": ["ID:7, Original Node: red9, IODevice: 0, Active Node: red9"]
    },
    "node.stores": {
      "red": ["/home/clash/asterixStorage/asterixdb5/red/storage"],
      "red10": ["/home/clash/asterixStorage/asterixdb5/red10/storage"],
      "red11": ["/home/clash/asterixStorage/asterixdb5/red11/storage"],
      "red12": ["/home/clash/asterixStorage/asterixdb5/red12/storage"],
      "red13": ["/home/clash/asterixStorage/asterixdb5/red13/storage"],
      "red14": ["/home/clash/asterixStorage/asterixdb5/red14/storage"],
      "red15": ["/home/clash/asterixStorage/asterixdb5/red15/storage"],
      "red16": ["/home/clash/asterixStorage/asterixdb5/red16/storage"],
      "red2": ["/home/clash/asterixStorage/asterixdb5/red2/storage"],
      "red3": ["/home/clash/asterixStorage/asterixdb5/red3/storage"],
      "red4": ["/home/clash/asterixStorage/asterixdb5/red4/storage"],
      "red5": ["/home/clash/asterixStorage/asterixdb5/red5/storage"],
      "red6": ["/home/clash/asterixStorage/asterixdb5/red6/storage"],
      "red7": ["/home/clash/asterixStorage/asterixdb5/red7/storage"],
      "red8": ["/home/clash/asterixStorage/asterixdb5/red8/storage"],
      "red9": ["/home/clash/asterixStorage/asterixdb5/red9/storage"]
    },
    "plot.activate": false,
    "replication.enabled": false,
    "replication.factor": 2,
    "replication.log.batchsize": 4096,
    "replication.log.buffer.numpages": 8,
    "replication.log.buffer.pagesize": 131072,
    "replication.max.remote.recovery.attempts": 5,
    "replication.timeout": 30,
    "storage.buffercache.maxopenfiles": 2147483647,
    "storage.buffercache.pagesize": 131072,
    "storage.buffercache.size": 536870912,
    "storage.lsm.bloomfilter.falsepositiverate": 0.01,
    "storage.memorycomponent.globalbudget": 536870912,
    "storage.memorycomponent.numcomponents": 2,
    "storage.memorycomponent.numpages": 256,
    "storage.memorycomponent.pagesize": 131072,
    "storage.metadata.memorycomponent.numpages": 256,
    "transaction.log.dirs": {
      "red": "/home/clash/asterixStorage/asterixdb5/red/txnlog",
      "red10": "/home/clash/asterixStorage/asterixdb5/red10/txnlog",
      "red11": "/home/clash/asterixStorage/asterixdb5/red11/txnlog",
      "red12": "/home/clash/asterixStorage/asterixdb5/red12/txnlog",
      "red13": "/home/clash/asterixStorage/asterixdb5/red13/txnlog",
      "red14": "/home/clash/asterixStorage/asterixdb5/red14/txnlog",
      "red15": "/home/clash/asterixStorage/asterixdb5/red15/txnlog",
      "red16": "/home/clash/asterixStorage/asterixdb5/red16/txnlog",
      "red2": "/home/clash/asterixStorage/asterixdb5/red2/txnlog",
      "red3": "/home/clash/asterixStorage/asterixdb5/red3/txnlog",
      "red4": "/home/clash/asterixStorage/asterixdb5/red4/txnlog",
      "red5": "/home/clash/asterixStorage/asterixdb5/red5/txnlog",
      "red6": "/home/clash/asterixStorage/asterixdb5/red6/txnlog",
      "red7": "/home/clash/asterixStorage/asterixdb5/red7/txnlog",
      "red8": "/home/clash/asterixStorage/asterixdb5/red8/txnlog",
      "red9": "/home/clash/asterixStorage/asterixdb5/red9/txnlog"
    },
    "txn.commitprofiler.reportinterval": 5,
    "txn.job.recovery.memorysize": 67108864,
    "txn.lock.escalationthreshold": 1000,
    "txn.lock.shrinktimer": 5000,
    "txn.lock.timeout.sweepthreshold": 10000,
    "txn.lock.timeout.waitthreshold": 60000,
    "txn.log.buffer.numpages": 8,
    "txn.log.buffer.pagesize": 131072,
    "txn.log.checkpoint.history": 0,
    "txn.log.checkpoint.lsnthreshold": 67108864,
    "txn.log.checkpoint.pollfrequency": 120,
    "txn.log.partitionsize": 268435456,
    "web.port": 19001,
    "web.queryinterface.port": 19006,
    "web.secondary.port": 19005
  },
  "fullShutdownUri": "http://scai01.cs.ucla.edu:19002/admin/shutdown?all=true",
  "metadata_node": "red16",
  "ncs": [
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/config", "node_id": "red15", "partitions": [{ "active": true, "partition_id": "partition_1" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red15/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/config", "node_id": "red14", "partitions": [{ "active": true, "partition_id": "partition_2" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red14/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/config", "node_id": "red16", "partitions": [{ "active": true, "partition_id": "partition_0" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red16/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/config", "node_id": "red11", "partitions": [{ "active": true, "partition_id": "partition_5" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red11/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/config", "node_id": "red10", "partitions": [{ "active": true, "partition_id": "partition_6" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red10/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/config", "node_id": "red13", "partitions": [{ "active": true, "partition_id": "partition_3" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red13/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/config", "node_id": "red12", "partitions": [{ "active": true, "partition_id": "partition_4" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red12/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/config", "node_id": "red6", "partitions": [{ "active": true, "partition_id": "partition_10" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red6/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/config", "node_id": "red", "partitions": [{ "active": true, "partition_id": "partition_15" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/config", "node_id": "red5", "partitions": [{ "active": true, "partition_id": "partition_11" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red5/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/config", "node_id": "red8", "partitions": [{ "active": true, "partition_id": "partition_8" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red8/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/config", "node_id": "red7", "partitions": [{ "active": true, "partition_id": "partition_9" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red7/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/config", "node_id": "red2", "partitions": [{ "active": true, "partition_id": "partition_14" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red2/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/config", "node_id": "red4", "partitions": [{ "active": true, "partition_id": "partition_12" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red4/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/config", "node_id": "red3", "partitions": [{ "active": true, "partition_id": "partition_13" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red3/threaddump" },
    { "configUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/config", "node_id": "red9", "partitions": [{ "active": true, "partition_id": "partition_7" }], "state": "ACTIVE", "statsUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/stats", "threadDumpUri": "http://scai01.cs.ucla.edu:19002/admin/cluster/node/red9/threaddump" }
  ],
  "shutdownUri": "http://scai01.cs.ucla.edu:19002/admin/shutdown",
  "state": "ACTIVE",
  "versionUri": "http://scai01.cs.ucla.edu:19002/admin/version"
}

2. catalog_returns: 2.28G, catalog_sales: 31.01G, inventory: 8.63G.

3. As for Pig and Hive, I always use the default configuration; I didn't set any partitioning for them. For Spark, we use 200 partitions, which could probably be improved but is not bad. For AsterixDB, I also set up the cluster with the default values for partitions and the JVM (I didn't manually set these parameters).
On Tue, Dec 20, 2016 at 5:58 PM, Yingyi Bu <[email protected]> wrote:

Mingda,

1. Can you paste the returned JSON of http://<master node>:19002/admin/cluster at your side? (Please replace <master node> with the actual master node name or IP.)

2. Can you list the individual size of each dataset involved in the query, e.g., catalog_returns, catalog_sales, and inventory? (I assume 100GB is the overall size?)

3. Do Spark/Hive/Pig saturate all CPUs on all machines, i.e., how many partitions are running on each machine? (It seems that your AsterixDB configuration wouldn't saturate all CPUs for queries --- in the current AsterixDB master, the computation parallelism is set to be the same as the storage parallelism, i.e., the number of iodevices on each NC. I've submitted a new patch that allows flexible computation parallelism, which should be merged into master very soon.)

Thanks!

Best,
Yingyi

On Tue, Dec 20, 2016 at 5:44 PM, mingda li <[email protected]> wrote:

Oh, sure. When we test the 100G multiple-join query, we find AsterixDB is slower than Spark (but still faster than Pig and Hive). I can share both plots with you: 1-10G.eps and 1-100G.eps. (We will only use 1-10G.eps in our paper.)

And thanks for Ian's advice: "The dev list generally strips attachments. Maybe you can just put the config inline? Or link to a pastebin/gist?" Now I know why you can't see the attachments, so I moved the plots and the two documents to my Dropbox. You can find:

1-10G.eps here: https://www.dropbox.com/s/rk3xg6gigsfcuyq/1-10G.eps?dl=0
1-100G.eps here: https://www.dropbox.com/s/tyxnmt6ehau2ski/1-100G.eps?dl=0
cc_conf.pdf here: https://www.dropbox.com/s/y3of1s17qdstv5f/cc_conf.pdf?dl=0
CompleteQuery.pdf here: https://www.dropbox.com/s/lml3fzxfjcmf2c1/CompleteQuery.pdf?dl=0

On Tue, Dec 20, 2016 at 4:40 PM, Tyson Condie <[email protected]> wrote:

Mingda: Please also share the numbers for 100GB, which show AsterixDB not quite doing as well as Spark. These 100GB results will not be in our submission version, since they're not needed for the desired message: picking the right join order matters. Nevertheless, I'd like to get a
