Marko, this is great!  I always like it when you send out posts like this!

Best
DQ

> On Apr 5, 2016, at 4:43 PM, Marko Rodriguez <okramma...@gmail.com> wrote:
> 
> Hello,
> 
> With the imminent release of TinkerPop 3.2.0, during our week-long code 
> freeze, I took 3.2.0 for a spin on a 4-node Blade cluster using the 
> Friendster graph, which is composed of 125 million vertices and 2.5 billion 
> edges. TinkerPop 3.2.0 will release using Spark 1.6.1. Note that there were 
> some issues in the initial testing around SPARK_WORKER_INSTANCES and 
> SPARK_WORKER_CORES. The 1.5.2 settings I used previously were "too much" for 
> 1.6.1. I toned it down a bit and things work smoothly, and interestingly 
> enough, with seemingly less "firepower," we are getting better results 
> (speed-wise). Enjoy the results.
> 
> g.V().count() -- answer 125000000 (125 million vertices)
>       - TinkerPop 3.0.0.MX: 2.5 hours
>       - TinkerPop 3.0.0:      1.5 hours
>       - TinkerPop 3.1.1:      23 minutes
>       - TinkerPop 3.2.0:      6.8 minutes (Spark 1.5.2)
>       - TinkerPop 3.2.0:      5.5 minutes (Spark 1.6.1)
> 
> g.V().out().count() -- answer 2586147869 (2.5 billion length-1 paths (i.e. 
> edges))
>       - TinkerPop 3.0.0.MX: unknown
>       - TinkerPop 3.0.0:      2.5 hours
>       - TinkerPop 3.1.1:      1.1 hours
>       - TinkerPop 3.2.0:      13 minutes (Spark 1.5.2)
>       - TinkerPop 3.2.0:      12 minutes (Spark 1.6.1)
>       
> g.V().out().out().count() -- answer 640528666156 (640 billion length-2 paths)
>       - TinkerPop 3.0.0.MX: unknown
>       - TinkerPop 3.0.0:      unknown
>       - TinkerPop 3.1.1:      unknown
>       - TinkerPop 3.2.0:      55 minutes (Spark 1.5.2)
>       - TinkerPop 3.2.0:      50 minutes (Spark 1.6.1)
> 
> g.V().out().out().out().count() -- answer 215664338057221 (215 trillion 
> length-3 paths)
>       - TinkerPop 3.0.0.MX: 12.8 hours
>       - TinkerPop 3.0.0:      8.6 hours
>       - TinkerPop 3.1.1:      2.4 hours
>       - TinkerPop 3.2.0:      1.6 hours (Spark 1.5.2)
>       - TinkerPop 3.2.0:      1.5 hours (Spark 1.6.1)
> 
> g.V().out().out().out().out().count() -- answer 83841426570464575 (83 
> quadrillion length-4 paths)
>       - TinkerPop 3.0.0.MX: unknown
>       - TinkerPop 3.0.0:      unknown
>       - TinkerPop 3.1.1:      unknown
>       - TinkerPop 3.2.0:      unknown (Spark 1.5.2)
>       - TinkerPop 3.2.0:      2.1 hours (Spark 1.6.1)
> 
> g.V().out().out().out().out().out().count() -- answer -2280190503167902456 !! 
> I blew the long space -- 64-bit overflow.
>       - TinkerPop 3.0.0.MX: unknown
>       - TinkerPop 3.0.0:      unknown
>       - TinkerPop 3.1.1:      unknown
>       - TinkerPop 3.2.0:      unknown (Spark 1.5.2)
>       - TinkerPop 3.2.0:      2.8 hours (Spark 1.6.1)
> 
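A note on the negative count above: it is what a signed 64-bit accumulator produces once the running total passes 2^63 - 1. The sketch below assumes the count is accumulated in a Java long with ordinary two's-complement wraparound; wrap_int64 and candidate_totals are hypothetical helpers for illustration, not part of TinkerPop. Under that assumption the true total is the reported value plus some positive multiple of 2^64, and the reported value alone cannot say which multiple.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
MODULUS = 2**64


def wrap_int64(value: int) -> int:
    """Return the signed 64-bit value left after two's-complement wraparound."""
    return (value - INT64_MIN) % MODULUS + INT64_MIN


def candidate_totals(reported: int, how_many: int) -> list[int]:
    """List the smallest positive totals that would wrap to `reported`."""
    return [reported + k * MODULUS for k in range(1, how_many + 1)]


# Value taken from the benchmark output quoted in this thread.
reported = -2280190503167902456

# Smallest positive total that is consistent with the reported value.
smallest_candidate = reported + MODULUS

# Every candidate wraps back to exactly the value that was printed.
first_candidates = candidate_totals(reported, 3)
```

Counts that fit in 64 bits, such as 83841426570464575 from the previous traversal, pass through wrap_int64 unchanged, which is why only the last traversal shows the problem.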
> Next, group()-step has been redesigned to be much more efficient in OLAP mode 
> when the by()-value traversal maintains a ReducingBarrierStep (e.g. count, 
> sum, max, min, fold, mean, ...). Thus, prior to this moment, something like:
> 
> g.V().group().by(outE().count()).by(count())
> 
>   // this is equivalent to g.V().map(outE().count()).groupCount(), 
>   // but I wanted to test group()'s new reducer model.
> 
> ...would have failed miserably on such a large graph. However, with TinkerPop 
> 3.2.0, because the second by() (the value traversal) maintains a 
> ReducingBarrierStep (count()), we get on-the-fly reductions, which limit 
> memory usage and ensure that such grouping traversals now work at scale in 
> OLAP.
> 
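To illustrate why a reducing by() matters, here is a minimal Python sketch; it is not TinkerPop's implementation. It assumes a toy list of out-degrees, and group_then_count and group_with_running_count are hypothetical names. The first function keeps every grouped element until the end and only then counts; the second folds each element into a per-key counter as it arrives, so memory grows with the number of distinct keys rather than with the number of vertices.

```python
from collections import defaultdict


def group_then_count(out_degrees):
    """Collect all members per key first, then reduce each group to its size."""
    groups = defaultdict(list)
    for vertex_id, degree in enumerate(out_degrees):
        groups[degree].append(vertex_id)  # holds one entry per vertex
    return {degree: len(members) for degree, members in groups.items()}


def group_with_running_count(out_degrees):
    """Reduce on the fly: keep only one integer per key."""
    counts = defaultdict(int)
    for degree in out_degrees:
        counts[degree] += 1  # holds one entry per distinct degree
    return dict(counts)


# Toy input: the out-degree of seven hypothetical vertices.
toy_out_degrees = [0, 0, 1, 2, 0, 1, 3]

eager = group_then_count(toy_out_degrees)
reduced = group_with_running_count(toy_out_degrees)
```

Both functions return the same degree-to-count map; only the peak memory differs, which is the property that lets the grouping run at this scale.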
> g.V().group().by(outE().count()).by(count()) -- answer below. 
>       - TinkerPop 3.2.0: 12 minutes (Spark 1.6.1)
> 
> ==>[0:68889802, 1:14490104, 2:5924264, 3:3630690, 4:2520455, 5:1887641, 
> 6:1499489, 7:1235456, 8:1048559, 9:909576, 10:802183, 11:716357, 12:644813, 
> 13:590507, 14:542157, 15:501000, 16:465449, 17:434955, 18:407146, 19:383250, 
> 20:362687, 21:341529, 22:325269, 23:308506, 24:295382, 25:282257, 26:270540, 
> 27:259267, 28:248882, 29:241110, 30:240857, 31:221426, 32:213362, 33:206135, 
> 34:200053, 35:193185, 36:186947, 37:181301, 38:176271, 39:171148, 40:166312, 
> 41:161646, 42:156552, 43:153162, 44:148875, 45:145339, 46:141780, 47:138058, 
> 48:135479, 49:131795, 50:128793, 51:126391, 52:123254, 53:121081, 54:118758, 
> 55:115864, 56:113936, 57:110845, 58:108192, 59:106723, 60:104243, 61:102829, 
> 62:100759, 63:98617, 64:96827, 65:95385, 66:93629, 67:92324, 68:90519, 
> 69:88766, 70:87682, 71:85794, 72:84279, 73:83389, 74:81654, 75:80978, 
> 76:78906, 77:78126, 78:76857, 79:75987, 80:75312, 81:73354, 82:72901, 
> 83:71195, 84:70463, 85:69502, 86:68107, 87:66984, 88:65986, 89:65349, 
> 90:64568, 91:63761, 92:63283, 93:62092, 94:61089, 95:60195, 96:59655, 
> 97:58788, 98:57847, 99:56935, 100:57341, 101:55483, 102:54973, 103:54610, 
> 104:53367, 105:53699, 106:52948, 107:52060, 108:51386, 109:51032, 110:50442, 
> 111:49429, 112:48994, 113:48790, 114:48250, 115:47808, 116:47517, 117:47024, 
> 118:46299, 119:45855, 120:45529, 121:45262, 122:44453, 123:43738, 124:43768, 
> 125:43257, 126:42852, 127:41977, 128:41580, 129:41091, 130:41027, 131:40569, 
> 132:40019, 133:39416, 134:39448, 135:38935, 136:38228, 137:37863, 138:37641, 
> 139:37261, 140:36908, 141:36326, 142:36090, 143:35654, 144:35610, 145:34760, 
> 146:34946, 147:34355, 148:33948, 149:33946, 150:33341, 151:33193, 152:32877, 
> 153:32440, 154:32268, 155:31728, 156:31627, 157:30762, 158:30625, 159:30233, 
> 160:30345, 161:29881, 162:29851, 163:29523, 164:29081, 165:28844, 166:28402, 
> 167:28053, 168:27706, 169:27623, 170:27502, 171:27156, 172:27112, 173:26538, 
> 174:26578, 175:26187, 176:25951, 177:25572, 178:25297, 179:25441, 180:24653, 
> 181:24935, 182:24478, 183:24262, 184:23926, 185:24006, 186:23499, 187:23317, 
> 188:22860, 189:22704, 190:22441, 191:22565, 192:22164, 193:22105, 194:21728, 
> 195:21870, 196:21431, 197:21395,
> ...
> 
> Take care,
> Marko.
> 
> http://markorodriguez.com
