johnyangk commented on issue #122: [NEMO-213] Use Beam's DoFnRunners to execute DoFn
URL: https://github.com/apache/incubator-nemo/pull/122#issuecomment-428147686

TPC-H job completion times I got (10 h1.4xlarge instances, scale factor 100):

| Query | My branch (master + some fixes) | My branch + this branch (merged) |
| -- | -- | -- |
| Q3 | 4 min 51 s | 5 min 13 s |
| Q4 | 3 min 22 s | 3 min 45 s |
| Q5 | 7 min 24 s | 6 min 34 s |
| Q6 | 2 min 13 s | 2 min 21 s |
| Q10 | 3 min 33 s | 3 min 50 s |
| Q12 | 2 min 55 s | 3 min 14 s |
| Q13 | 3 min 21 s | 3 min 42 s |
| Q14 | 2 min 9 s | 2 min 21 s |

This PR seems to add around 10% overhead on most queries (Q5 is the exception and actually ran faster) by using the `WindowedValue` wrappers rather than raw data objects. I think that is not so bad, especially considering that we're still mostly faster than the Spark runner. (@seojangho, what do you think about this?)

I'll review the code now.
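
For reference, here is a rough sketch of how the per-query overhead can be derived from the completion times quoted above. The dictionary layout and the helper names (`to_seconds`, `overhead_percent`) are my own, chosen only for illustration; they are not part of Nemo or Beam, and the numbers are single-run timings, so treat the result as indicative only.

```python
# Completion times copied from the comment above, as (minutes, seconds).
# Each entry: query -> (baseline branch, baseline + this PR merged).
TIMES = {
    "Q3": ((4, 51), (5, 13)),
    "Q4": ((3, 22), (3, 45)),
    "Q5": ((7, 24), (6, 34)),
    "Q6": ((2, 13), (2, 21)),
    "Q10": ((3, 33), (3, 50)),
    "Q12": ((2, 55), (3, 14)),
    "Q13": ((3, 21), (3, 42)),
    "Q14": ((2, 9), (2, 21)),
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair into total seconds."""
    return minutes * 60 + seconds


def overhead_percent(baseline, merged):
    """Relative change of the merged run versus the baseline, in percent."""
    base = to_seconds(*baseline)
    new = to_seconds(*merged)
    return (new - base) / base * 100.0


# Positive values mean the merged branch was slower than the baseline.
overheads = {q: overhead_percent(b, m) for q, (b, m) in TIMES.items()}

for query, pct in overheads.items():
    print(f"{query}: {pct:+.1f}%")
```

With these inputs, every query except Q5 comes out roughly 6% to 11% slower on the merged branch, while Q5 comes out about 11% faster, which is consistent with the "around 10%" estimate.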
