Hello,
TP4 will not make a distinction between STANDARD (OLTP) and COMPUTER (OLAP)
execution models. In TP4, if a processing engine can convert a bytecode
Compilation into a working execution plan then that is all that matters.
TinkerPop does not need to concern itself with whether that execution plan is
“OLTP” or “OLAP” or with the semantics of its execution (function oriented,
iterator oriented, RDD-based, etc.). With that, here are 4 categories of
processors that I believe define the full spectrum of what we will be dealing
with:
1. Real-time single-threaded single-machine.
   * This is STANDARD (OLTP) in TP3.
   * This is the Pipes processor in TP4.
2. Real-time multi-threaded single-machine.
   * This does not exist in TP3.
   * We should provide an RxJava processor in TP4.
3. Near-time distributed multi-machine.
   * This does not exist in TP3.
   * We should provide an Akka processor in TP4.
4. Batch-time distributed multi-machine.
   * This is COMPUTER (OLAP) in TP3 (Spark or Giraph).
   * We should provide a Spark processor in TP4.
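To make the "only the execution plan matters" point concrete, here is a rough Java sketch of the idea. Every name in it (ProcessorCategory, ToyCompilation, ToyProcessor and the two implementations) is hypothetical and made up for illustration; none of it is the TP4 API. The compilation is reduced to a single function, and the two processors stand in for categories 1 and 2 only. The caller sees the same interface and the same results either way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// All names below are hypothetical; they are not taken from the TP4 codebase.

/** The four processor categories listed above. */
enum ProcessorCategory {
    REAL_TIME_SINGLE_THREADED,  // e.g. Pipes
    REAL_TIME_MULTI_THREADED,   // e.g. RxJava
    NEAR_TIME_DISTRIBUTED,      // e.g. Akka
    BATCH_TIME_DISTRIBUTED      // e.g. Spark
}

/** Toy stand-in for a bytecode compilation: just one function from start to end. */
final class ToyCompilation<S, E> {
    final Function<S, E> function;

    ToyCompilation(Function<S, E> function) {
        this.function = function;
    }
}

/** A processor turns a compilation plus starts into results; how it runs is its own business. */
interface ToyProcessor {
    ProcessorCategory category();

    <S, E> List<E> execute(ToyCompilation<S, E> compilation, List<S> starts);
}

/** Category 1: evaluates every start on the calling thread. */
final class SingleThreadedProcessor implements ToyProcessor {
    @Override
    public ProcessorCategory category() {
        return ProcessorCategory.REAL_TIME_SINGLE_THREADED;
    }

    @Override
    public <S, E> List<E> execute(ToyCompilation<S, E> compilation, List<S> starts) {
        return starts.stream().map(compilation.function).collect(Collectors.toList());
    }
}

/** Category 2: same plan, evaluated across threads via a parallel stream. */
final class MultiThreadedProcessor implements ToyProcessor {
    @Override
    public ProcessorCategory category() {
        return ProcessorCategory.REAL_TIME_MULTI_THREADED;
    }

    @Override
    public <S, E> List<E> execute(ToyCompilation<S, E> compilation, List<S> starts) {
        // Collecting an ordered parallel stream to a list keeps encounter order.
        return starts.parallelStream().map(compilation.function).collect(Collectors.toList());
    }
}
```

Handing the same ToyCompilation to either processor yields the same list of results; only category() and the threading behind execute() differ. A distributed processor (categories 3 and 4) would in principle sit behind the same interface, though this sketch does not attempt one.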
I’m not familiar with the specifics of the stream-based processors such as
Flink, Apex, DataFlow, and Samza. I believe they can be made to work in
near-time or batch-time depending on the amount of data pulled from the
database, and once we understand these technologies better we should be able
to fit them into the categories above.
In conclusion: Do these categories make sense to people? Terminology-wise,
are "near-time" and "batch-time" the right names? Are these distinctions valid?
Thank you,
Marko.
http://rredux.com