Took me a while to (re)read and think about this. It seems to be getting more and more important as we see a growing interest from other ASF projects to get better integration with Ignite.
I think all these are very valid points. I'd say the integration with non-JVM apps isn't that high-priority, but I might be mistaken in my judgement. I wanted to specifically comment on #5 *Configuration*, as UX is very important indeed. And as always, I am thinking that perhaps having a clean DSL might help with overcoming that hurdle: a DSL can be generated by anything, it is human-readable, and it doesn't require much syntactic overhead.

Cos

On Thu, Apr 16, 2015 at 02:07PM, Vladimir Ozerov wrote:
> Hi,
>
> I'd like to propose an idea of creating a new Ignite component for
> integration with other platforms such as .Net, Ruby, NodeJS, etc.
>
> Earlier in GridGain we had thin TCP clients for Java and .Net. They had
> limited features and not-so-good performance (e.g. due to the inability to
> reliably map a task to an affinity node, etc.). For now the Java client is
> open source and is used only for internal purposes, and the .Net client was
> fully reworked to use a JVM started in the same process instead of TCP and is
> currently a GridGain enterprise feature.
>
> But as we see growing interest in the product, it makes sense to expose some
> native interfaces for easy integration with our product from any platform.
>
> Let's discuss what the platform integration architecture should be.
>
> *1. JVM placement.*
> One of the most important points is how the native platform will communicate
> with the JVM running the started node. There are a number of approaches to consider:
> - Start the JVM in the same process. This will allow for fast communication
> between the JVM and the native platform. The drawback of this approach is that
> we can start only one JVM per process. As a result this solution might not
> work in some environments (especially development ones), e.g.
> app servers where multiple native applications run in the same process and each
> application wants to start a node with different JVM properties, or
> multi-process environments where there is a coordinator process which spawns
> child processes with a limited lifecycle on demand (Apache, IIS, NodeJS, etc).
> - Connect to the JVM using some IPC mechanism (shared memory, pipes). This
> approach might be a bit slower than the first one due to IPC overhead, but
> still pretty fast. To implement it we will probably have to create some
> intermediate management application which will start nodes in different
> processes and provide handles for the native application to connect to them.
> This approach will be more flexible than the first one.
> - Connect to the JVM using TCP. This will be the slowest one, but offers even
> greater flexibility, as we will be able to transparently connect to nodes
> even on other hosts. However, this raises some failover questions.
>
> In summary, I think we should choose the "JVM in the same process" approach as
> we already have experience with it and it has proven to be functional and
> performant, but create a careful abstraction (facade) for the node communication
> logic, so that shmem/pipes/tcp approaches can be implemented easily if
> needed without disturbing other components.
>
> *2. Data transfer and serialization.*
> Another important point is how to pass data between Java and non-Java
> platforms. Obviously we will have to provide some common format for both
> interacting platforms, so that data serialized on one side can be
> deserialized on the other if needed.
> For the JVM-in-the-same-proc approach it makes sense to organize data transfer
> over offheap memory.
> Earlier we experimented with more sophisticated
> mechanisms like "pin Java heap array in native platform -> write directly
> to that array -> unpin", but this approach has some serious problems (like
> JVM intrinsic methods hanging while the array is pinned), while not providing
> a significant performance benefit.
> So I think data transfer over offheap will be enough, as this is a simple and
> reliable solution with acceptable performance.
> Also we must remember that platforms may potentially have different
> mechanisms for data transfer. E.g., sometimes we have to marshal an object to
> bytes before passing it to Java, sometimes we may just pass a pointer (e.g.
> structs in C or .Net with known layout), etc. We should be able to
> potentially support all these cases.
>
> In summary I propose to use offheap as the default implementation, while
> still leaving room for changing this if needed. E.g. instead of passing
> offheap pointer + data length:
>
> void invokeOtherPlatform(long dataPointer, int dataLen);
>
> we should design it as:
>
> void invokeOtherPlatform(long pointer);
>
> where pointer will encode all information required for the other platform to
> read the data. E.g. it can be a pointer to a memory region where the first 4
> bytes are the data length and the rest is the serialized object.
>
> *3. Queries support*
> Queries are one of the most demanded features of the product. But at the
> moment they can only work with Java objects because they use Java
> serialization to get fields from them.
> We will have to provide the user a way to alter this somehow so that objects from
> native platforms are supported as well.
> A good candidate for this is the IgniteCacheObjectProcessor interface, which is
> responsible for object serialization.
> We will have to investigate what should be done to let its implementation
> (either default or some custom) work with objects from other platforms.
>
> *4. Extensibility*
> We will have a set of C/C++ interfaces exposing basic features (e.g.
> cache, compute, queries, etc.).
> But as we do not know in advance what implementors will want to do apart
> from regular Java methods, it makes sense to leave some extensibility
> points. At first glance they may look as follows:
>
> interface Cache {
>     void get(void* inData, void* outData); // Regular cache operation.
>     bool put(void* outData); // Another regular cache operation.
>     ...
>     void invoke(int operationType, void* inData, void* outData); // Extensibility point.
> }
>
> In this example we define an "invoke" method where the user may pass virtually
> anything. So, when some new functionality is required, the user will implement it
> in Java and inject it into Ignite somehow (e.g. through config) and
> implement it in the native platform. But the user WILL NOT have to change any Ignite
> C interfaces and rebuild them.
>
> *5. Configuration.*
> Last, but not least: how to configure Ignite on other platforms. Currently
> the only way to do that is Spring XML. This approach works well for Java
> developers, but is not so good for others, because a developer who is not
> familiar with Java/Spring will have to learn quite a few things about them.
> E.g. try configuring a HashMap in Spring with an int key/value :-) Non-Java
> developers will have a hard time doing this.
> So we will probably have to let users use the native mechanisms of their
> platforms for configuration. This is not really critical from a features
> perspective, but will significantly improve user experience.
>
> Please share your thoughts and ideas about that.
>
> Vladimir.
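
For reference, the single-pointer layout described in section 2 (first 4 bytes are the data length, the rest is the serialized object) could look roughly like the sketch below. This is only an illustration under stated assumptions: a direct `ByteBuffer` stands in for the raw offheap address that would actually be handed to the native side, the class and method names are hypothetical and not part of Ignite, and the little-endian byte order is an arbitrary choice that both platforms would need to agree on.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an Ignite API: packs a payload into one
// off-heap region whose first 4 bytes carry the payload length.
public class LengthPrefixedRegion {

    /** Writes [4-byte length][payload] into a freshly allocated direct buffer. */
    public static ByteBuffer write(byte[] payload) {
        ByteBuffer region = ByteBuffer
            .allocateDirect(Integer.BYTES + payload.length)
            .order(ByteOrder.LITTLE_ENDIAN); // assumed byte order, must match the reader
        region.putInt(payload.length);
        region.put(payload);
        region.flip(); // position = 0, limit = header + payload
        return region;
    }

    /** Reads the length header, then exactly that many payload bytes. */
    public static byte[] read(ByteBuffer region) {
        // duplicate() leaves the caller's position untouched; the byte order
        // is not inherited by the duplicate, so it is set again explicitly.
        ByteBuffer view = region.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int len = view.getInt();
        byte[] payload = new byte[len];
        view.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] original = "hello ignite".getBytes(StandardCharsets.UTF_8);
        ByteBuffer region = write(original);
        byte[] copy = read(region);
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```

The point of the layout is that the receiver needs only one value (the start of the region) to find both the length and the data, which is what lets the signature shrink to `invokeOtherPlatform(long pointer)` and leaves room for other encodings later.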