On Thu, Nov 10, 2016 at 11:43 AM Tudor Girba <[email protected]> wrote:
> Hi Igor,
>
> I am happy to see you getting active again. The next step is to commit
> code at the rate you reply emails. I’d be even happier :).
>

Ouch, that was not very nice...

I agree with Igor and Phil: there is no genuine interest in the community for optimising Pharo for big data. That makes sense, because coders who care a lot about performance stick with C/C++. Not that I blame them; you can't have your cake and eat it too. I see no reason to put 128+ GB of RAM in a computer: CPUs are not powerful enough to deal with such a massive amount of data even if you do your coding at C level. I know because I work daily with 3D graphics.

Above all, CPUs have lost the war. GPUs have dominated for almost a decade now, especially for massive parallelism; a cheap GPU nowadays easily outperforms a CPU by 10 times, and some expensive ones are even 100 times faster than the fastest CPU. But that is for running the same calculation over a very large data set. If you go down that path you need OpenCL or CUDA support in Pharo, assuming you want to do it all in Pharo. Modern GPUs are so generic in functionality that they are used in many areas that have nothing to do with graphics, and they are especially popular for physical simulations, where data can easily reach TBs or even PBs.

A solution I am implementing with CPPBridge would also make sense here: a shared memory area that lives outside the VM memory, so it cannot be garbage collected, but still inside the Pharo process, so Pharo has direct access to it with no compromise on performance. Being shared also means that multiple instances of Pharo can access it directly, giving you true parallelism. If you want the comforts of Pharo, including GC, you move a portion of the data into the VM by copying it from shared memory into Pharo objects, and of course erase or overwrite it on the shared memory side so you don't waste RAM. You can also delegate which Pharo instance deals with which portion of the shared memory, so you can optimise the use of multiple cores; data processing that would benefit from GPU parallelism should be moved to the GPU with the appropriate Pharo library.

The memory-mapped file backing the shared memory would strip any metadata and store the data in its most compact format, while data that needs to be more flexible and higher level can be stored inside a Pharo image. If 10 Pharos execute at the same time, one of them can play the role of manager, streaming data from the hard drive to shared memory in the background without affecting the performance of the other Pharos. This gives you the ability to deal with TBs of data and to take advantage of old computers with little memory. Out of all that, I will be materialising the shared memory part, the protocol, and the memory-mapped file that saves the shared memory, because I don't need the rest (a rough sketch of what the C++ side could look like is at the end of this mail).

Of course, here comes the debate: why do this in Pharo at all, instead of using a C/C++ library, or C support for CUDA/OpenCL, and letting Pharo just sit in the driving seat, performing the role of manager? This is how Python is used by modern scientists: C++ libraries driven by Python scripting. Pharo can do the same. I don't believe optimising the GC is the ideal solution; it is not even necessary.
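To make the shared-memory idea a bit more concrete, here is a minimal C++ sketch of the kind of thing the C++ side could do: create a POSIX shared-memory segment with shm_open and map it with mmap, so that several Pharo processes could attach to the same region through an FFI/CPPBridge call. The segment name "/pharo-shared", the SharedHeader layout and the sizes are just placeholders for illustration, not the actual CPPBridge protocol.

// Rough sketch: a POSIX shared-memory region that several Pharo
// processes could map and then touch through CPPBridge/FFI.
// Names and layout are illustrative only.
#include <fcntl.h>      // shm_open, O_* flags
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // mode constants
#include <unistd.h>     // ftruncate, close
#include <cstdio>

struct SharedHeader {
    size_t capacity;    // bytes available for raw data
    size_t used;        // bytes currently filled by the "manager" Pharo
};

int main() {
    const char*  name = "/pharo-shared";   // hypothetical segment name
    const size_t size = 1ull << 30;        // 1 GiB of raw, GC-free storage

    // Create (or open) the segment; every Pharo instance would do the same.
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, (off_t)size) != 0) { perror("ftruncate"); return 1; }

    // Map it into this process; the base pointer is what gets handed to
    // Pharo, which copies slices of it into Pharo objects when it wants GC.
    void* base = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    // The manager instance fills the region; readers copy data out and
    // mark it consumed so the space can be overwritten.
    auto* header = static_cast<SharedHeader*>(base);
    header->capacity = size - sizeof(SharedHeader);
    header->used = 0;

    munmap(base, size);
    close(fd);
    return 0;
}

On Linux this links with -lrt. Mapping a regular file on disk with open + mmap works the same way and would give the persistent, compact, metadata-free on-disk format I described above, with the manager Pharo streaming chunks of it into the shared region in the background.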
