Hi there,

I understand that Dalvik already runs successfully on a variety of SMP chips. But it seems that most current efforts are focused on making it safe in an SMP environment, rather than making it a truly parallel VM. Parallelism among threads running in the same process is still limited (IMO mainly due to GC), even though the underlying Linux kernel is smart enough to balance the load across the available cores.
Here are some potential improvements, which may be of interest to some of you. Perhaps Google has already tackled these issues in Honeycomb?

1. Heap contention

dvmMalloc() locks the global gcHeapLock before doing any real work. This is not only a synchronization point between the GC and one mutator thread but also a mutex among virtually all mutator threads (as operator "new" is used everywhere). In a dual-core system, two logically separate mutator threads with enough free memory to burn should ideally run on different cores without interfering with each other. In reality, whenever they "new", they can block each other. There are quite a few allocator and GC algorithms designed for multiprocessors. Is Dalvik going to embrace any of them?

2. GC critical sections

Concurrent GC is a great feature of Gingerbread, as a large part of the GC no longer needs to suspend all threads. This improves the responsiveness of mutator threads and also makes it possible to run the GC in parallel with mutator threads. However, other parts of the concurrent GC still have to make sure that all mutator threads have been suspended. In these sections, I believe some GC work can be done in parallel, but it has to be done very carefully. For example, I found that in dvmHeapMarkRootSet, dvmGcScanInternedStrings takes much longer to finish than the other functions invoked. Hence, we may want to create a worker thread to do this job and let it run in parallel with the other parts of dvmHeapMarkRootSet. This could make fuller use of the available cores and cut the time all mutator threads have to wait.

3. GC_EXTERNAL_ALLOC and GC_EXPLICIT

Is it really necessary to make the GC non-preemptible (and non-parallel) when the reason for the GC is GC_EXTERNAL_ALLOC or GC_EXPLICIT? In dvmCollectGarbageInternal, a very simple change could make a GC caused by GC_EXTERNAL_ALLOC or GC_EXPLICIT behave like GC_CONCURRENT (i.e. dvmHeapScanMarkedObjects is executed while mutator threads are also runnable).
To guarantee progress for such a GC, we can still raise the priority of the current thread; since we have multiple cores, the kernel can still schedule other threads on the other cores. I have already tried this and it seems to work.

In a multicore/multiprocessor system, the global heap lock and dvmSuspendAllThreads are far more expensive than in a single-core/single-processor system. Unfortunately, the current Dalvik GC uses both in too many places.

Any comments will be highly appreciated.

yang
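
To make the heap contention point (item 1) more concrete, here is a minimal standalone sketch of one well-known multiprocessor-friendly approach: per-thread allocation buffers. This is not Dalvik code. Every name in it (heapLock, LocalBuf, tlabAlloc, refill, runDemo) and the sizes are assumptions made up for illustration, and a real VM would probably keep the buffer in its own per-thread structure rather than in C11 thread-local storage.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE         (1u << 20)   /* 1 MiB toy heap (assumed size) */
#define CHUNK_SIZE        (16u << 10)  /* 16 KiB handed to a thread at a time */
#define ALLOCS_PER_THREAD 1000

static _Alignas(16) uint8_t heap[HEAP_SIZE];
static size_t heapTop = 0;
static pthread_mutex_t heapLock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long heapLockCount = 0;  /* how often the global lock was taken */

/* Hypothetical per-thread buffer: a private slice of the shared heap. */
typedef struct { uint8_t *cur; uint8_t *end; } LocalBuf;
static _Thread_local LocalBuf tlab;

/* Slow path: take the global lock and carve a fresh chunk for this thread. */
static int refill(void) {
    int ok = 0;
    pthread_mutex_lock(&heapLock);
    heapLockCount++;
    if (heapTop + CHUNK_SIZE <= HEAP_SIZE) {
        tlab.cur = heap + heapTop;
        tlab.end = tlab.cur + CHUNK_SIZE;
        heapTop += CHUNK_SIZE;
        ok = 1;
    }
    pthread_mutex_unlock(&heapLock);
    return ok;
}

/* Fast path: bump the thread's own pointer without touching the lock. */
static void *tlabAlloc(size_t size) {
    size = (size + 7) & ~(size_t)7;          /* round up to 8 bytes */
    if (size > CHUNK_SIZE) return NULL;      /* large objects need another path */
    if (tlab.cur == NULL || (size_t)(tlab.end - tlab.cur) < size) {
        if (!refill()) return NULL;          /* a real VM would start a GC here */
    }
    void *p = tlab.cur;
    tlab.cur += size;
    return p;
}

/* A mutator thread that allocates many small objects. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ALLOCS_PER_THREAD; i++) {
        if (tlabAlloc(32) == NULL) return (void *)1;
    }
    return NULL;
}

/* Runs two mutator threads; returns how many times the global lock was taken. */
static unsigned long runDemo(void) {
    pthread_t t[2];
    for (int i = 0; i < 2; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 2; i++) pthread_join(t[i], NULL);
    return heapLockCount;
}
```

In this toy setup the two threads make 2000 allocations in total but only take the global lock when a chunk runs out, instead of once per allocation. The cost is that the GC has to cope with partially filled chunks when it walks the heap, which this sketch ignores entirely.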

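Similarly, for the GC critical sections (item 2), the worker-thread idea could look roughly like the sketch below. Again this is not Dalvik code: RootSet, scanRoots and markRootsParallel are hypothetical stand-ins for the real root-marking steps, and the "roots" are just integers so the example stays self-contained.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one group of roots (e.g. an interned-string table). */
typedef struct {
    const int *roots;   /* non-zero entries count as live roots */
    size_t count;
    size_t marked;      /* result written only by the thread scanning this set */
} RootSet;

/* Stand-in for one root-marking step: counts the live roots in a set. */
static void scanRoots(RootSet *set) {
    set->marked = 0;
    for (size_t i = 0; i < set->count; i++) {
        if (set->roots[i] != 0) set->marked++;
    }
}

static void *scanWorker(void *arg) {
    scanRoots((RootSet *)arg);
    return NULL;
}

/*
 * Scan the expensive set on a helper thread while the caller scans the rest,
 * and join before leaving the section in which mutators are suspended.
 */
static size_t markRootsParallel(RootSet *internedStrings, RootSet *otherRoots) {
    pthread_t helper;
    int started = (pthread_create(&helper, NULL, scanWorker, internedStrings) == 0);

    scanRoots(otherRoots);                 /* runs in parallel with the helper */

    if (started) {
        pthread_join(helper, NULL);
    } else {
        scanRoots(internedStrings);        /* fall back to serial scanning */
    }
    return internedStrings->marked + otherRoots->marked;
}
```

Each thread here writes only to its own RootSet, so no locking is needed. Real marking updates a shared mark bitmap, so it would need atomic updates or per-thread mark state, and a long-lived worker would be cheaper than creating a thread on every GC.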