Adapting TimSort into AsterixDB/Hyracks
Hi devs,

I have adapted the TimSort algorithm used in the JDK (java.util.TimSort) into Hyracks, which gives a 10-20% performance improvement on random data. It should be even more useful when the input data is partially sorted, e.g., primary keys fetched from a secondary index scan, but I haven't had time to experiment with that case yet.

*Before going any further, is it legal to adapt an algorithm implementation from the JDK into our codebase?* I saw that the JDK implementation is itself adapted from http://svn.python.org/projects/python/trunk/Objects/listsort.txt as well.

Best regards,
Chen Luo
Weekly status
Hi Everyone,

I will not be able to attend this week's meeting. Here is my status:

1. Make the Active job resume attempt run on the same suspend/resume thread. This prevents the Metadata locking issue explained in the commit message.
2. Set the default dataverse in MetadataProvider and prevent null values inside AMutableString. This prevents an NPE during compilation in certain contexts. Again, see the commit message.
3. More performance tests, specifically testing the flush/merge behavior of the prefix merge policy.
4. Reviewed changes:
   - BloomFilter check change.
   - Refactor Waiting For Dataset IO Ops
   - Feed pipeline refactoring for SQL++

That is about it,
Abdullah.