Adapting TimSort into AsterixDB/Hyracks

2017-10-27 Thread Chen Luo
Hi devs,

I have adapted the TimSort algorithm used in the JDK (java.util.TimSort) into
Hyracks, which gives a 10-20% performance improvement on random data. It
should be even more useful when the input data is partially sorted, e.g.,
primary keys fetched from a secondary index scan, which I haven't had time to
experiment with yet.
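
As a rough illustration of why partially sorted input should help, here is a
minimal sketch that counts comparisons rather than measuring time. It uses the
JDK's own Arrays.sort for object arrays (which is TimSort by default), not the
Hyracks adaptation, and the class and method names (PartiallySortedDemo,
countComparisons, randomKeys, keysInSortedRuns) are hypothetical names chosen
for this example. The "sorted runs" input only approximates what a secondary
index scan might return.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical demo class; not part of Hyracks or the JDK.
class PartiallySortedDemo {

    // Sorts a copy of the keys with the JDK's object sort (TimSort by default)
    // and returns how many comparisons were made.
    static long countComparisons(Integer[] keys) {
        long[] counter = {0};
        Comparator<Integer> counting = (a, b) -> {
            counter[0]++;
            return Integer.compare(a, b);
        };
        Integer[] copy = Arrays.copyOf(keys, keys.length);
        Arrays.sort(copy, counting);
        return counter[0];
    }

    // Uniformly random keys; the fixed seed keeps runs repeatable.
    static Integer[] randomKeys(int n, long seed) {
        Random rnd = new Random(seed);
        Integer[] keys = new Integer[n];
        for (int i = 0; i < n; i++) {
            keys[i] = rnd.nextInt();
        }
        return keys;
    }

    // Random keys rearranged into consecutive ascending runs, an assumed
    // stand-in for partially sorted primary keys.
    static Integer[] keysInSortedRuns(int n, int runs, long seed) {
        Integer[] keys = randomKeys(n, seed);
        int runLength = Math.max(1, n / Math.max(1, runs));
        for (int start = 0; start < n; start += runLength) {
            Arrays.sort(keys, start, Math.min(start + runLength, n));
        }
        return keys;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long onRandom = countComparisons(randomKeys(n, 42L));
        long onRuns = countComparisons(keysInSortedRuns(n, 16, 42L));
        System.out.println("comparisons on random keys:  " + onRandom);
        System.out.println("comparisons on 16 sorted runs: " + onRuns);
    }
}
```

Comparison counts are only a proxy for runtime, so this does not replace an
actual experiment inside Hyracks.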

*Before going any further, is it legal to adapt an algorithm
implementation from the JDK into our codebase?* I saw that the JDK
implementation itself is adapted from
http://svn.python.org/projects/python/trunk/Objects/listsort.txt as well.

Best regards,
Chen Luo


Weekly status

2017-10-27 Thread abdullah alamoudi
Hi Everyone,
I will not be able to attend this week's meeting. Here is my status:

1. Make Active job resume attempt on the same suspend/resume thread.
This prevents the metadata locking issue explained in the commit message.
2. Set default dataverse in MetadataProvider and prevent null values inside
AMutableString.
This prevents an NPE during compilation in certain contexts. Again, see the
commit message.
3. More performance tests, specifically testing the behavior of flushes/merges
under the prefix merge policy.
4. Reviewed changes:
- BloomFilter check change.
- Refactor Waiting For Dataset IO Ops
- Feed pipeline refactoring for SQL++


That is about it,
Abdullah.