Dear Mike,

Thank you for providing the paper on the storage architecture of AsterixDB. After a thorough review, I now have a much clearer understanding of the architecture and its underlying concepts.

As I transition into the codebase, I am looking for guidance on how specific operational workflows are implemented within Hyracks. Specifically, I would like to understand the best practices for:

- Parallel Processing: How to manage the concurrent fetching of multiple relevant R-tree roots from disk partitions based on a query.
- Component Management: Handling anti-matter trees and utilizing Bloom filters across disk components.
- Cursor Implementation: Assigning sub-cursors at each disk component for data processing.
- Buffer Cache Interaction: The logic for fetching these components into the buffer cache for active processing.

Could you please point me toward the relevant modules or classes in the codebase that demonstrate these patterns? Any further documentation or advice on navigating the data flow for these structures would be greatly appreciated.

Best regards,
Tejesh Sakhamuri
