I think this idea is an exciting prospect. Here are the specific needs our company foresees for this feature:
1. **Transition to Paimon Tables from Hive ODS Tables**: Our current system has a large number of Hive ODS tables partitioned by day. Each partition holds the complete business data sourced directly from MySQL. We are considering an in-place transition to Paimon tables, for two reasons. First, it would avoid modifying the SQL in our large body of existing Hive batch processing logic. Second, it would give us near-real-time data access, cutting the delay to minutes, along with stream reading.

2. **Integration with Historical Hive Partitions**: Our Hive tables have accumulated over a thousand historical partitions. Ideally, a view table would combine a Paimon table with these historical Hive partitions. A query against the view would then be directed to the Paimon tag when a tag is present, and to the historical Hive partition when it is absent.

3. **Tag-Based Processing with 'dt'**: Our tags are keyed on the 'dt' parameter. Queries over these tags should support range predicates such as "between and", comparisons such as greater than or less than, and group by on the tag. For example, the system should handle queries like:

```SQL
SELECT dt, COUNT(*)
FROM table
WHERE dt BETWEEN a AND b
GROUP BY dt
```

Best,
ZhuoyuChen

Jingsong Li <[email protected]> wrote on Fri, Aug 25, 2023 at 13:58:

> Hi, devs.
>
> Now, Paimon supports tags, which provide a snapshot view for time travel.
> This can serve as something similar to a partitioned table, to replace
> Hive full partitioned tables and incremental partitioned tables.
>
> But this requires users to change their SQL to use time travel, and it is
> not good to use time travel in Hive SQL now.
>
> So I plan to create a new feature, view table: we can create a view table
> that maps a non-partitioned table to a partitioned table whose partition
> field is the tag. This feature can make a Paimon table 100% compatible
> with an old Hive table.
>
> What do you think?
>
> Any requirements?
>
> Best,
> Jingsong
