sarutak commented on PR #55912:
URL: https://github.com/apache/spark/pull/55912#issuecomment-4552548972

   @cloud-fan 
   A window rewrite cannot express some important cases:
   - Tolerance: RANGE BETWEEN <expr> PRECEDING requires a constant <expr>, so the row-dependent lower bound left.t - tolerance is not expressible as a window frame boundary.
   - Residual pair-correlated predicates: conditions that reference both left and right columns cannot be evaluated inside a window frame.
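
To make the two cases above concrete, here is a minimal pure-Python sketch of backward AS-OF matching with a tolerance and a residual predicate. It is not the implementation in this PR. The function name `asof_join`, the row layout (dicts with a `t` field), the sample data, and the choice to keep scanning earlier right rows when the residual predicate rejects a candidate are all assumptions made for illustration.

```python
from bisect import bisect_right


def asof_join(left, right, tolerance=None, residual=None):
    """Backward AS-OF match: for each left row, pick the latest right row
    with right.t <= left.t (hypothetical helper, illustration only)."""
    right_sorted = sorted(right, key=lambda r: r["t"])
    right_times = [r["t"] for r in right_sorted]
    pairs = []
    for l in left:
        # The lower bound depends on the current left row: left.t - tolerance.
        lower = None if tolerance is None else l["t"] - tolerance
        match = None
        # Index of the latest right row with t <= left.t (or -1 if none).
        i = bisect_right(right_times, l["t"]) - 1
        while i >= 0:
            r = right_sorted[i]
            if lower is not None and r["t"] < lower:
                break  # candidate is older than the tolerance allows
            # The residual predicate sees both sides of the candidate pair.
            if residual is None or residual(l, r):
                match = r
                break
            i -= 1  # assumption: fall back to the next earlier right row
        pairs.append((l, match))
    return pairs


# Hypothetical sample data: trades (left) and quotes (right).
trades = [
    {"id": 1, "t": 10, "qty": 5},
    {"id": 2, "t": 20, "qty": 5},
    {"id": 3, "t": 30, "qty": 50},
]
quotes = [
    {"t": 7, "size": 10, "px": 100.0},
    {"t": 9, "size": 2, "px": 101.0},
    {"t": 12, "size": 10, "px": 102.0},
    {"t": 28, "size": 10, "px": 103.0},
]

result = asof_join(
    trades,
    quotes,
    tolerance=5,
    residual=lambda l, r: r["size"] >= l["qty"],
)
```

In this sketch, both the lower bound and the residual check are computed per candidate pair inside the matching loop, which is the part a constant window frame boundary does not capture.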
   
   Tolerance in particular is commonly used in practice (financial tick matching within N seconds, IoT sensor correlation within a time window, etc.).
   
   Also, other popular data processing systems have a dedicated operator or code path for AS-OF joins:
   
   * ClickHouse
     * https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/RowRefs.h
     * https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/RowRefs.cpp
   * DuckDB
     * https://github.com/duckdb/duckdb/blob/main/src/execution/operator/join/physical_asof_join.cpp
   * QuestDB
     * https://github.com/questdb/questdb/blob/master/core/src/main/java/io/questdb/griffin/engine/join/AsOfJoinLightRecordCursorFactory.java
   * Polars
     * https://github.com/pola-rs/polars/tree/main/crates/polars-ops/src/frame/join/asof
   * Pandas
     * https://github.com/pandas-dev/pandas/blob/main/pandas/core/reshape/merge.py
   * Snowflake
     * https://www.greybeam.ai/blog/snowflake-asof-join


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

