yjshen commented on issue #1708:
URL: 
https://github.com/apache/arrow-datafusion/issues/1708#issuecomment-1027779249


   After checking the code and docs of three existing systems, their row 
layouts are:
   
   **PostgreSQL:**  var-length tuple
   - null-bits first (byte aligned)
   - store **all** attributes sequentially, 
       - add **extra padding if needed** before each attribute
           -  E.g. table A (bool, char, int32): no padding between bool and 
char since both are 1-byte aligned, but 2 bytes of padding after char and 
before int32, since int32 is 4-byte aligned.
       -  store var-length attributes in place (length first, then content), 
unless the value is too large and is moved out of line ("TOAST" in PostgreSQL's 
terms). 
   - **Value access:** the most expensive of the three, O(n): every preceding 
attribute of the tuple must be visited to account for its padding/length before 
the start offset of an attribute can be deduced.
       
   Check [Data Alignment in 
PostgreSQL](https://www.enterprisedb.com/postgres-tutorials/data-alignment-postgresql),
 [Column Storage 
Internals](https://momjian.us/main/blogs/pgblog/2017.html#March_15_2017), 
[CodeSample in Page16](https://momjian.us/main/writings/pgsql/inside_shmem.pdf) 
for more details.
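
The O(n) walk described above can be sketched as follows. This is a minimal 
illustration, not PostgreSQL's code: the `TYPE_INFO` table and the function 
names are hypothetical, offsets are relative to the start of the attribute 
data, and the null bits and var-length attributes are left out.

```python
# Hypothetical (size, alignment) pairs in bytes for a few fixed-width types.
TYPE_INFO = {
    "bool": (1, 1),
    "char": (1, 1),
    "int16": (2, 2),
    "int32": (4, 4),
    "int64": (8, 8),
}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def attribute_offset(schema, index):
    """Visit attributes 0..index in order, padding each one, to find where
    attribute `index` starts. The cost grows linearly with `index`."""
    offset = 0
    for i, type_name in enumerate(schema):
        size, alignment = TYPE_INFO[type_name]
        offset = align_up(offset, alignment)  # padding before this attribute
        if i == index:
            return offset
        offset += size
    raise IndexError(index)


schema = ["bool", "char", "int32"]
offsets = [attribute_offset(schema, i) for i in range(len(schema))]
print(offsets)  # [0, 1, 4]: 2 bytes of padding sit between char and int32
```

With var-length attributes in the tuple, the walk would also have to read each 
stored length, which is why the offset cannot be precomputed per schema.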
   
   **DuckDB:** fixed-length tuple
   - null-bits first (byte aligned)
   - store fixed-sized attributes sequentially. For var-length attributes, 
store an 8-byte pointer (on x64)
        - **no padding between** attributes
        - var-length attribute pointer
              - points into a separate store called the "row heap".
              - In that heap, var-length attributes/strings for one tuple 
are stored contiguously.
   - **Value access:** An extra `vector<idx_t> offsets` is employed to achieve 
O(1) simple attr access, and O(1 + 1) var-len-attr access (one read for the 
pointer, one for the heap data).
        
   Check [Source 
Code](https://github.com/duckdb/duckdb/blob/master/src/common/types/row_layout.cpp#L32-L66)
 and [a related blog post/external sorting 
section](https://duckdb.org/2021/08/27/external-sorting.html) for more details.
   
   **SparkSQL:** var-length tuple 
   - null-bits first (8-byte aligned)
   - store each attribute sequentially, **8-byte aligned for each** attribute
       - for a var-length attribute, pack (offset + length) into 8 bytes and 
store it in place; store the actual var-length data after all fixed fields (the 
var-len data itself is again 8-byte aligned)
   - **Value access:** No extra structure needed, O(1) for simple attr access, 
O(1+1) for var-len-attr access.
       
   Check [Source 
Code](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java#L46-L61)
 for more details.
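
The arithmetic behind this layout can be sketched as below. The function names 
are illustrative rather than Spark's API, and the sketch assumes the offset is 
packed into the upper 32 bits of the 8-byte slot, the length into the lower 32 
bits, and that offsets are relative to the start of the row.

```python
def bitset_width_in_bytes(num_fields):
    """Null bits are stored in 8-byte words, one bit per field."""
    return ((num_fields + 63) // 64) * 8


def field_offset(num_fields, ordinal):
    """Every field owns one 8-byte slot after the null bits, so access is O(1)."""
    return bitset_width_in_bytes(num_fields) + ordinal * 8


def pack_offset_and_size(offset, size):
    """Pack a var-length value's offset and byte length into one 8-byte word."""
    return (offset << 32) | size


def unpack_offset_and_size(word):
    return word >> 32, word & 0xFFFFFFFF


def round_up_to_8(n):
    return (n + 7) // 8 * 8


# Example row with 3 fields (int, string, long); the string is b"hello".
num_fields = 3
fixed_end = field_offset(num_fields, num_fields)  # var-length region starts here
word = pack_offset_and_size(fixed_end, len(b"hello"))
row_size = fixed_end + round_up_to_8(len(b"hello"))

print(field_offset(num_fields, 1))   # 16: slot of the string field
print(unpack_offset_and_size(word))  # (32, 5)
print(row_size)                      # 40
```

Reading the string takes one read of its slot plus one read at the decoded 
offset, which matches the O(1+1) access noted above.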
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
