duongcongtoai commented on issue #9394:
URL: https://github.com/apache/datafusion/issues/9394#issuecomment-2212342175
I took a look at Postgres's cross join behavior. The order of the result
does not look deterministic there either: the planner is free to decide which
table is smaller and should be materialized in memory, and the order of the
output follows from that decision.
Example: create table_a with 3 rows, plus table_b and table_b_small with the
same schema, where table_b has 10k rows and table_b_small has 2 rows.
```
CREATE TABLE table_a (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50)
);

CREATE TABLE table_b (
    id SERIAL PRIMARY KEY,
    description VARCHAR(50)
);

CREATE TABLE table_b_small (
    id SERIAL PRIMARY KEY,
    description VARCHAR(50)
);

-- 3 rows
INSERT INTO table_a (name) VALUES
    ('Alice'),
    ('Bob'),
    ('Charlie');

-- 2 rows
INSERT INTO table_b_small (description) VALUES
    ('Description 1'),
    ('Description 2');

-- 10k rows
DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        INSERT INTO table_b (description)
        VALUES ('Description ' || i);
    END LOOP;
END $$;
```
```
postgres=# select a.name, b.description from table_a as a, table_b as b;
  name   |    description
---------+-------------------
 Alice   | Description 1
 Bob     | Description 1
 Charlie | Description 1
 Alice   | Description 2
 Bob     | Description 2
 Charlie | Description 2
postgres=# select a.name, b.description from table_a as a, table_b_small as b;
  name   |  description
---------+---------------
 Alice   | Description 1
 Alice   | Description 2
 Bob     | Description 1
 Bob     | Description 2
 Charlie | Description 1
 Charlie | Description 2
```
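To make the effect concrete, here is a toy Python model of a nested-loop
cross join that materializes the smaller input as the inner side and streams
the larger one as the outer side. This is only a sketch of the idea, not how
Postgres actually plans: the function name `nested_loop_cross_join` is
hypothetical, and the size check on `len()` stands in for the planner's cost
estimates. It reproduces the row order of both queries above (the first query
returns 3 x 10000 = 30000 rows in total).

```python
def nested_loop_cross_join(left, right):
    """Toy cross join: the smaller input becomes the in-memory inner side.

    Assumption: a plain size comparison stands in for a real planner's
    cost-based choice. Output tuples are always (left_row, right_row).
    """
    if len(right) <= len(left):
        # Right side is smaller: stream left, loop over right for each row.
        for l in left:
            for r in right:
                yield (l, r)
    else:
        # Left side is smaller: stream right, loop over left for each row.
        for r in right:
            for l in left:
                yield (l, r)


names = ["Alice", "Bob", "Charlie"]                      # table_a
small = ["Description 1", "Description 2"]               # table_b_small
big = [f"Description {i}" for i in range(1, 10001)]      # table_b

# Same set of pairs either way; only the order differs.
small_rows = list(nested_loop_cross_join(names, small))
big_rows = list(nested_loop_cross_join(names, big))

print(small_rows[:3])  # Alice/1, Alice/2, Bob/1
print(big_rows[:4])    # Alice/1, Bob/1, Charlie/1, Alice/2
```

The set of result rows is identical in both branches, so any ordering
guarantee has to come from an explicit ORDER BY rather than from the join
itself.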