[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread via GitHub


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082911731


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   > Clearly IOx uses it
   
   That is the point I'm trying to make, IOx **doesn't** use it, at least not 
within any of its tests. A user theoretically could construct a query that 
directly compares dictionary columns, in practice there are extremely limited 
use-cases that come to mind of this.
   
   This feature was only enabled in 
https://github.com/apache/arrow-datafusion/pull/4168 prior to that it was 
disabled



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread via GitHub


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082911731


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   > Clearly IOx uses it
   
   That is the point I'm trying to make, IOx **doesn't** use it, at least not 
within any of its tests. A user theoretically could construct a query that 
directly compares dictionary columns, in practice there are extremely limited 
use-cases that come to mind of this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread via GitHub


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082911731


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   > Clearly IOx uses it
   
   That is the point I'm trying to make, IOx **doesn't** use it, at least not 
within any of its tests



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread GitBox


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082628665


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   I fairly strongly disagree, it is pretty esoteric. As a data point, none of 
IOx's integration tests require this, and we use dictionaries a LOT :smile: 
   
   It is important to highlight this isn't "dictionary support" but non-scalar, 
binary dictionary kernels which are pretty unusual in practice



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread GitBox


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082628665


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   I fairly strongly disagree, it is pretty esoteric. As a data point, none of 
IOx's integration tests require this, and we use dictionaries a LOT :smile: 
   
   It is important to highlight this isn't "dictionary support" but non-scalar, 
binary kernels which are pretty unusual in practice



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread GitBox


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082628665


##
datafusion/physical-expr/Cargo.toml:
##
@@ -35,12 +35,15 @@ path = "src/lib.rs"
 [features]
 crypto_expressions = ["md-5", "sha2", "blake2", "blake3"]
 default = ["crypto_expressions", "regex_expressions", "unicode_expressions"]

Review Comment:
   I fairly strongly disagree, it is pretty esoteric. As a data point, none of 
IOx's integration tests require this, and we use dictionaries a LOT :smile: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4999: Add dictionary_expresions feature (#4386)

2023-01-20 Thread GitBox


tustvold commented on code in PR #4999:
URL: https://github.com/apache/arrow-datafusion/pull/4999#discussion_r1082332500


##
datafusion/core/tests/path_partition.rs:
##
@@ -204,7 +204,7 @@ async fn csv_filter_with_file_col() -> Result<()> {
 );
 
 let result = ctx
-.sql("SELECT c1, c2 FROM t WHERE date='2021-10-27' and date!=c1 LIMIT 
5")
+.sql("SELECT c1, c2 FROM t WHERE date='2021-10-27' and 
c1!='2021-10-27' LIMIT 5")

Review Comment:
   I couldn't see a compelling reason why this test needed to test comparison 
of dictionary columns



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org