alamb commented on code in PR #18946: URL: https://github.com/apache/datafusion/pull/18946#discussion_r2589513456
##########
datafusion-examples/examples/builtin_functions/regexp.rs:
##########
@@ -32,12 +35,30 @@ use datafusion::prelude::*;
 /// https://docs.rs/regex/latest/regex/#grouping-and-flags
 pub async fn regexp() -> Result<()> {
     let ctx = SessionContext::new();
-    ctx.register_csv(
-        "examples",
-        "datafusion/physical-expr/tests/data/regex.csv",
-        CsvReadOptions::new(),
-    )
-    .await?;
+    // content from file 'datafusion/physical-expr/tests/data/regex.csv'
+    let csv_data = r#"values,patterns,replacement,flags
+abc,^(a),bb\1bb,i

Review Comment:
   Why inline this content? It is fine; I am just curious.

##########
ci/scripts/rust_example.sh:
##########
@@ -25,12 +25,26 @@ export CARGO_PROFILE_CI_STRIP=true
 cd datafusion-examples/examples/
 cargo build --profile ci --examples

-files=$(ls .)
-for filename in $files
-do
-  example_name=`basename $filename ".rs"`
-  # Skip tests that rely on external storage and flight
-  if [ ! -d $filename ]; then
-     cargo run --profile ci --example $example_name
-  fi
+SKIP_LIST=("external_dependency" "flight" "ffi")
+
+skip_example() {
+    local name="$1"
+    for skip in "${SKIP_LIST[@]}"; do
+        if [ "$name" = "$skip" ]; then
+            return 0
+        fi
+    done
+    return 1
+}
+
+for dir in */; do
+    example_name=$(basename "$dir")
+
+    if skip_example "$example_name"; then
+        echo "Skipping $example_name"
+        continue
+    fi
+
+    echo "Running example group: $example_name"

Review Comment:
   When I ran this script twice, I got an error the second time around:

   ```shell
   ./ci/scripts/rust_example.sh
   ./ci/scripts/rust_example.sh
   ```

   The second run produced this:

   ```
   Running example: deserialize_to_struct
   Running example group: datafusion-examples
   error: no example target named `datafusion-examples` in default-run packages
   help: available example targets:
       builtin_functions
       custom_data_source
       data_io
       dataframe
       execution_monitoring
       external_dependency
       flight
       proto
       query_planning
       sql_ops
       udf
   ```

##########
datafusion-examples/examples/builtin_functions/main.rs:
##########
@@ -67,12 +71,38 @@ impl FromStr for ExampleKind {
 }

 impl ExampleKind {
-    const ALL: [Self; 3] = [Self::DateTime, Self::FunctionFactory, Self::Regexp];
+    const ALL_VARIANTS: [Self; 4] = [

Review Comment:
   Looking at the amount of boilerplate code, I think we can use strum to do the same thing: https://crates.io/crates/strum

   I know that in general adding a new dependency is something we try to avoid, but given that strum is [already in the workspace](https://github.com/apache/datafusion/blob/f22a3f3955e667605c0ccbfd6e216f91f4f134ee/Cargo.lock#L6013-L6017), using it in examples seems reasonable to me.

   Specifically,
   - https://docs.rs/strum_macros/latest/strum_macros/derive.EnumIter.html
   - https://docs.rs/strum_macros/latest/strum_macros/derive.EnumString.html
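
The rerun failure reported for `ci/scripts/rust_example.sh` suggests that a `datafusion-examples` directory existed under `datafusion-examples/examples/` by the second run, and that `for dir in */` then treated it as an example group. The comment does not identify the cause, so the following is only a sketch of one possible guard. It assumes every real example group directory contains a `main.rs` (the diff shows this for `builtin_functions`; the other groups are unverified), and `list_example_groups` is a hypothetical helper name that is not part of the PR.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): print the names of the
# subdirectories of "$1" that look like example groups, i.e. that
# contain a main.rs. Stray directories are silently ignored.
list_example_groups() {
    local root="$1"
    local dir
    for dir in "$root"/*/; do
        # If the glob matched nothing it stays literal, so skip non-directories.
        [ -d "$dir" ] || continue
        # "$dir" ends with a slash, so this checks <group>/main.rs.
        if [ -f "${dir}main.rs" ]; then
            basename "$dir"
        fi
    done
}
```

The script's loop could then iterate over the output of `list_example_groups .` instead of `*/`, with the existing `skip_example` check left unchanged.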
