Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
marvinlanhenke commented on PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#issuecomment-2066180837 > This looks great! > > I think @marvinlanhenke has a valid concern on when to apply the rewrite-not. Let's defer that discussion when we start wiring everything together. This look good, thanks @sdd for working on this and thanks @marvinlanhenke for the review Thanks @sdd @Fokko ... I think its reasonable to defer the discussion, once we tie everything together. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
Fokko merged PR #335: URL: https://github.com/apache/iceberg-rust/pull/335 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1571889478 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { Review Comment: If you do it that way, you invoke `rewrite-not` every time that you call `project`, which could be several times per invocation of `plan_files` in a table scan, and `plan_files` itself can get invoked any number of times on a particular instance of a `TableScan`. In the proposed design `rewrite-not` only ever gets called once, when the `TableScan` gets built, guaranteeing that the predicate is not-free already by the time it gets to the `InclusiveProjection`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#issuecomment-2065786584 I agree that rewrite-not should get applied. But it is not the responsibility of the `InclusiveProjection` itself. In this design, it happens already earlier on on the process, at the point where the `TableScan` gets built. See here, in my other open PR: https://github.com/apache/iceberg-rust/pull/323/files#diff-bbfbe5e334be6c501ba2ca0ddd84d658ff0f3f84f2b5b532212f4e096a09e09bR76 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
marvinlanhenke commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570491035 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { Review Comment: > As per comment below, `rewrite_not` should have already been applied Maybe, I'm wrong here - but from my understanding looking at the python impl the flow is: - create [manifest evaluator](https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/__init__.py#L1631-L1641) - for that we need to create partition_filters -> thus `inclusive_projection` - class `InclusiveProjection` extends `ProjectionEvaluator` which defines the method `project` - in the method `project` we [rewrite-not, bind, and then evaluate (visit)](https://github.com/apache/iceberg-python/blob/main/pyiceberg/expressions/visitors.py#L804-L810) so shouldn't `project` in our `InclusiveProjection` accept a `Predicate` as a param and then before calling `visit` apply rewrite-not and bind? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#issuecomment-2063172829 FAO @Fokko @marvinlanhenke @liurenjie1024: Comments addressed, ready for re-review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570118932 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { Review Comment: As per comment below, `rewrite_not` should have already been applied by the time we attempt an `InclusiveProjection`. I've updated to `panic` if a Not operator is encountered during projection. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570116819 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { +visit(self, predicate) +} + +fn get_parts( + self, +reference: , +predicate: , +) -> Result { +let field_id = reference.field().id; + +// This could be made a bit neater if `try_reduce` ever becomes stable +self.get_parts_for_field_id(field_id) +.iter() +.try_fold(Predicate::AlwaysTrue, |res, part| { +Ok( +if let Some(pred_for_part) = part.transform.project(, predicate)? { +if res == Predicate::AlwaysTrue { +pred_for_part +} else { +res.and(pred_for_part) +} +} else { +res +}, +) +}) +} +} + +impl BoundPredicateVisitor for InclusiveProjection { +type T = Predicate; + +fn always_true( self) -> crate::Result { +Ok(Predicate::AlwaysTrue) +} + +fn always_false( self) -> crate::Result { +Ok(Predicate::AlwaysFalse) +} + +fn and( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.and(rhs)) +} + +fn or( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.or(rhs)) +} + +fn not( self, inner: Self::T) -> crate::Result { +Ok(inner.not()) Review Comment: Updated to `panic` here with an informative message. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1570115947 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { +visit(self, predicate) +} + +fn get_parts( + self, +reference: , +predicate: , +) -> Result { +let field_id = reference.field().id; + +// This could be made a bit neater if `try_reduce` ever becomes stable +self.get_parts_for_field_id(field_id) +.iter() +.try_fold(Predicate::AlwaysTrue, |res, part| { +Ok( +if let Some(pred_for_part) = part.transform.project(, predicate)? { +if res == Predicate::AlwaysTrue { +pred_for_part +} else { +res.and(pred_for_part) +} +} else { +res +}, +) +}) +} +} + +impl BoundPredicateVisitor for InclusiveProjection { +type T = Predicate; + +fn always_true( self) -> crate::Result { +Ok(Predicate::AlwaysTrue) +} + +fn always_false( self) -> crate::Result { +Ok(Predicate::AlwaysFalse) +} + +fn and( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.and(rhs)) +} + +fn or( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.or(rhs)) +} + +fn not( self, inner: Self::T) -> crate::Result { +Ok(inner.not()) +} + +fn is_null( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn not_null( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn is_nan( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn not_nan( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn less_than( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn less_than_or_eq( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn greater_than( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn greater_than_or_eq( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
Fokko commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1568751884 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { Review Comment: > However, this implies that the InclusiveProjection operates on Predicate rather than BoundPredicate. This means we have to not only apply rewrite_not but also bind before we call visit here. InclusiveProjection runs on a BoundPredicate. So the order is: rewrite-not, bind, and then evaluate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
Fokko commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1568763855 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { +visit(self, predicate) +} + +fn get_parts( + self, +reference: , +predicate: , +) -> Result { +let field_id = reference.field().id; + +// This could be made a bit neater if `try_reduce` ever becomes stable +self.get_parts_for_field_id(field_id) +.iter() +.try_fold(Predicate::AlwaysTrue, |res, part| { +Ok( +if let Some(pred_for_part) = part.transform.project(, predicate)? { +if res == Predicate::AlwaysTrue { +pred_for_part +} else { +res.and(pred_for_part) +} +} else { +res +}, +) +}) +} +} + +impl BoundPredicateVisitor for InclusiveProjection { +type T = Predicate; + +fn always_true( self) -> crate::Result { +Ok(Predicate::AlwaysTrue) +} + +fn always_false( self) -> crate::Result { +Ok(Predicate::AlwaysFalse) +} + +fn and( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.and(rhs)) +} + +fn or( self, lhs: Self::T, rhs: Self::T) -> crate::Result { +Ok(lhs.or(rhs)) +} + +fn not( self, inner: Self::T) -> crate::Result { +Ok(inner.not()) +} + +fn is_null( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn not_null( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn is_nan( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn not_nan( + self, +reference: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn less_than( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn less_than_or_eq( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn greater_than( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +fn greater_than_or_eq( + self, +reference: , +_literal: , +predicate: , +) -> crate::Result { +self.get_parts(reference, predicate) +} + +
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
marvinlanhenke commented on code in PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#discussion_r1568705676 ## crates/iceberg/src/expr/visitors/inclusive_projection.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +use crate::expr::visitors::bound_predicate_visitor::{visit, BoundPredicateVisitor}; +use crate::expr::{BoundPredicate, BoundReference, Predicate}; +use crate::spec::{Datum, PartitionField, PartitionSpecRef}; +use crate::Error; +use fnv::FnvHashSet; +use std::collections::HashMap; +use std::ops::Not; + +pub(crate) struct InclusiveProjection { +partition_spec: PartitionSpecRef, +cached_parts: HashMap>, +} + +impl InclusiveProjection { +pub(crate) fn new(partition_spec: PartitionSpecRef) -> Self { +Self { +partition_spec, +cached_parts: HashMap::new(), +} +} + +fn get_parts_for_field_id( self, field_id: i32) -> { +if let std::collections::hash_map::Entry::Vacant(e) = self.cached_parts.entry(field_id) { +let mut parts: Vec = vec![]; +for partition_spec_field in _spec.fields { +if partition_spec_field.source_id == field_id { +parts.push(partition_spec_field.clone()) +} +} + +e.insert(parts); +} + +_parts[_id] +} + +pub(crate) fn project( self, predicate: ) -> crate::Result { Review Comment: I've taken a closer look and I think we're missing a thing here? I believe we have to rewrite the Predicate with `rewrite_not`, since it's assumed the ExprTree does not contain any NOT nodes. https://github.com/apache/iceberg-python/blob/main/pyiceberg/expressions/visitors.py#L805-L810 However, this implies that the `InclusiveProjection` operates on `Predicate` rather than `BoundPredicate`. This means we have to not only apply `rewrite_not` but also `bind` before we call `visit` here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on PR #335: URL: https://github.com/apache/iceberg-rust/pull/335#issuecomment-2059682467 @liurenjie1024 @Fokko @marvinlanhenke @ZENOTME PTAL, I've refactored this on top of the `BoundPredicateVisitor` that was merged a few hours ago and it is ready for review -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on PR #321: URL: https://github.com/apache/iceberg-rust/pull/321#issuecomment-2059636629 Closing this one in favour of the alternate PR based on the BoundPredicateVisitor design that got merged (https://github.com/apache/iceberg-rust/pull/335) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd closed pull request #321: add `InclusiveProjection` Visitor URL: https://github.com/apache/iceberg-rust/pull/321 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org
Re: [PR] add `InclusiveProjection` Visitor [iceberg-rust]
sdd commented on PR #321: URL: https://github.com/apache/iceberg-rust/pull/321#issuecomment-2041438110 Updated to align with latest `BoundPredicateVisitor` iteration in https://github.com/apache/iceberg-rust/pull/320 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org