Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


liurenjie1024 merged PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


sdd commented on PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#issuecomment-2041050879

   @liurenjie1024 I've added a test as well now for build_accessors :-)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


liurenjie1024 commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554560769


##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   Sorry for misclarification, in fact I mean the `r#struct` method in 
`SchemaVisitor`: 
https://github.com/apache/iceberg-rust/blob/1c2a20b13e67e4f91a44d49fa1b0e3432cd48432/crates/iceberg/src/spec/schema.rs#L324
   
   But I agree that it could be left to do it later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554542369


##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   Sorry Renjie, I still don't follow! `visit_struct` already exists and has a 
different signature to that? 
   
   
https://github.com/apache/iceberg-rust/blob/4e89ac71c3ea77b9270b71dac52b5c3e50c36526/crates/iceberg/src/spec/schema.rs#L365



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554540831


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Box),
+Type(PrimitiveType),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: usize,
+r#type: PrimitiveType,
+inner: Option>,
+}
+
+pub(crate) type StructAccessorRef = Arc;
+
+impl StructAccessor {
+pub(crate) fn new(position: usize, r#type: PrimitiveType) -> Self {
+StructAccessor {
+position,
+r#type,
+inner: None,
+}
+}
+
+pub(crate) fn wrap(position: usize, inner: Box) -> Self {
+StructAccessor {
+position,
+r#type: inner.r#type().clone(),
+inner: Some(inner),
+}
+}
+
+pub(crate) fn position() -> usize {
+self.position
+}
+
+pub(crate) fn r#type() ->  {
+#type
+}
+
+pub(crate) fn get<'a>(&'a self, container: &'a Struct) -> Datum {
+match  {
+None => {
+let Literal::Primitive(literal) = [self.position] 
else {
+panic!("Expected Literal to be Primitive");

Review Comment:
   Sure, that's much nicer. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-06 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554540767


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {

Review Comment:
   Whoops! Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554520971


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Box),
+Type(PrimitiveType),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: usize,
+r#type: PrimitiveType,
+inner: Option>,
+}
+
+pub(crate) type StructAccessorRef = Arc;
+
+impl StructAccessor {
+pub(crate) fn new(position: usize, r#type: PrimitiveType) -> Self {
+StructAccessor {
+position,
+r#type,
+inner: None,
+}
+}
+
+pub(crate) fn wrap(position: usize, inner: Box) -> Self {
+StructAccessor {
+position,
+r#type: inner.r#type().clone(),
+inner: Some(inner),
+}
+}
+
+pub(crate) fn position() -> usize {
+self.position
+}
+
+pub(crate) fn r#type() ->  {
+#type
+}
+
+pub(crate) fn get<'a>(&'a self, container: &'a Struct) -> Datum {
+match  {
+None => {
+let Literal::Primitive(literal) = [self.position] 
else {
+panic!("Expected Literal to be Primitive");
+};
+
+Datum::new(self.r#type().clone(), literal.clone())
+}
+Some(inner) => {
+let Literal::Struct(wrapped) = [self.position] else {
+panic!("Nested accessor should only be wrapping a Struct");

Review Comment:
   Ditto.



##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   > Will do, re SchemaVisitor. Sorry Renjie, not sure what you mean with the 
second comment though?
   
   Sorry, I mean adding some unit test and example code in api to demonstrate 
its usage, like what python does here: 
https://github.com/apache/iceberg-python/blob/d3db8401cd006fcbacf1d9a2502248ce4228ff06/pyiceberg/schema.py#L1162



##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Datum, Literal, PrimitiveType, Struct};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {

Review Comment:
   Should we remove this?



##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by 

Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554425354


##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   `SchemaVisitor` won't work for this without some modifications. We need to 
know the index within `Schema.fields` for each field to create its accessor, 
and `visit_struct` does not provide this at the moment.
   
   `visit_struct` and `SchemaVisitor` could be modified to pass the field index 
to the visitor's methods. Can we keep this PR as-is and have a TODO so that we 
can move on with this working as-is and I or someone else can pick up this 
refactor in another PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554175122


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: i32,
+inner_or_type: InnerOrType,
+}
+
+pub(crate) type StructAccessorRef = Arc;
+
+impl StructAccessor {
+pub(crate) fn new(position: i32, r#type: Type) -> Self {
+StructAccessor {
+position,
+inner_or_type: InnerOrType::Type(r#type),
+}
+}
+
+pub(crate) fn wrap(position: i32, inner: StructAccessorRef) -> Self {
+StructAccessor {
+position,
+inner_or_type: InnerOrType::Inner(inner),
+}
+}
+
+pub fn position() -> i32 {
+self.position
+}
+
+fn r#type() ->  {
+match _or_type {
+InnerOrType::Inner(inner) => inner.r#type(),
+InnerOrType::Type(r#type) => r#type,
+}
+}
+
+fn get<'a>(&'a self, container: &'a Struct) ->  {

Review Comment:
   Ok, will update



##
crates/iceberg/src/spec/values.rs:
##
@@ -1141,6 +1141,11 @@ impl Struct {
 },
 )
 }
+
+/// Gets a ref to the field at `position` within the `Struct`
+pub fn get(, position: i32) ->  {

Review Comment:
   Good idea! Will do



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554176015


##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   Will do, re SchemaVisitor. Sorry Renjie, not sure what you mean with the 
second comment though?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554174651


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),

Review Comment:
   Wil do!



##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),

Review Comment:
   Will do!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554171392


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {

Review Comment:
   This was deliberate - in the Java implementation, `Accessor` is an interface 
and generic over `T`, with T usually being `StructLike`, and `StructLike` is 
itself an interface. Since we only have `Struct` at the moment , rather than 
several structs that implement a `StructLike` trait, I went for 
`StructAccessor`, since ours is closer to `Accessor` in the Java 
implementation, and this allows us to implement a more generic `Accessor` at 
some point in the future alongside `StructAccessor`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554163204


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: i32,
+inner_or_type: InnerOrType,

Review Comment:
   That duplicates the type at each level, but you have a point, we're 
optimising for time efficiency rather than space. Will fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1554161905


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: i32,

Review Comment:
   Sure, will change. I saw we'd used `i32` almost everywhere else so went with 
that but `usize` works better 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1553285045


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: i32,
+inner_or_type: InnerOrType,
+}
+
+pub(crate) type StructAccessorRef = Arc;
+
+impl StructAccessor {
+pub(crate) fn new(position: i32, r#type: Type) -> Self {
+StructAccessor {
+position,
+inner_or_type: InnerOrType::Type(r#type),
+}
+}
+
+pub(crate) fn wrap(position: i32, inner: StructAccessorRef) -> Self {
+StructAccessor {
+position,
+inner_or_type: InnerOrType::Inner(inner),
+}
+}
+
+pub fn position() -> i32 {
+self.position
+}
+
+fn r#type() ->  {
+match _or_type {
+InnerOrType::Inner(inner) => inner.r#type(),
+InnerOrType::Type(r#type) => r#type,
+}
+}
+
+fn get<'a>(&'a self, container: &'a Struct) ->  {

Review Comment:
   > Should we return `Datum` here? This better follows python/java 
implementation.
   
   unless we explicitly need `Literal` somewhere downstream (haven't checked 
python/java yet). I'd definitely +1 for returning `Datum` here, which should be 
easier to handle downstream.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#discussion_r1553151792


##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {
+position: i32,
+inner_or_type: InnerOrType,

Review Comment:
   This seems inefficient to me, each time we call `r#type`, it needs to 
recursively call inners for nested field. How about just following:
   ```rust
   r#type: Type,
   inner: Option
   ```



##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   How about using SchemaVisitor to build this?



##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),
+Type(Type),
+}
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+pub struct StructAccessor {

Review Comment:
   How about just name it `Accessor`? This is in fact field accessor.



##
crates/iceberg/src/expr/accessor.rs:
##
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::spec::{Literal, Struct, Type};
+use serde_derive::{Deserialize, Serialize};
+use std::sync::Arc;
+
+#[derive(Debug, Serialize, Deserialize, Clone, PartialEq, Eq)]
+enum InnerOrType {
+Inner(Arc),

Review Comment:
   ```suggestion
   Inner(Box),
   ```
   I think the inner accessor is not shared, so it would be more efficient to 
use `Box`



##
crates/iceberg/src/spec/schema.rs:
##
@@ -137,9 +142,55 @@ impl SchemaBuilder {
 name_to_id,
 lowercase_name_to_id,
 id_to_name,
+
+field_id_to_accessor,
 })
 }
 
+fn build_accessors() -> HashMap> {

Review Comment:
   Also it would be better to add some uts to show the result.



##
crates/iceberg/src/spec/values.rs:
##
@@ -1141,6 +1141,11 @@ impl Struct {
   

Re: [PR] Add Struct Accessors to BoundReferences [iceberg-rust]

2024-04-04 Thread via GitHub


sdd commented on PR #317:
URL: https://github.com/apache/iceberg-rust/pull/317#issuecomment-2038350682

   PTAL @liurenjie1024 and @marvinlanhenke - extracted from 
https://github.com/apache/iceberg-rust/pull/241 and added tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org