SemyonSinchenko commented on code in PR #616:
URL: https://github.com/apache/incubator-graphar/pull/616#discussion_r1818024743


##########
cli/src/graphar_cli/config.py:
##########
@@ -0,0 +1,253 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from logging import getLogger
+from pathlib import Path
+from typing import Dict, List, Literal, Optional
+
+from pydantic import BaseModel, field_validator, model_validator
+from typing_extensions import Self
+
+logger = getLogger("graphar_cli")
+
+# TODO: convert to constants

Review Comment:
   I would expect to see these variables in UPPER_CASE



##########
cli/src/graphar_cli/logging.py:
##########
@@ -0,0 +1,40 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import logging
+from typing import Union

Review Comment:
   I thing it should be moved to the `TYPE_CHECKING` block.



##########
cli/pyproject.toml:
##########
@@ -0,0 +1,57 @@
+[build-system]
+requires = ["scikit-build-core>=0.3.3", "pybind11"]
+build-backend = "scikit_build_core.build"
+
+
+[project]
+name = "graphar_cli"
+version = "0.0.1"
+description = "GraphAr command line tool"
+readme = "README.md"
+authors = [{ name = "GraphAr community", email = "d...@graphar.apache.org" }]
+requires-python = ">=3.7"
+dependencies = ["typer ~= 0.12.3", "pydantic ~= 2.8.2", "pyyaml ~= 6.0.2"]

Review Comment:
   Why did we pin minor versions? As I remember, all these projects are 
following semantic-versioning and there shouldn't be a problem if we pin only 
major version.



##########
cli/src/graphar_cli/config.py:
##########
@@ -0,0 +1,253 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from logging import getLogger
+from pathlib import Path
+from typing import Dict, List, Literal, Optional

Review Comment:
   The most modern way is to add `from __future__ import annotations` and use 
`list` and `dict` instead. See 
[here](https://dev.to/derlin/python-type-hints-and-future-annotations-4kc5#introducing-pep-563-or-what-lies-behind-future-annotations)
 for details.



##########
cli/src/graphar_cli/config.py:
##########
@@ -0,0 +1,253 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from logging import getLogger
+from pathlib import Path
+from typing import Dict, List, Literal, Optional
+
+from pydantic import BaseModel, field_validator, model_validator
+from typing_extensions import Self

Review Comment:
   For performance reasons all the typing parts should be moved to the `if 
TYPE_CHEKING:` block. See 
[here](https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING) for 
details.



##########
cli/pyproject.toml:
##########
@@ -0,0 +1,57 @@
+[build-system]
+requires = ["scikit-build-core>=0.3.3", "pybind11"]
+build-backend = "scikit_build_core.build"
+
+
+[project]
+name = "graphar_cli"
+version = "0.0.1"
+description = "GraphAr command line tool"
+readme = "README.md"
+authors = [{ name = "GraphAr community", email = "d...@graphar.apache.org" }]
+requires-python = ">=3.7"
+dependencies = ["typer ~= 0.12.3", "pydantic ~= 2.8.2", "pyyaml ~= 6.0.2"]
+
+[project.optional-dependencies]
+test = ["pandas ~= 2.2.2"]
+
+[project.scripts]
+graphar-cli = "graphar_cli.graphar_cli:main"

Review Comment:
   Is a `graphar-cli` a good name? I would expect something like just `graphar`



##########
cli/src/graphar_cli/importer.py:
##########
@@ -0,0 +1,122 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from logging import getLogger
+
+from .config import ImportConfig

Review Comment:
   I thing it can be moved to the `TYPE_CHECKING` block, isn't? I do not see 
runtime calls to this class



##########
cli/src/graphar_cli/config.py:
##########
@@ -0,0 +1,253 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from logging import getLogger
+from pathlib import Path
+from typing import Dict, List, Literal, Optional
+
+from pydantic import BaseModel, field_validator, model_validator
+from typing_extensions import Self
+
+logger = getLogger("graphar_cli")
+
+# TODO: convert to constants
+default_file_type = "parquet"
+default_adj_list_type = "ordered_by_source"
+default_regular_separator = "_"
+default_validate_level = "weak"
+default_version = "gar/v1"
+support_file_types = {"parquet", "orc", "csv", "json"}
+
+
+class GraphArConfig(BaseModel):
+    path: str
+    name: str
+    vertex_chunk_size: Optional[int] = 100
+    edge_chunk_size: Optional[int] = 1024
+    file_type: Literal["parquet", "orc", "csv", "json"] = default_file_type
+    adj_list_type: Literal[
+        "ordered_by_source", "ordered_by_dest", "unordered_by_source", 
"unordered_by_dest"
+    ] = default_adj_list_type
+    validate_level: Literal["no", "weak", "strong"] = default_validate_level
+    version: Optional[str] = default_version
+
+    @field_validator("path")
+    def check_path(cls, v):
+        path = Path(v).resolve().absolute()
+        if not path.exists():
+            path.mkdir(parents=True, exist_ok=True)
+        elif any(path.iterdir()):
+            msg = f"Warning: Path {v} already exists and contains files."
+            logger.warning(msg)
+        return v
+
+
+class Property(BaseModel):

Review Comment:
   Why we re-define all this stuff here instead of generating the code from 
[protobuf](https://github.com/apache/incubator-graphar/blob/main/format/property_group.proto#L29)?
 How are we going to sync these definitions with protobuf in the future?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@graphar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@graphar.apache.org
For additional commands, e-mail: commits-h...@graphar.apache.org

Reply via email to