josh-fell commented on a change in pull request #18930: URL: https://github.com/apache/airflow/pull/18930#discussion_r730258954
########## File path: airflow/providers/apache/pinot/example_dags/example_pinot_dag.py ########## @@ -0,0 +1,48 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +"""Example DAG demonstrating the usage of the PinotAdminHook and PinotDbApiHook.""" + +from airflow import DAG +from airflow.providers.apache.pinot.hooks.pinot import PinotAdminHook, PinotDbApiHook +from airflow.utils.dates import days_ago + +with DAG( + dag_id='example_pinot_hook', + schedule_interval=None, + start_date=days_ago(2), + tags=['example'], +) as dag: + + # [START howto_operator_pinot_admin_hook] + run_this = PinotAdminHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + +) + # [END howto_operator_pinot_admin_hook] + + # [START howto_operator_pinot_dbapi_example] + + run_this = PinotDbApiHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + dag=dag, Review comment: This can be removed since the `DAG` object is being used as a context manager which takes care of passing this arg to all operators. 
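To make the point above concrete, here is a minimal sketch of how a context manager can hand itself to objects created inside its `with` block. It is a toy model only, not Airflow's actual implementation: `MiniDag`, `MiniTask`, and the `_active` stack are hypothetical names invented for illustration.

```python
"""Toy model (not Airflow's real code) of a DAG used as a context manager."""
from typing import List, Optional


class MiniDag:
    """Hypothetical stand-in for a DAG that tracks which instance is active."""

    # Stack of DAGs whose `with` block is currently open.
    _active: List["MiniDag"] = []

    def __init__(self, dag_id: str) -> None:
        self.dag_id = dag_id
        self.tasks: List["MiniTask"] = []

    def __enter__(self) -> "MiniDag":
        MiniDag._active.append(self)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        MiniDag._active.pop()

    @classmethod
    def current(cls) -> Optional["MiniDag"]:
        """Return the innermost active DAG, or None outside any `with` block."""
        return cls._active[-1] if cls._active else None


class MiniTask:
    """Hypothetical stand-in for an operator that attaches itself to a DAG."""

    def __init__(self, task_id: str, dag: Optional[MiniDag] = None) -> None:
        self.task_id = task_id
        # Fall back to the active DAG when no explicit `dag` is given.
        self.dag = dag if dag is not None else MiniDag.current()
        if self.dag is not None:
            self.dag.tasks.append(self)


with MiniDag(dag_id="example_pinot_hook") as dag:
    implicit = MiniTask(task_id="implicit")           # no dag= needed
    explicit = MiniTask(task_id="explicit", dag=dag)  # redundant but harmless

outside = MiniTask(task_id="outside")  # created after the block: no DAG attached
```

In this sketch both tasks created inside the block end up attached to the same object, which is why passing `dag=dag` explicitly adds nothing in that situation.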
########## File path: airflow/providers/apache/pinot/example_dags/example_pinot_dag.py ########## @@ -0,0 +1,48 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +"""Example DAG demonstrating the usage of the PinotAdminHook and PinotDbApiHook.""" + +from airflow import DAG +from airflow.providers.apache.pinot.hooks.pinot import PinotAdminHook, PinotDbApiHook +from airflow.utils.dates import days_ago + +with DAG( + dag_id='example_pinot_hook', + schedule_interval=None, + start_date=days_ago(2), + tags=['example'], +) as dag: + + # [START howto_operator_pinot_admin_hook] + run_this = PinotAdminHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + +) + # [END howto_operator_pinot_admin_hook] + + # [START howto_operator_pinot_dbapi_example] + + run_this = PinotDbApiHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + dag=dag, Review comment: Would it be possible to show different args for `pinot` and `pinot_options` parameters? Otherwise, seeing as the same args are used for every operator in the DAG, these ideally should be present as `default_args` at the DAG level. 
However, because the operator documentation is referencing the example DAG, moving the args to `default_args` loses the main context for showcasing the operators in the documentation. Updating the args to be different between the two would preclude the `default_args` use. I'm not sure if they need to be the same value when using these operators together in a pipeline though. ########## File path: docs/apache-airflow-providers-apache-pinot/operators.rst ########## @@ -0,0 +1,75 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +``airflow.providers.apache.pinot.hooks.pinot`` +============================================== + +Content +------- + +Apache Pinot Hooks +================== + +`Apache Pinot <https://pinot.apache.org/>`__ is a column-oriented, open-source,distributed data store written in Java. Pinot is designed to execute OLAP querieswith low latency. It is suited in contexts where fast analytics, such as aggregations,are needed on immutable data, possibly, with real-time data ingestion. + +Prerequisite +------------ + +.. To use Pinot hooks, you must configure :doc:`Pinot Connection <connections/pinot>`. + +.. 
_howto/operator:PinotHooks: + +PinotAdminHook +-------------- + +This hook is a wrapper around the pinot-admin.sh script. For now, only small subset of its subcommands are implemented, which are required to ingest offline data into Apache Pinot (i.e., AddSchema, AddTable, CreateSegment, and UploadSegment). Their command options are based on Pinot v0.1.0. + +Parameters +---------- + +For parameter definition, take a look at :class:`~airflow.providers.apache.pinot.hooks.pinot.PinotAdminHook` + +.. exampleinclude:: /../../airflow/providers/apache/pinot/example_dags/example_pinot_dag.py + :language: python + :start-after: [START howto_operator_pinot_admin_hook] + :end-before: [END howto_operator_pinot_admin_hook] + +Reference +^^^^^^^^^ +For more information,please see the documentation at `Apache Pinot improvements for PinotAdminHook<https://github.com/apache/incubator-pinot/pull/4110>` + +PinotDbApiHook +-------------- + +This hook uses standard-SQL endpoint. PQL endpoint is removed in updated versions of pinotdb library. For more information see `library documentation <https://docs.pinot.apache.org/users/clients/python>` + +Parameters +---------- + +For parameter definition, take a look at :class:`~classairflow.providers.apache.pinot.hooks.pinot.PinotDbApiHook` + +.. exampleinclude:: /../../airflow/providers/apache/pinot/example_dags/example_pinot_dag.py + :language: python + :start-after: [START howto_operator_pinot_dbapi_example] + :end-before: [END howto_operator_pinot_dbapi_example] + +Reference +^^^^^^^^^ + +For more information, please see the documentation at `Pinot documentation on querrying data <https://docs.pinot.apache.org/users/api/querying-pinot-using-standard-sql>` Review comment: "querrying" should be spelled "querying" ########## File path: airflow/providers/apache/pinot/example_dags/example_pinot_dag.py ########## @@ -0,0 +1,48 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. 
See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +"""Example DAG demonstrating the usage of the PinotAdminHook and PinotDbApiHook.""" + +from airflow import DAG +from airflow.providers.apache.pinot.hooks.pinot import PinotAdminHook, PinotDbApiHook +from airflow.utils.dates import days_ago + +with DAG( + dag_id='example_pinot_hook', + schedule_interval=None, + start_date=days_ago(2), Review comment: We are currently transitioning all of the example DAGs (and later docs) away from using `days_ago(n)` for `start_date` to a static value since it's best practice when writing pipelines. New examples should use a static `start_date` value. The updates have been using `datetime()`; the actual value doesn't matter. ########## File path: docs/apache-airflow-providers-apache-pinot/operators.rst ########## @@ -0,0 +1,75 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. 
Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +``airflow.providers.apache.pinot.hooks.pinot`` +============================================== + +Content +------- + +Apache Pinot Hooks +================== + +`Apache Pinot <https://pinot.apache.org/>`__ is a column-oriented, open-source,distributed data store written in Java. Pinot is designed to execute OLAP querieswith low latency. It is suited in contexts where fast analytics, such as aggregations,are needed on immutable data, possibly, with real-time data ingestion. Review comment: Small nit for spaces: > ... a column-oriented, open-source,distributed data store written in Java "... column-oriented, open-source, distributed data store written in Java" > ... such as aggregations,are needed "... such as aggregations, are needed ... ########## File path: docs/apache-airflow-providers-apache-pinot/operators.rst ########## @@ -0,0 +1,75 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
+ +``airflow.providers.apache.pinot.hooks.pinot`` +============================================== + +Content +------- + +Apache Pinot Hooks +================== + +`Apache Pinot <https://pinot.apache.org/>`__ is a column-oriented, open-source,distributed data store written in Java. Pinot is designed to execute OLAP querieswith low latency. It is suited in contexts where fast analytics, such as aggregations,are needed on immutable data, possibly, with real-time data ingestion. + +Prerequisite +------------ + +.. To use Pinot hooks, you must configure :doc:`Pinot Connection <connections/pinot>`. + +.. _howto/operator:PinotHooks: + +PinotAdminHook +-------------- + +This hook is a wrapper around the pinot-admin.sh script. For now, only small subset of its subcommands are implemented, which are required to ingest offline data into Apache Pinot (i.e., AddSchema, AddTable, CreateSegment, and UploadSegment). Their command options are based on Pinot v0.1.0. + +Parameters +---------- + +For parameter definition, take a look at :class:`~airflow.providers.apache.pinot.hooks.pinot.PinotAdminHook` + +.. exampleinclude:: /../../airflow/providers/apache/pinot/example_dags/example_pinot_dag.py + :language: python + :start-after: [START howto_operator_pinot_admin_hook] + :end-before: [END howto_operator_pinot_admin_hook] + +Reference +^^^^^^^^^ +For more information,please see the documentation at `Apache Pinot improvements for PinotAdminHook<https://github.com/apache/incubator-pinot/pull/4110>` Review comment: Another small nit for a space between "information,please" ########## File path: airflow/providers/apache/pinot/example_dags/example_pinot_dag.py ########## @@ -0,0 +1,48 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. 
The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +"""Example DAG demonstrating the usage of the PinotAdminHook and PinotDbApiHook.""" + +from airflow import DAG +from airflow.providers.apache.pinot.hooks.pinot import PinotAdminHook, PinotDbApiHook +from airflow.utils.dates import days_ago + +with DAG( + dag_id='example_pinot_hook', + schedule_interval=None, + start_date=days_ago(2), + tags=['example'], +) as dag: + + # [START howto_operator_pinot_admin_hook] + run_this = PinotAdminHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + +) + # [END howto_operator_pinot_admin_hook] + + # [START howto_operator_pinot_dbapi_example] + + run_this = PinotDbApiHook( + task_id="run_example_pinot_script", + pinot="ls /;", + pinot_options="-x local", + dag=dag, Review comment: The `dag=dag` can be removed since the `DAG` object is being used as a context manager which takes care of passing this arg to all operators.
