[
https://issues.apache.org/jira/browse/HDDS-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969771#comment-16969771
]
Li Cheng commented on HDDS-2443:
--------------------------------
Prototyping with the S3 gateway + boto3 now. Reads, writes and deletes work.
Large object reads may need some tweaking.
The only concern is that when uploading files through the S3 interface, it shows a
read timeout against the Ozone endpoint:
ReadTimeoutError: Read timeout on endpoint URL:
"http://localhost:9878/ozone-test/./20191011/plc_1570784946653_2774"
> Python client/interface for Ozone
> ---------------------------------
>
> Key: HDDS-2443
> URL: https://issues.apache.org/jira/browse/HDDS-2443
> Project: Hadoop Distributed Data Store
> Issue Type: New Feature
> Components: Ozone Client
> Reporter: Li Cheng
> Priority: Major
>
> Original ideas:
> An Ozone client (Python) for data science notebooks such as Jupyter.
> # Size: Large
> # PyArrow: [https://pypi.org/project/pyarrow/]
> # Python -> libhdfs (the HDFS JNI library, which also covers S3, ...) -> Java client API.
> Impala uses libhdfs as well.
> # How Jupyter and IPython work:
> [https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html]
> # Ecosystem and architecture:
> [https://ipython-books.github.io/chapter-3-mastering-the-jupyter-notebook/]
>
> Path to try:
> 1. S3 interface: Ozone S3 gateway (already supported) + the AWS Python client
> (boto3)
> 2. Python native RPC
> 3. pyarrow + libhdfs, which uses the Java client under the hood (see the sketch
> below).
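> As an illustration of path 3, here is a minimal sketch using pyarrow's
> HadoopFileSystem (newer pyarrow versions), which drives libhdfs via JNI and
> therefore the Java client under the hood. It assumes the Ozone filesystem jar
> is on the Hadoop classpath and that fs.defaultFS in core-site.xml points at an
> o3fs:// bucket; the paths are placeholders.
> {code:python}
> import pyarrow.fs as pafs
>
> # libhdfs loads the Java client via JNI, so JAVA_HOME, HADOOP_HOME and
> # CLASSPATH (including the Ozone filesystem jar) must be set beforehand.
> ozone = pafs.HadoopFileSystem(host="default")  # resolve from core-site.xml
>
> # Write and read back a small object through the Hadoop-compatible interface.
> with ozone.open_output_stream("/20191011/demo.txt") as out:
>     out.write(b"hello ozone")
>
> with ozone.open_input_stream("/20191011/demo.txt") as src:
>     print(src.read())
> {code}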