Hi Yik San,

Could you try `pd.read_csv('resources.zip/resources/crypt.csv', xxx)`?
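
The reason for the extra path segment, as far as I know: an archive passed with `-pyarch` is extracted in the Python UDF worker's working directory into a folder named after the archive file itself, so `resources/crypt.csv` ends up under `resources.zip/`. Here is a small stand-alone sketch that only imitates that layout with the standard library (no Flink involved). The helpers `simulate_pyarch` and `load_mapping` and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile
import zipfile


def load_mapping(csv_path):
    """Read a two-column, header-less CSV into a dict (column 0 -> column 1)."""
    with open(csv_path, newline="") as f:
        return {row[0]: row[1] for row in csv.reader(f) if len(row) >= 2}


def simulate_pyarch(work_dir, archive_path):
    """Imitate the assumed -pyarch behavior: extract the archive into a
    directory named after the archive file, inside the working directory."""
    target = os.path.join(work_dir, os.path.basename(archive_path))
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Build resources.zip from a resources/ directory, as in the question.
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "resources"))
    with open(os.path.join(src, "resources", "crypt.csv"), "w", newline="") as f:
        csv.writer(f).writerows([["a", "alpha"], ["b", "beta"]])  # made-up rows
    archive = os.path.join(tmp, "resources.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.write(os.path.join(src, "resources", "crypt.csv"), "resources/crypt.csv")

    # Hypothetical worker directory standing in for the UDF worker's cwd.
    work_dir = os.path.join(tmp, "worker")
    os.makedirs(work_dir)
    simulate_pyarch(work_dir, archive)

    # The path used in the original UDF versus the path suggested above.
    old_path = os.path.join(work_dir, "resources", "crypt.csv")
    new_path = os.path.join(work_dir, "resources.zip", "resources", "crypt.csv")
    old_exists = os.path.exists(old_path)
    new_exists = os.path.exists(new_path)
    mapping = load_mapping(new_path)
```

If you prefer a shorter path, to my knowledge you can also give a target directory name after `#`, e.g. `-pyarch resources.zip#data`, and then read `data/resources/crypt.csv`.
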
Regards,
Dian

> On April 27, 2021, at 4:39 PM, Yik San Chan <evan.chanyik...@gmail.com> wrote:
>
> Hi,
>
> My UDF depends on a resource file named crypt.csv that is located in the
> resources/ directory.
>
> ```python
> # udf_use_resource.py
> @udf(input_types=[DataTypes.STRING()], result_type=DataTypes.STRING())
> def decrypt(s):
>     import pandas as pd
>     d = pd.read_csv('resources/crypt.csv', header=None, index_col=0,
>                     squeeze=True).to_dict()
>     return d.get(s, "unknown")
> ```
>
> I can run the job in local mode (i.e., python udf_use_resource.py) without
> any problem. However, when I run it on my local cluster with
> `~/softwares/flink-1.12.0/bin/flink run -d -pyexec
> /usr/local/anaconda3/envs/featflow-ml-env/bin/python -pyarch resources.zip
> -py udf_use_resource.py`, it complains:
>
> FileNotFoundError: [Errno 2] File b'resources/crypt.csv' does not exist:
> b'resources/crypt.csv'
>
> The resources.zip is zipped from the resources directory. I wonder: where did
> I go wrong?
>
> Note: udf_use_resource.py and resources/crypt.csv can be found in
> https://github.com/YikSanChan/pyflink-quickstart/tree/36bfab4ff830f57d3f23f285c7c5499a03385b71
>
> Thanks!
>
> Best,
> Yik San