Hi Bart,

Thanks for your reply!

I tried to get the file names from HDFS and write them to the log in a
pipeline. Here is the log:


[root@host hop]# /hop-run.sh -f ./config/projects/default readfromhdfs.hpl --project default

2023/10/19 16:23:43 - HopRun - Enabling project 'default'

2023/10/19 16:23:43 - HopRun - Starting pipeline: ./config/projects/default/readfromhdfs.hpl

2023/10/19 16:23:43 - readfromhdfs - Executing this pipeline using the Local Pipeline Engine with run configuration 'local'

2023/10/19 16:23:43 - readfromhdfs - Execution started for pipeline [readfromhdfs]

2023/10/19 16:23:43 - Get file names.0 - ERROR: No files found!.

2023/10/19 16:23:43 - readfromhdfs - Pipeline duration: 0.243 seconds [ 0.243" ]

HopRun exit.

2023/10/19 16:23:43 - readfromhdfs - Execution finished on a local pipeline engine with run configuration 'local'

[root@host hop]# hdfs dfs -ls /hop

Found 1 items

-rw-r--r--   1 root supergroup 1754 2023-10-18 16:52 /hop/player.csv


By the way, I wrote some Python code on the other machine to test the HDFS path
and network, and it worked well.
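
For reference, a network check of this kind can be as small as the sketch
below. This is not the script that was actually run; it is a minimal
illustration that only tests whether a TCP connection to the NameNode can be
opened. The function name `namenode_reachable`, the host
`namenode.example.com`, and port 8020 are placeholder assumptions and must be
replaced with the values of the real cluster.

```python
import socket


def namenode_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only proves network reachability. It does not speak the HDFS
    protocol and does not check that a given path exists.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call (placeholder host and port, both assumptions):
#   namenode_reachable("namenode.example.com", 8020)
```

A successful result here only rules out a network or firewall problem between
the two machines. It says nothing about whether Hop itself can resolve the
`hdfs://` scheme.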


Regards,

Chenghao


On Oct 19, 2023, at 14:29, Bart Maertens <[email protected]> wrote:


Hi Chenghao,

Apache Hop supports HDFS over VFS[1]. However, there may be some system 
configuration that is required.
Are you able to view and access your HDFS files with CLI tools, e.g. `hdfs dfs
-ls <YOUR_PATH>`?
Are there any logs in the Hop Gui output (console or your pipeline) when you
try to access files on HDFS?

[1] https://hop.apache.org//manual/latest/vfs.html#_apache_hop_vfs_file_systems

Regards,
Bart

On Thu, Oct 19, 2023 at 8:25 AM 车 成皓 <[email protected]> wrote:
Hello guys,

I want to read a CSV file from an HDFS cluster. However, when I add the HDFS
file path in the Text File Input transform and click Show File Contents, Hop
GUI says that it cannot find a valid file.

Is there anything that needs to be done before I can use HDFS as a file system in Hop?

By the way, I tried other VFS schemes as well, such as zip, and they work well.
It seems that “hdfs://” is an unknown scheme.

I’m looking forward to hearing from you.

Thanks in advance,
Chenghao
