Hi Chenghao,

How are you formatting your path? For files on HDFS, your file path should be something like "hdfs://somehost:8080/downloads/some_dir".
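
A quick way to sanity-check the shape of a path before pasting it into Hop is sketched below in Python. This is only an illustration: the helper name `check_hdfs_uri` is made up for this example and is not part of Hop or Hadoop, and it only checks that the string has the scheme://host:port/path form shown above. It does not tell you whether Hop can actually resolve the "hdfs" scheme or reach the cluster.

```python
from typing import List
from urllib.parse import urlsplit


def check_hdfs_uri(path: str) -> List[str]:
    """Return a list of problems with an HDFS-style URI (empty if none found).

    Hypothetical helper: it only compares the string against the
    hdfs://host:port/path shape and never contacts the cluster.
    """
    problems = []
    parts = urlsplit(path)
    if parts.scheme != "hdfs":
        problems.append(f"scheme is {parts.scheme!r}, expected 'hdfs'")
    if not parts.hostname:
        problems.append("missing host")
    if parts.port is None:
        problems.append("missing port")
    if not parts.path.startswith("/"):
        problems.append("missing absolute path after host:port")
    return problems


if __name__ == "__main__":
    # The first value is the example from this mail; the second is a bare
    # path as printed by `hdfs dfs -ls`, which lacks scheme, host and port.
    for candidate in ("hdfs://somehost:8080/downloads/some_dir", "/hop/player.csv"):
        print(candidate, "->", check_hdfs_uri(candidate) or "looks well-formed")
```

A bare path such as /hop/player.csv is reported as missing the scheme, host and port, which is the most common reason a path that works with the hdfs CLI does not work as a VFS URL.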

Cheers,
Hans

On 19 Oct 2023 at 11:00 +0200, Che Chenghao <[email protected]>, wrote:
> Hi Bart,
>
> Thanks for your reply!
>
> I tried to get filenames from hdfs and write them to log in a pipeline. Here is the log:
>
> [root@host hop]# /hop-run.sh -f ./config/projects/default readfromhdfs.hpl --project default
> 2023/10/19 16:23:43 - HopRun - Enabling project 'default'
> 2023/10/19 16:23:43 - HopRun - Starting pipeline: ./config/projects/default/readfromhdfs.hpl
> 2023/10/19 16:23:43 - readfromhdfs - Executing this pipeline using the Local Pipeline Engine with run configuration 'local'
> 2023/10/19 16:23:43 - readfromhdfs - Execution started for pipeline [readfromhdfs]
> 2023/10/19 16:23:43 - Get file names.0 - ERROR: No files found!.
> 2023/10/19 16:23:43 - readfromhdfs - Pipeline duration: 0.243 seconds [ 0.243" ]
> HopRun exit.
> 2023/10/19 16:23:43 - readfromhdfs - Execution finished on a local pipeline engine with run configuration 'local'
> [root@host hop]# hdfs dfs -ls /hop
> Found 1 items
> -rw-r--r-- 1 root supergroup 1754 2023-10-18 16:52 /hop/player.csv
>
> By the way, I wrote some python code on the other machine to test the HDFS path and network, and it works well.
>
> Regards,
> Chenghao
>
>
> On Oct 19, 2023, at 14:29, Bart Maertens <[email protected]> wrote:
> >
> > Hi Chenghao,
> >
> > Apache Hop supports HDFS over VFS[1]. However, there may be some system configuration that is required.
> > Are you able to view and access your hdfs files over CLI tools e.g. `hdfs dfs ls <YOUR_PATH>`?
> > Are there any logs in the Hop Gui output (console or your pipeline) when you try to access files of HDFS?
> >
> > [1] https://hop.apache.org//manual/latest/vfs.html#_apache_hop_vfs_file_systems
> >
> > Regards,
> > Bart
> >
> > > On Thu, Oct 19, 2023 at 8:25 AM 车 成皓 <[email protected]> wrote:
> > > > Hello guys,
> > > >
> > > > I want to read a CSV file from an HDFS cluster. However, when I added the HDFS file path in the Text File Input transform and click Show File Contents, it says that hop GUI cannot find a valid file.
> > > >
> > > > Is there anything should have been done before I use HDFS as file system in HOP?
> > > >
> > > > By the way, I tried other VFS as well, like zip. It works well. It seems that “hdfs://” is an unknown scheme.
> > > >
> > > > I’m looking forward to hearing from you.
> > > >
> > > > Thanks in advance,
> > > > Chenghao
