Hi Chenghao,

How are you formatting your path? For files on HDFS, your file path should be 
something like "hdfs://somehost:8080/downloads/some_dir".
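
If it helps, here is a minimal Python sketch for sanity-checking that a path has that shape before you paste it into a transform. It is only an illustration and not part of Hop: the function name check_hdfs_url is made up, and it assumes you want the scheme, the host and an absolute path all spelled out as in the example above.

```python
from urllib.parse import urlparse


def check_hdfs_url(path):
    """Return a list of problems with an HDFS URL (empty list: looks well-formed).

    Hypothetical helper for illustration only; it checks the shape of the
    string and does not contact the cluster.
    """
    parts = urlparse(path)
    problems = []
    if parts.scheme != "hdfs":
        problems.append("scheme is %r, expected 'hdfs'" % parts.scheme)
    if not parts.hostname:
        # Assumption: the host should be named explicitly in the URL.
        problems.append("missing NameNode host")
    if not parts.path.startswith("/"):
        problems.append("missing absolute path after the host")
    return problems


# The example path from above passes; a bare path does not.
print(check_hdfs_url("hdfs://somehost:8080/downloads/some_dir"))
print(check_hdfs_url("/hop/player.csv"))
```

A bare path such as /hop/player.csv is reported as missing both the scheme and the host, which would be worth ruling out first.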


Cheers,
Hans
On 19 Oct 2023 at 11:00 +0200, Che Chenghao <[email protected]>, wrote:
> Hi Bart,
>
> Thanks for your reply!
>
> I tried to get filenames from HDFS and write them to the log in a pipeline. 
> Here is the log:
>
> [root@host hop]# /hop-run.sh -f ./config/projects/default readfromhdfs.hpl 
> --project default
> 2023/10/19 16:23:43 - HopRun - Enabling project 'default'
> 2023/10/19 16:23:43 - HopRun - Starting pipeline: ./config/projects/default/ 
> readfromhdfs.hpl
> 2023/10/19 16:23:43 - readfromhdfs - Executing this pipeline using the Local 
> Pipeline Engine with run configuration 'local'
> 2023/10/19 16:23:43 - readfromhdfs - Execution started for pipeline 
> [readfromhdfs]
> 2023/10/19 16:23:43 - Get file names.0 - ERROR: No files found!.
> 2023/10/19 16:23:43 - readfromhdfs - Pipeline duration: 0.243 seconds [ 
> 0.243" ]
> HopRun exit.
> 2023/10/19 16:23:43 - readfromhdfs - Execution finished on a local pipeline 
> engine with run configuration 'local'
> [root@host hop]# hdfs dfs -ls /hop
> Found 1 items
> -rw-r--r--  1 root supergroup 1754 2023-10-18 16:52 /hop/player.csv
>
> By the way, I wrote some Python code on another machine to test the HDFS 
> path and the network, and it works well.
>
> Regards,
> Chenghao
>
>
> > On Oct 19, 2023, at 14:29, Bart Maertens <[email protected]> wrote:
> >
> > Hi Chenghao,
> >
> > Apache Hop supports HDFS over VFS[1]. However, there may be some system 
> > configuration that is required.
> > Are you able to view and access your HDFS files with CLI tools, e.g. `hdfs 
> > dfs -ls <YOUR_PATH>`?
> > Are there any logs in the Hop GUI output (console or your pipeline) when 
> > you try to access files on HDFS?
> >
> > [1] 
> > https://hop.apache.org//manual/latest/vfs.html#_apache_hop_vfs_file_systems
> >
> > Regards,
> > Bart
> >
> > > On Thu, Oct 19, 2023 at 8:25 AM 车 成皓 <[email protected]> wrote:
> > > > Hello guys,
> > > >
> > > > I want to read a CSV file from an HDFS cluster. However, when I added 
> > > > the HDFS file path in the Text File Input transform and clicked Show File 
> > > > Contents, Hop GUI said that it could not find a valid file.
> > > >
> > > > Is there anything that should be done before I use HDFS as a file 
> > > > system in Hop?
> > > >
> > > > By the way, I tried other VFS schemes as well, such as zip, and they 
> > > > work well. It seems that “hdfs://” is an unknown scheme.
> > > >
> > > > I’m looking forward to hearing from you.
> > > >
> > > > Thanks in advance,
> > > > Chenghao
