Dear all,
I am using an FTP client to download some files dynamically. The files are CSV.
(This part is working fine.)
As the next step, I need to open the files and read their lines.

Could somebody help me with good practices for this approach?
I am using Java > Google Dataflow > Apache Beam 2.9.0.

PCollection<String> fileTransfers = pipeline
    .apply("Transfer FTP", ParDo.of(new DoFn<FtpInput, String>() {

        @ProcessElement
        public void processElement(ProcessContext c) {
            ArgsOptions opt = c.getPipelineOptions().as(ArgsOptions.class);

            FTPClient ftp = new FTPClient();
            ftp.connect(opt.getFtpHost());

            ByteArrayOutputStream download = new ByteArrayOutputStream();
            boolean result = ftp.retrieveFile(f.getName(), download);

            saveCSV(download); // save the CSV in Google Cloud Storage

            c.output("???"); // what should be emitted here?

            ...
        }
    }))
    .apply("Read File", TextIO.read().from("")) // This is not correct ...
    .apply("Read CSV LINES ", .....)
    .apply("Convert to AVRO".....)
    .apply("Save in AVRO", ...);
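
To make the "read lines" step concrete, here is a minimal sketch in plain Java (no Beam classes) of turning the downloaded bytes into lines. It assumes the CSV is UTF-8, fits in memory (as it already does in the ByteArrayOutputStream above), and has no quoted fields containing line breaks. The class name CsvLines and the method readLines are hypothetical, not part of Beam or Commons Net.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
public class CsvLines {

    // Splits the downloaded bytes into lines and skips blank ones.
    // Assumption: UTF-8 content small enough to hold in memory.
    public static List<String> readLines(ByteArrayOutputStream download) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(download.toByteArray()), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes that ftp.retrieveFile(...) would write.
        ByteArrayOutputStream download = new ByteArrayOutputStream();
        download.write("id,name\n1,foo\n\n2,bar\n".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(download)) {
            System.out.println(line);
        }
    }
}
```

With something like this, processElement could call c.output(line) once per line, so the DoFn itself would produce the PCollection<String> of CSV lines and no separate "Read File" step would be needed. If the files should still be read back from storage, my understanding is that TextIO.readAll() accepts a PCollection<String> of file patterns, so the DoFn could emit the saved path instead, but I have not verified that this fits my case.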

The samples I found on the Internet all do it the easy way:
they start the pipeline with TextIO.read().from("hardcoded path").
But I can't find an example for my situation, where the file names are only known at runtime.
Has anyone already faced this challenge?

Thanks in advance
