Yes, you should provide the input in this format: s3n://ukey:upass@bucketName/path

-Priyanka
On Thu, Dec 1, 2016 at 12:46 PM, Vishal Agrawal <[email protected]> wrote:

> Thank you Priyanka for the quick response.
>
> I need to use an S3 bucket as my source of data. So do I need to give my S3
> bucket path there?
>
> Thanks,
> Vishal
>
> On Thu, Dec 1, 2016 at 1:28 AM, Priyanka Gugale <[email protected]> wrote:
>
>> Hi Vishal,
>>
>> The "file" field helps the operator understand which FileSystem it's
>> working with. Check the "getFSInstance()" method. The splitter can work
>> with all filesystems supported by Hadoop.
>> In your case, since you have a separate operator to figure out the input
>> file(s), you can provide any one known path from your input source, so
>> that the splitter is initialized to work with your filesystem.
>>
>> -Priyanka
>>
>> On Thu, Dec 1, 2016 at 11:47 AM, Vishal Agrawal
>> <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am planning to use the DAG configuration below.
>>>
>>> public void populateDAG(DAG dag, Configuration configuration) {
>>>   DagInput input = dag.addOperator("Input", new DagInput());
>>>   FileSplitterBase splitter = dag.addOperator("Splitter", new FileSplitterBase());
>>>   FSSliceReader blockReader = dag.addOperator("BlockReader", new FSSliceReader());
>>>   dag.addStream("file-info", input.output, splitter.input);
>>>   dag.addStream("block-metadata", splitter.blocksMetadataOutput,
>>>       blockReader.blocksMetadataInput);
>>>   ...
>>> }
>>>
>>> Here DagInput will look up the source file paths and pass them to the
>>> FileSplitterBase operator in FileInfo objects.
>>>
>>> Now, since the splitter already has the absolute path of the source file
>>> in the FileInfo object, I didn't understand the significance of the
>>> com.datatorrent.lib.io.fs.FileSplitterBase.file field.
>>>
>>> Thanks,
>>> Vishal
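To tie the two answers together: since the splitter only uses the "file" property to select the right FileSystem implementation (via getFSInstance()), a known S3 path in the format above can be supplied as an operator property rather than hard-coded in populateDAG(). A minimal sketch of what that might look like in an Apex properties.xml, assuming the operator keeps the name "Splitter" from the DAG above; "ukey", "upass", and "bucketName/path" are placeholder credentials and paths, not real values:

```xml
<configuration>
  <!-- Point FileSplitterBase at the S3 filesystem so getFSInstance()
       resolves an S3-backed FileSystem. Credentials and bucket path
       below are illustrative placeholders. -->
  <property>
    <name>dt.operator.Splitter.prop.file</name>
    <value>s3n://ukey:upass@bucketName/path</value>
  </property>
</configuration>
```

The actual file names to split still arrive at runtime through the FileInfo objects emitted by the upstream operator; this property only bootstraps the filesystem instance.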
