Hi Vishal,

The "file" field helps the operator understand which FileSystem it is working with; see the getFSInstance() method. The Splitter can work with any filesystem supported by Hadoop. In your case, since you have a different operator to figure out the input file(s), you can provide any known path from your input source, so that the Splitter is initialized to work with your filesystem.
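To illustrate why any known path from the same source is enough: Hadoop picks the FileSystem implementation from the path's URI scheme (and authority), not from the individual file. Below is a minimal, self-contained sketch of that idea; `FsSchemeDemo` and `schemeOf` are hypothetical names for illustration only and are not part of Malhar or Hadoop.

```java
import java.net.URI;

public class FsSchemeDemo {
    // Hypothetical helper: extracts the scheme that a FileSystem lookup
    // (e.g. Hadoop's FileSystem.get(uri, conf)) would key on. This is why
    // any known path on the target filesystem is sufficient to initialize
    // the Splitter correctly.
    static String schemeOf(String path) {
        URI uri = URI.create(path);
        // A null scheme means the configured default filesystem
        // (fs.defaultFS) would be used.
        return uri.getScheme() == null ? "default" : uri.getScheme();
    }

    public static void main(String[] args) {
        System.out.println(schemeOf("hdfs://namenode:8020/data/input")); // hdfs
        System.out.println(schemeOf("file:/tmp/input"));                 // file
        System.out.println(schemeOf("/tmp/input"));                      // default
    }
}
```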
-Priyanka

On Thu, Dec 1, 2016 at 11:47 AM, Vishal Agrawal <[email protected]> wrote:
> Hi,
>
> I am planning to use the below DAG configuration:
>
> public void populateDAG(DAG dag, Configuration configuration) {
>     DagInput input = dag.addOperator("Input", new DagInput());
>     FileSplitterBase splitter = dag.addOperator("Splitter", new FileSplitterBase());
>     FSSliceReader blockReader = dag.addOperator("BlockReader", new FSSliceReader());
>     dag.addStream("file-info", input.output, splitter.input);
>     dag.addStream("block-metadata", splitter.blocksMetadataOutput, blockReader.blocksMetadataInput);
>     ...
> }
>
> Here DagInput will look up the source file paths and pass them to the
> FileSplitterBase operator in a FileInfo object.
>
> Now, as the Splitter already has the absolute path of the source file in the
> FileInfo object, I didn't understand the significance of the
> com.datatorrent.lib.io.fs.FileSplitterBase.file field.
>
> Thanks,
>
> Vishal
