[ https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355904#comment-14355904 ]

Robert Metzger edited comment on FLINK-1525 at 3/11/15 9:01 AM:
----------------------------------------------------------------

Hi,
great. We're always happy about new contributors.

The idea behind such a tool is to allow users to easily configure and 
parameterize their functions and code.
I think something like this would be really helpful:
{code}
// inArgs := "--input hdfs:///in --output hdfs:///out --readers 3 -DignoreTerm=abc -DfilterFactor=0.2"
public static void main(String[] inArgs) throws Exception {
        final ArgsUtil args = new ArgsUtil(inArgs);
        String input = args.getString("input", true); // true for required
        String output = args.getString("output", false, "file:///tmp"); // not required, with a default value
        int readers = args.getInteger("readers");
        Configuration extParams = args.getParameters();
        // extParams then contains extParams.getString("ignoreTerm") and the other -D arguments

        DataSet<Tuple2<String, Integer>> counts = text.flatMap(new Tokenizer())
                .withParameters(extParams)
                .map(new TermFilter(args));
}
{code}
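Just as a rough sketch of what the parsing inside such a utility could look like (the class name and exact method signatures here are invented for illustration, not an existing Flink API), it mainly needs to collect "--key value" pairs and "-Dkey=value" properties:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed utility: collects "--key value"
// pairs and "-Dkey=value" style properties from the argument array.
public class ArgsUtilSketch {
    private final Map<String, String> named = new HashMap<>();
    private final Map<String, String> dynamic = new HashMap<>();

    public ArgsUtilSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("-D")) {
                // -Dkey=value style parameter
                String[] kv = args[i].substring(2).split("=", 2);
                dynamic.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else if (args[i].startsWith("--")) {
                // --key value style parameter; consume the following token as the value
                String key = args[i].substring(2);
                String value = (i + 1 < args.length && !args[i + 1].startsWith("-"))
                        ? args[++i] : "";
                named.put(key, value);
            }
        }
    }

    public String getString(String key, boolean required) {
        String v = named.get(key);
        if (v == null && required) {
            throw new IllegalArgumentException("Missing required argument: --" + key);
        }
        return v;
    }

    public String getString(String key, boolean required, String defaultValue) {
        String v = getString(key, required);
        return v != null ? v : defaultValue;
    }

    public int getInteger(String key) {
        return Integer.parseInt(getString(key, true));
    }

    public Map<String, String> getParameters() {
        return dynamic;
    }

    public static void main(String[] unused) {
        String[] args = {"--input", "hdfs:///in", "--output", "hdfs:///out",
                "--readers", "3", "-DignoreTerm=abc", "-DfilterFactor=0.2"};
        ArgsUtilSketch u = new ArgsUtilSketch(args);
        System.out.println(u.getString("input", true));          // hdfs:///in
        System.out.println(u.getString("missing", false, "x"));  // x
        System.out.println(u.getInteger("readers"));             // 3
        System.out.println(u.getParameters().get("ignoreTerm")); // abc
    }
}
```

In the real version, getParameters() would probably return a Flink {{Configuration}} instead of a plain map, so it can be passed to {{withParameters()}} directly.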
I think the right location for this is the {{flink-contrib}} package.
Also, it's very important to write test cases for your code and to add some 
documentation... But I think that can follow after a first working prototype.

Let me know if you need more information on this.



> Provide utils to pass -D parameters to UDFs 
> --------------------------------------------
>
>                 Key: FLINK-1525
>                 URL: https://issues.apache.org/jira/browse/FLINK-1525
>             Project: Flink
>          Issue Type: Improvement
>          Components: flink-contrib
>            Reporter: Robert Metzger
>              Labels: starter
>
> Hadoop users are used to setting job configuration through "-D" on the 
> command line.
> Right now, Flink users have to manually parse command line arguments and pass 
> them to their methods.
> It would be nice to provide a standard args parser which takes care of 
> such stuff.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
