Till Rohrmann created FLINK-2692:
------------------------------------

             Summary: Untangle CsvInputFormat into PojoTypeCsvInputFormat and TupleTypeCsvInputFormat
                 Key: FLINK-2692
                 URL: https://issues.apache.org/jira/browse/FLINK-2692
             Project: Flink
          Issue Type: Improvement
            Reporter: Till Rohrmann
            Priority: Minor


The {{CsvInputFormat}} can currently return values as either a {{Tuple}} or a 
{{Pojo}} type. As a consequence, the processing logic, which has to work for 
both types, is overly complex. For example, the {{CsvInputFormat}} contains 
fields which are only used when a {{Pojo}} is returned. Moreover, the pojo 
field information is constructed by calling setter methods which have to be 
called in a very specific order; otherwise they fail. E.g. one has to call 
{{setFieldTypes}} before calling {{setOrderOfPOJOFields}}, otherwise the number 
of fields might differ. Furthermore, some of the methods can only be called if 
the return type is a {{Pojo}} type, because they expect a {{PojoTypeInfo}} to 
be present.
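
To illustrate the ordering problem, here is a simplified sketch. It is not the 
actual Flink code: the class {{SetterConfiguredFormat}} and its checks are 
hypothetical, and only the two setter names are borrowed from 
{{CsvInputFormat}}.

```java
// Hypothetical stand-in for a setter-configured format; NOT the real CsvInputFormat.
public class SetterConfiguredFormat {
    private Class<?>[] fieldTypes;    // stays null until setFieldTypes is called
    private String[] pojoFieldOrder;

    public void setFieldTypes(Class<?>... types) {
        this.fieldTypes = types.clone();
    }

    // Only valid after setFieldTypes, because it checks against the field count.
    public void setOrderOfPOJOFields(String... names) {
        if (fieldTypes == null) {
            throw new IllegalStateException("setFieldTypes must be called first");
        }
        if (names.length != fieldTypes.length) {
            throw new IllegalArgumentException(
                "expected " + fieldTypes.length + " field names, got " + names.length);
        }
        this.pojoFieldOrder = names.clone();
    }

    public String[] getPojoFieldOrder() {
        return pojoFieldOrder.clone();
    }
}
```

In this sketch, calling {{setOrderOfPOJOFields}} on a fresh instance throws, 
while the same call after {{setFieldTypes}} succeeds. The caller has to know 
this implicit protocol, which is the kind of hidden dependency described above.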

I think the {{CsvInputFormat}} should be refactored to make the code easier to 
maintain. I propose to split it up into a {{PojoTypeCsvInputFormat}} and a 
{{TupleTypeCsvInputFormat}}, both of which take all the required information 
via their constructors instead of using the {{setFields}} and 
{{setOrderOfPOJOFields}} approach.
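
A rough sketch of what constructor-based configuration could look like is given 
below. Only the two class names come from this proposal; the wrapper class 
{{CsvFormatSketch}}, the constructor parameters, and the accessors are 
assumptions made for illustration, and parsing logic is omitted entirely.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposed split; signatures are assumptions.
public class CsvFormatSketch {

    // Tuple variant: needs only the field types, so it carries no pojo state.
    public static final class TupleTypeCsvInputFormat {
        private final List<Class<?>> fieldTypes;

        public TupleTypeCsvInputFormat(Class<?>... fieldTypes) {
            if (fieldTypes.length == 0) {
                throw new IllegalArgumentException("at least one field type is required");
            }
            this.fieldTypes = Collections.unmodifiableList(Arrays.asList(fieldTypes.clone()));
        }

        public int arity() {
            return fieldTypes.size();
        }
    }

    // Pojo variant: field order and field types arrive together and are
    // validated once, so no call-order protocol is left for the caller.
    public static final class PojoTypeCsvInputFormat {
        private final Class<?> pojoType;
        private final List<String> fieldOrder;
        private final List<Class<?>> fieldTypes;

        public PojoTypeCsvInputFormat(Class<?> pojoType, String[] fieldOrder, Class<?>[] fieldTypes) {
            if (fieldOrder.length != fieldTypes.length) {
                throw new IllegalArgumentException(
                    "got " + fieldOrder.length + " field names but " + fieldTypes.length + " field types");
            }
            this.pojoType = pojoType;
            this.fieldOrder = Collections.unmodifiableList(Arrays.asList(fieldOrder.clone()));
            this.fieldTypes = Collections.unmodifiableList(Arrays.asList(fieldTypes.clone()));
        }

        public Class<?> pojoType() {
            return pojoType;
        }

        public List<String> fieldOrder() {
            return fieldOrder;
        }
    }
}
```

With this shape, a mismatch between field names and field types is rejected at 
construction time, and the tuple variant no longer carries fields that only 
matter for pojos.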



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
