[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saumya Mohan updated APEXMALHAR-2034:
-------------------------------------
    Description: 
Issue:
Kryo does not serialize Avro objects, so an Avro GenericRecord is not 
available to downstream operators unless users explicitly set the stream 
locality to CONTAINER_LOCAL or THREAD_LOCAL. 

Solution:
This JIRA creates a Module on top of the AvroFileInputOperator and AvroToPojo 
operators so that downstream operators receive a POJO instead of an Avro 
GenericRecord. The GenericRecord is therefore no longer exposed to downstream 
operators; only the created POJO is.

In this Module, the stream between the two encapsulated operators 
(AvroFileInputOperator and AvroToPojo) is set to CONTAINER_LOCAL.
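
The encapsulation described above can be modelled with a small toy sketch. It 
deliberately does not use the Apex API: Locality, StreamSpec and 
AvroToPojoModuleSketch are hypothetical stand-in types, R plays the role of 
GenericRecord and P the role of the POJO. It only illustrates the idea that 
the record-carrying stream stays inside the module and is CONTAINER_LOCAL, 
while downstream code sees POJOs only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Stand-in for the stream localities named in the issue (not the Apex enum).
enum Locality { NONE, CONTAINER_LOCAL, THREAD_LOCAL }

// Toy description of one stream: the operators it joins and its locality.
final class StreamSpec {
  final String from;
  final String to;
  final Locality locality;

  StreamSpec(String from, String to, Locality locality) {
    this.from = from;
    this.to = to;
    this.locality = locality;
  }
}

// Toy module: R plays the role of GenericRecord, P the role of the POJO.
final class AvroToPojoModuleSketch<R, P> {
  private final List<StreamSpec> internalStreams = new ArrayList<>();
  private final Function<R, P> toPojo;

  AvroToPojoModuleSketch(Function<R, P> toPojo) {
    this.toPojo = toPojo;
    // The only stream that carries R stays inside the module and is
    // container-local, so R never has to cross a container boundary.
    internalStreams.add(new StreamSpec(
        "AvroFileInputOperator", "AvroToPojo", Locality.CONTAINER_LOCAL));
  }

  // The module's only output: converted POJOs, never the raw records.
  List<P> process(List<R> records) {
    List<P> out = new ArrayList<>();
    for (R record : records) {
      out.add(toPojo.apply(record));
    }
    return out;
  }

  // Read-only view of the streams wired inside the module.
  List<StreamSpec> internalStreams() {
    return Collections.unmodifiableList(internalStreams);
  }
}
```

For example, new AvroToPojoModuleSketch<String, Integer>(Integer::parseInt) 
hands back Integers from process(...) and reports a single internal stream 
whose locality is CONTAINER_LOCAL.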

Along with this new module, the existing Avro support files are moved from the 
contrib module to a new 'avro' module.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
*Unit Test*
A unit test for this Avro module has been added to the malhar-avro package.

*Move to new package and Backward compatibility*
- This module is part of a new package, 'malhar-avro', and all operator files 
and tests are moved from the contrib package to the new package. The old 
operator files are marked deprecated and now extend the new operator files for 
backward compatibility.
- Creating a new maven module for Avro is in accordance with the JIRA 
https://issues.apache.org/jira/browse/APEXMALHAR-1843.
- Git history of all the moved files is maintained
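
The backward-compatibility arrangement in the first bullet can be sketched in 
plain Java. This is a minimal illustration, not the actual Malhar source: 
NewAvroToPojo, OldAvroToPojo and the convert method are hypothetical stand-ins 
for an operator in the new malhar-avro package and its deprecated counterpart 
left behind in contrib.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an operator that now lives in the new package.
class NewAvroToPojo {
  // Illustrative conversion: copies a record's fields into a new map.
  Map<String, Object> convert(Map<String, Object> record) {
    return new LinkedHashMap<>(record);
  }
}

// Hypothetical stand-in for the old contrib class: deprecated, with no logic
// of its own. It inherits everything from the new class.
@Deprecated
class OldAvroToPojo extends NewAvroToPojo {
}

class BackwardCompatDemo {
  public static void main(String[] args) {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("id", 1);
    // Code that still instantiates the old class keeps compiling (with a
    // deprecation warning) and gets the new behavior through inheritance.
    NewAvroToPojo op = new OldAvroToPojo();
    System.out.println(op.convert(record));
  }
}
```

Keeping the old class as an empty subclass means existing applications that 
reference it keep working, while the compiler's deprecation warning points 
users toward the new package.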

*Application Level Testing*
- To test the module, I created a sample StreamingApplication and a POJO class. 
This application adds the new AvroToPojoModule and ConsoleOperator to the DAG. 
ConsoleOperator received and displayed the POJOs from the module.

- To test backward compatibility, I created a sample application that adds 
AvroFileInputOperator and AvroToPojo from the old package to the DAG, along 
with ConsoleOperator. ConsoleOperator received and displayed the POJOs from 
the old operators.

  was:
Issue:
Avro objects are not serialized by Kryo causing the Avro GenericRecord to not 
be available to downstream operators if users don't explicitly mark the stream 
locality at container_local or thread_local. 

Solution:
This JIRA is used to create a Module on top of AvroFileInputOperator and 
AvroToPojo operators such that downstream operators will access POJO instead of 
Avro GenericRecord.  It, therefore, removes the exposure of GenericRecord to 
downstream operators and instead exposes the created POJO to downstream 
operators.

In this Module, the stream between the two encapsulated operators 
(AvroFileInputOperator and AvroToPojo) is set to CONTAINER_LOCAL.

Along with this new module, existing avro support files are moved from contrib 
module to a new 'avro' module.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
*Unit Test*
Unit test for this Avro module has been added in malhar-avro package.

*Move to new package and Backward compatibility*
- Additionally, this module is part of a new package 'malhar-avro' and the 
operator files/tests are all moved from contrib package to the new package. Old 
operator files are made to extend from new operator files for backward 
compatibility.
- Creating a new maven module for Avro is in accordance with the JIRA 
https://issues.apache.org/jira/browse/APEXMALHAR-1843.
- Git history of all the moved files is maintained

*Application Level Testing*
- To test the module, I created a sample StreamingApplication and a POJO class. 
This application adds the new AvroToPojoModule, and ConsoleOperator to the DAG. 
ConsoleOperator received and displayed POJO from the module

- To test backward compatibility, I created sample application which adds 
AvroFileInputOperator and AvroToPojo from the old package to the DAG. It also 
adds ConsoleOperator to the DAG. ConsoleOperator received and displayed POJO 
from the module

> Avro File To POJO Module
> ------------------------
>
>                 Key: APEXMALHAR-2034
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2034
>             Project: Apache Apex Malhar
>          Issue Type: New Feature
>            Reporter: devendra tagare
>            Assignee: Saumya Mohan
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
