[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15777865#comment-15777865
 ] 

Hitesh Kapoor commented on APEXMALHAR-2368:
-------------------------------------------

The root cause of this issue was that the variable 'lastEmittedRows' was shared 
between threads and access to it was not properly synchronised.
An additional issue was discovered: when the table was large, the operator 
spent all of its time in garbage collection (GC), so it could not send the 
heartbeat and therefore failed.
To fix the GC problem, we had to restrict the size of the ResultSet 
explicitly and close the ResultSet as soon as it was no longer needed.
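
For illustration only, here is a minimal sketch of the two ideas described 
above. It is not the actual patch: the class name PollSketch, the methods 
recordEmitted and pollBatch, and the batchSize parameter are all hypothetical, 
and AtomicInteger is just one possible way to make the shared counter 
thread-safe. Only standard java.sql and java.util.concurrent calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the operator's real code.
class PollSketch {
    // Counter shared between the polling thread and the emitting thread.
    // AtomicInteger makes each update atomic and visible to other threads;
    // a plain int field gives neither guarantee.
    private final AtomicInteger lastEmittedRows = new AtomicInteger(0);

    void recordEmitted(int rows) {
        lastEmittedRows.addAndGet(rows);
    }

    int getLastEmittedRows() {
        return lastEmittedRows.get();
    }

    // Reads at most batchSize rows for the given query and releases the
    // ResultSet and statement as soon as the batch has been consumed.
    int pollBatch(Connection conn, String query, int batchSize) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(query)) {
            ps.setMaxRows(batchSize);   // hard cap on rows returned
            ps.setFetchSize(batchSize); // hint to the driver for rows per round trip
            int rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows++; // a real operator would convert and queue the row here
                }
            } // ResultSet is closed here, so its buffers can be collected
            recordEmitted(rows);
            return rows;
        }
    }
}
```

Capping the rows per query keeps the heap footprint of each poll bounded, and 
try-with-resources ensures the ResultSet is closed even if reading fails. 
Whether setFetchSize is honoured depends on the JDBC driver.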



> JDBCPollInput operator reads extra records when 1.5M records are added to a 
> blank input table
> ---------------------------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2368
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2368
>             Project: Apache Apex Malhar
>          Issue Type: Bug
>            Reporter: Hitesh Kapoor
>            Assignee: Hitesh Kapoor
>
> JDBCPollInput operator reads extra records when 1.5 million new records are 
> added to the input table.
> Operator information:
> Operator location: malhar-library
> Available since: 3.5.0
> Operator state: Evolving
> Operator: 
> com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator
> com.datatorrent.lib.db.jdbc.JdbcPOJOPollInputOperator
> Observed only when >=1.5 million records are inserted into the table.
> Not observed with 1 million records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
