Joe Witt created NIFI-12027:
-------------------------------

             Summary: PutDatabaseRecord should invalidate schema cache entries
                 Key: NIFI-12027
                 URL: https://issues.apache.org/jira/browse/NIFI-12027
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Joe Witt


Observed on the NiFi main/2.0 line, and also confirmed by a user on the 1.15.x line.

If you have a flow of records, such as CSV records, feeding into 
PutDatabaseRecord and you add columns to the source data, things flow normally 
as long as the 'ignore new columns/fields' properties are set, since the new 
column values are simply ignored.  However, once you add those new columns to 
the database table you're writing to, the values are still not sent and end up 
as NULLs (if the database allows them).  If you then stop and start the 
PutDatabaseRecord processor, the values start getting set.

This is with the default table schema cache size of 100.  If you set that 
value to 0, it appears to work fine without restarting.  This suggests our 
caching default is likely too simplistic.  We should have some mechanism 
whereby schema changes in the database are detected and invalidate any schemas 
we have cached, or we should invalidate when we detect a difference in the 
incoming data's schema.  As it stands, the default behavior leaves the user, I 
think, having to choose between the default, which is likely faster, and a 
slower but more dynamic path.
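One possible shape for the second option (invalidating when the incoming schema differs) is sketched below. This is only an illustration with hypothetical names, not the processor's actual internals: a schema cache that evicts an entry whenever the incoming record schema carries a field the cached table schema does not know about, forcing a fresh read of the table metadata from the database.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: cache table schemas by table name, but drop a
// cached entry when the incoming record schema contains a column the
// cached schema lacks, so the next lookup re-reads from the database.
class TableSchemaCache {
    private final Map<String, Set<String>> cache = new ConcurrentHashMap<>();

    Set<String> get(String table, Set<String> incomingFields,
                    Function<String, Set<String>> fetchFromDb) {
        Set<String> cached = cache.get(table);
        if (cached != null && !cached.containsAll(incomingFields)) {
            // Incoming data has columns the cached schema doesn't know
            // about: invalidate so we pick up DDL changes on the table.
            cache.remove(table);
            cached = null;
        }
        if (cached == null) {
            cached = fetchFromDb.apply(table);  // re-read schema from the DB
            cache.put(table, cached);
        }
        return cached;
    }
}
```

The trade-off is that an incoming schema which is merely a superset of the real table schema would trigger a redundant refetch, but that is a one-time cost per schema change rather than a required processor restart.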

Perhaps the other question here is how valuable is that schema cache in the 
first place?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
