[ 
https://issues.apache.org/jira/browse/NIFI-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194054#comment-16194054
 ] 

Koji Kawamura commented on NIFI-4457:
-------------------------------------

[~mehrdad22] Thanks for the details on how you collect the data. So, is the 
'date' column value the timestamp at which your application makes its request 
(a call to the Twitter API)? If so, I still feel that using tweet_id as the 
maximum-value column is not safe:

{code}
# Added row num for description purpose
#1, 915459071205093382,10/4/2017 9:41:46 AM
#2, 915459072178163714,10/4/2017 9:41:01 AM
{code} 

Row #2 was requested and retrieved before #1, yet it has a higher tweet_id than #1.
If my understanding is correct, if QueryDatabaseTable runs at 9:41:10 AM, it 
fetches #2 and sets the max value to '915459072178163714'. After that, even when 
#1 arrives, it will never be fetched, because its tweet_id is lower than the 
stored max value.
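
To make the scenario concrete, here is a minimal Python sketch of it. It is not NiFi code: incremental_fetch is a hypothetical stand-in for a max-value-column fetch, and it assumes the 'date' value is the moment each row becomes visible in the table.

```python
from datetime import datetime

# Rows from the example above: (tweet_id, time the row became visible).
# Assumption: the 'date' value is when the row shows up in the table.
ROWS = [
    (915459071205093382, datetime(2017, 10, 4, 9, 41, 46)),  # #1
    (915459072178163714, datetime(2017, 10, 4, 9, 41, 1)),   # #2
]


def incremental_fetch(rows, max_seen, now):
    """Hypothetical stand-in for one max-value-column fetch (not NiFi code).

    Returns the rows visible at `now` whose tweet_id is greater than
    `max_seen`, together with the updated maximum.
    """
    visible = [r for r in rows if r[1] <= now]
    fetched = [r for r in visible if max_seen is None or r[0] > max_seen]
    new_max = max([r[0] for r in fetched], default=max_seen)
    return fetched, new_max


# First run at 9:41:10 AM: only #2 is visible, so it becomes the max value.
first, max_seen = incremental_fetch(ROWS, None, datetime(2017, 10, 4, 9, 41, 10))

# Second run at 9:42:00 AM: #1 is now visible, but its tweet_id is lower
# than the stored maximum, so it is skipped.
second, max_seen = incremental_fetch(ROWS, max_seen, datetime(2017, 10, 4, 9, 42, 0))

print([r[0] for r in first])
print([r[0] for r in second])
```

The second run returns nothing, and the stored maximum stays at the tweet_id of #2.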

In that case, I would change the 'date' column to hold the database's current 
timestamp at insert time, then use the 'date' column as the maximum-value 
column (assuming the tweet sample rate is low enough for a single database to 
handle).
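
A rough sketch of that idea, using Python's sqlite3 module purely for illustration. The table layout, the inserted_at column name, and the fetch_new helper are all hypothetical, and the timestamps are passed in explicitly as a stand-in for the database clock so that the example is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: inserted_at stands in for a database-assigned timestamp.
conn.execute(
    "CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, inserted_at TEXT NOT NULL)"
)


def fetch_new(conn, last_seen):
    """Fetch rows inserted after last_seen and return them with the new maximum."""
    rows = conn.execute(
        "SELECT tweet_id, inserted_at FROM tweets "
        "WHERE inserted_at > ? ORDER BY inserted_at",
        (last_seen,),
    ).fetchall()
    new_last = rows[-1][1] if rows else last_seen
    return rows, new_last


# #2 (higher tweet_id) is inserted first and picked up by the first fetch.
conn.execute(
    "INSERT INTO tweets VALUES (?, ?)",
    (915459072178163714, "2017-10-04 09:41:01"),
)
first, last_seen = fetch_new(conn, "")

# #1 (lower tweet_id) is inserted later; it is still picked up, because the
# comparison is on insertion time and not on tweet_id.
conn.execute(
    "INSERT INTO tweets VALUES (?, ?)",
    (915459071205093382, "2017-10-04 09:41:46"),
)
second, last_seen = fetch_new(conn, last_seen)

print(first)
print(second)
```

One caveat with this sketch: with a strict "greater than" comparison, rows that share the exact same timestamp as the stored maximum but arrive after the fetch would be missed, so the timestamp resolution needs to be fine enough for the insert rate.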

However, if you can confirm that there are records with a tweet_id greater than 
the initial.maxvalue in the source table, but QueryDatabaseTable still doesn't 
increase the maximum value, then that would be an issue in NiFi.

> "Maximum-value" not increasing when "initial.maxvalue" is set and 
> "Maximum-value column" name is different from "id" 
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4457
>                 URL: https://issues.apache.org/jira/browse/NIFI-4457
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.3.0
>         Environment: windows 10
>            Reporter: meh
>         Attachments: Picture1.png, Picture2.png, subquery.csv
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When the "Maximum-value column" name is "id" there is no problem: when I add 
> the "initial.maxvalue.id" property to the "QueryDatabaseTable" processor, it 
> works well and the max value increases on every run.
> !Picture1.png|thumbnail!
> but...
> when the "Maximum-value column" name is different from "id" (such as 
> "tweet_id"), after the processor's initial run only the given 
> "initial.maxvalue.id" is saved, and that same value is repeated on every 
> run.
> !Picture2.png|thumbnail!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
