Mahieddine Cherif created NIFI-14614:
----------------------------------------
Summary: QueryRecord returns wrong results in some cases since 2.0
Key: NIFI-14614
URL: https://issues.apache.org/jira/browse/NIFI-14614
Project: Apache NiFi
Issue Type: Bug
Affects Versions: 2.4.0, 2.3.0, 2.2.0
Reporter: Mahieddine Cherif
Since NiFi 2.2.0 (at least), the QueryRecord returns wrongs results in some
cases.
*Case 1 (Union not working)*
As an input we have this JSON
{code:java}
[{
"employeeId": "1234",
"firstName": "XXXX",
"lastName": "YYYY",
"email": "[email protected]",
"company": "XYZ",
"businessUnit": "123",
"customReportField1": "1234",
"customReportField2": null,
"customReportField3": null,
"customReportField4": null,
"customReportField5": null,
"customReportField6": null,
"customReportField7": null,
"elearningStartDate": "2024-01-10",
"elearningEndDate": "2025-01-31",
"lastLoginDate": "2024-12-12T11:00:00",
"totalTrainingTime": 182837,
"totalActivitiesCompleted": 489,
"totalWritingLessonsCompleted": 0,
"totalConversationClassesAttended": 3,
"totalIndividualLessonsAssigned": 0,
"totalIndividualLessonsAttended": 0,
"totalIndividualLessonsAbsent": 0,
"totalIndividualLessonsLateCancellation": 0,
"totalIndividualLessonsRemainingScheduled": 0,
"totalIndividualLessonsRemainingUnscheduled": 0,
"arabicTrainingTime": 0,
"arabicActivitiesCompleted": 0,
"arabicLastOverallProficiencyLevel": null,
"arabicLastOverallProficiencyLevelDate": null,
"chineseTrainingTime": 0,
"chineseActivitiesCompleted": 0,
"chineseIndividualLessonsStartDate": null,
"chineseIndividualLessonsEndDate": null,
"chineseIndividualLessonsAssigned": 0,
"chineseIndividualLessonsAttended": 0,
"chineseIndividualLessonsAbsent": 0,
"chineseIndividualLessonsLateCancellation": 0,
"chineseIndividualLessonsRemainingScheduled": 0,
"chineseIndividualLessonsRemainingUnscheduled": 0,
"chineseLastOverallProficiencyLevel": null,
"chineseLastOverallProficiencyLevelDate": null,
"dutchTrainingTime": 0,
"dutchActivitiesCompleted": 0,
"dutchLastOverallProficiencyLevel": null,
"dutchLastOverallProficiencyLevelDate": null,
"englishTrainingTime": 114828,
"englishActivitiesCompleted": 289,
"englishWritingLessonsCompleted": 0,
"englishConversationClassesAttended": 3,
"englishIndividualLessonsStartDate": null,
"englishIndividualLessonsEndDate": null,
"englishIndividualLessonsLastAttendedLessonDate": null,
"englishIndividualLessonsAssigned": 0,
"englishIndividualLessonsAttended": 0,
"englishIndividualLessonsAbsent": 0,
"englishIndividualLessonsLateCancellation": 0,
"englishIndividualLessonsRemainingScheduled": 0,
"englishIndividualLessonsRemainingUnscheduled": 0,
"englishLastOverallProficiencyLevel": "A2.4",
"englishLastOverallProficiencyLevelDate": "2024-09-26T11:35:14.07",
"frenchTrainingTime": 0,
"frenchActivitiesCompleted": 0,
"frenchWritingLessonsCompleted": 0,
"frenchConversationClassesAttended": 0,
"frenchIndividualLessonsStartDate": null,
"frenchIndividualLessonsEndDate": null,
"frenchIndividualLessonsLastAttendedLessonDate": null,
"frenchIndividualLessonsAssigned": 0,
"frenchIndividualLessonsAttended": 0,
"frenchIndividualLessonsAbsent": 0,
"frenchIndividualLessonsLateCancellation": 0,
"frenchIndividualLessonsRemainingScheduled": 0,
"frenchIndividualLessonsRemainingUnscheduled": 0,
"frenchLastOverallProficiencyLevel": null,
"frenchLastOverallProficiencyLevelDate": null,
"germanTrainingTime": 10084,
"germanActivitiesCompleted": 32,
"germanConversationClassesAttended": 0,
"germanIndividualLessonsStartDate": null,
"germanIndividualLessonsEndDate": null,
"germanIndividualLessonsLastAttendedLessonDate": null,
"germanIndividualLessonsAssigned": 0,
"germanIndividualLessonsAttended": 0,
"germanIndividualLessonsAbsent": 0,
"germanIndividualLessonsLateCancellation": 0,
"germanIndividualLessonsRemainingScheduled": 0,
"germanIndividualLessonsRemainingUnscheduled": 0,
"germanLastOverallProficiencyLevel": null,
"germanLastOverallProficiencyLevelDate": null,
"italianTrainingTime": 55,
"italianActivitiesCompleted": 0,
"italianConversationClassesAttended": 0,
"italianIndividualLessonsStartDate": null,
"italianIndividualLessonsEndDate": null,
"italianIndividualLessonsLastAttendedLessonDate": null,
"italianIndividualLessonsAssigned": 0,
"italianIndividualLessonsAttended": 0,
"italianIndividualLessonsAbsent": 0,
"italianIndividualLessonsLateCancellation": 0,
"italianIndividualLessonsRemainingScheduled": 0,
"italianIndividualLessonsRemainingUnscheduled": 0,
"italianLastOverallProficiencyLevel": null,
"italianLastOverallProficiencyLevelDate": null,
"japaneseTrainingTime": 0,
"japaneseActivitiesCompleted": 0,
"japaneseLastOverallProficiencyLevel": null,
"japaneseLastOverallProficiencyLevelDate": null,
"koreanTrainingTime": 0,
"koreanActivitiesCompleted": 0,
"koreanLastOverallProficiencyLevel": null,
"koreanLastOverallProficiencyLevelDate": null,
"portugueseTrainingTime": 0,
"portugueseActivitiesCompleted": 0,
"portugueseConversationClassesAttended": 0,
"portugueseIndividualLessonsStartDate": null,
"portugueseIndividualLessonsEndDate": null,
"portugueseIndividualLessonsLastAttendedLessonDate": null,
"portugueseIndividualLessonsAssigned": 0,
"portugueseIndividualLessonsAttended": 0,
"portugueseIndividualLessonsAbsent": 0,
"portugueseIndividualLessonsLateCancellation": 0,
"portugueseIndividualLessonsRemainingScheduled": 0,
"portugueseIndividualLessonsRemainingUnscheduled": 0,
"portugueseLastOverallProficiencyLevel": null,
"portugueseLastOverallProficiencyLevelDate": null,
"russianTrainingTime": 0,
"russianActivitiesCompleted": 0,
"russianLastOverallProficiencyLevel": null,
"russianLastOverallProficiencyLevelDate": null,
"spanishTrainingTime": 69780,
"spanishActivitiesCompleted": 168,
"spanishConversationClassesAttended": 0,
"spanishIndividualLessonsStartDate": null,
"spanishIndividualLessonsEndDate": null,
"spanishIndividualLessonsLastAttendedLessonDate": null,
"spanishIndividualLessonsAssigned": 0,
"spanishIndividualLessonsAttended": 0,
"spanishIndividualLessonsAbsent": 0,
"spanishIndividualLessonsLateCancellation": 0,
"spanishIndividualLessonsRemainingScheduled": 0,
"spanishIndividualLessonsRemainingUnscheduled": 0,
"spanishLastOverallProficiencyLevel": "A2.3",
"spanishLastOverallProficiencyLevelDate": "2024-05-13T14:26:17.181"
}] {code}
To reproduce this first issue, one can then place a query record with the
following query
{code:java}
(SELECT employeeId, email, elearningStartDate, lastLoginDate,
'en' AS lovCode, 'incomplete' AS trackingStatus,
CAST(englishTrainingTime AS INTEGER) AS timeSpent,
englishLastOverallProficiencyLevel AS proficiencyLevel
FROM FLOWFILE
WHERE CAST(englishTrainingTime AS INTEGER))
UNION ALL
(SELECT employeeId, email, elearningStartDate, lastLoginDate,
'fr' AS lovCode, 'incomplete' AS trackingStatus,
CAST(frenchTrainingTime AS INTEGER) AS timeSpent,
frenchLastOverallProficiencyLevel AS proficiencyLevel
FROM FLOWFILE
WHERE CAST(frenchTrainingTime AS INTEGER) <> 0) {code}
As a reader you can put a jsonreader (with inferred schema) and json writer
with default settings
In the output I have is this
{code:java}
[ {
"employeeId" : "1234",
"email" : "[email protected]",
"elearningStartDate" : "2024-08-25",
"lastLoginDate" : "2024-12-31T06:00:00",
"lovCode" : "en",
"trackingStatus" : "incomplete",
"timeSpent" : 3874,
"proficiencyLevel" : null
}, {
"employeeId" : "1234",
"email" : "[email protected]",
"elearningStartDate" : "2024-08-25",
"lastLoginDate" : "2024-12-31T06:00:00",
"lovCode" : "fr",
"trackingStatus" : "incomplete",
"timeSpent" : 0,
"proficiencyLevel" : null
} ]{code}
instead of
{code:java}
[ {
"candidateLogin" : "1234",
"candidateEmail" : "[email protected]",
"registrationDate" : "2024-11-18",
"lastAccessDate" : "2024-11-18T10:00:00",
"lovCode" : "fr",
"trackingStatus" : "incomplete",
"timeSpent" : 406,
"score" : null
}, {
"candidateLogin" : "1234",
"candidateEmail" : "[email protected]",
"registrationDate" : "2024-03-22",
"lastAccessDate" : "2024-12-31T13:00:00",
"lovCode" : "en",
"trackingStatus" : "incomplete",
"timeSpent" : 114828,
"score" : "B2.1"
}]{code}
it's like the QueryRecord just takes the first record and copy it and displays
it a second time.
There are also some cases where the columns in output are not in the orders of
the select query.
Our queries didn't change between 1.28.0 and 2.4.0 but the output is simply
wrong in some cases now. We thought at the begining that the issue could be
related to the Apache Calcite version used but even with the latest one 1.39.0
used in nifi 2.4.0 the problem remains.
While running the same query on our own internal processor (Similar to the
query record in terms of purpose but not code and uses the same apache calcite
version 1.39.0) doesn't have the issue.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)