Mahieddine Cherif created NIFI-14614:
----------------------------------------

             Summary: QueryRecord returns wrong results in some cases since 2.0
                 Key: NIFI-14614
                 URL: https://issues.apache.org/jira/browse/NIFI-14614
             Project: Apache NiFi
          Issue Type: Bug
    Affects Versions: 2.4.0, 2.3.0, 2.2.0
            Reporter: Mahieddine Cherif


Since NiFi 2.2.0 (at least), the QueryRecord returns wrongs results in some 
cases.

*Case 1 (Union not working)* 
As an input we have this JSON
{code:java}
[{
    "employeeId": "1234",
    "firstName": "XXXX",
    "lastName": "YYYY",
    "email": "[email protected]",
    "company": "XYZ",
    "businessUnit": "123",
    "customReportField1": "1234",
    "customReportField2": null,
    "customReportField3": null,
    "customReportField4": null,
    "customReportField5": null,
    "customReportField6": null,
    "customReportField7": null,
    "elearningStartDate": "2024-01-10",
    "elearningEndDate": "2025-01-31",
    "lastLoginDate": "2024-12-12T11:00:00",
    "totalTrainingTime": 182837,
    "totalActivitiesCompleted": 489,
    "totalWritingLessonsCompleted": 0,
    "totalConversationClassesAttended": 3,
    "totalIndividualLessonsAssigned": 0,
    "totalIndividualLessonsAttended": 0,
    "totalIndividualLessonsAbsent": 0,
    "totalIndividualLessonsLateCancellation": 0,
    "totalIndividualLessonsRemainingScheduled": 0,
    "totalIndividualLessonsRemainingUnscheduled": 0,
    "arabicTrainingTime": 0,
    "arabicActivitiesCompleted": 0,
    "arabicLastOverallProficiencyLevel": null,
    "arabicLastOverallProficiencyLevelDate": null,
    "chineseTrainingTime": 0,
    "chineseActivitiesCompleted": 0,
    "chineseIndividualLessonsStartDate": null,
    "chineseIndividualLessonsEndDate": null,
    "chineseIndividualLessonsAssigned": 0,
    "chineseIndividualLessonsAttended": 0,
    "chineseIndividualLessonsAbsent": 0,
    "chineseIndividualLessonsLateCancellation": 0,
    "chineseIndividualLessonsRemainingScheduled": 0,
    "chineseIndividualLessonsRemainingUnscheduled": 0,
    "chineseLastOverallProficiencyLevel": null,
    "chineseLastOverallProficiencyLevelDate": null,
    "dutchTrainingTime": 0,
    "dutchActivitiesCompleted": 0,
    "dutchLastOverallProficiencyLevel": null,
    "dutchLastOverallProficiencyLevelDate": null,
    "englishTrainingTime": 114828,
    "englishActivitiesCompleted": 289,
    "englishWritingLessonsCompleted": 0,
    "englishConversationClassesAttended": 3,
    "englishIndividualLessonsStartDate": null,
    "englishIndividualLessonsEndDate": null,
    "englishIndividualLessonsLastAttendedLessonDate": null,
    "englishIndividualLessonsAssigned": 0,
    "englishIndividualLessonsAttended": 0,
    "englishIndividualLessonsAbsent": 0,
    "englishIndividualLessonsLateCancellation": 0,
    "englishIndividualLessonsRemainingScheduled": 0,
    "englishIndividualLessonsRemainingUnscheduled": 0,
    "englishLastOverallProficiencyLevel": "A2.4",
    "englishLastOverallProficiencyLevelDate": "2024-09-26T11:35:14.07",
    "frenchTrainingTime": 0,
    "frenchActivitiesCompleted": 0,
    "frenchWritingLessonsCompleted": 0,
    "frenchConversationClassesAttended": 0,
    "frenchIndividualLessonsStartDate": null,
    "frenchIndividualLessonsEndDate": null,
    "frenchIndividualLessonsLastAttendedLessonDate": null,
    "frenchIndividualLessonsAssigned": 0,
    "frenchIndividualLessonsAttended": 0,
    "frenchIndividualLessonsAbsent": 0,
    "frenchIndividualLessonsLateCancellation": 0,
    "frenchIndividualLessonsRemainingScheduled": 0,
    "frenchIndividualLessonsRemainingUnscheduled": 0,
    "frenchLastOverallProficiencyLevel": null,
    "frenchLastOverallProficiencyLevelDate": null,
    "germanTrainingTime": 10084,
    "germanActivitiesCompleted": 32,
    "germanConversationClassesAttended": 0,
    "germanIndividualLessonsStartDate": null,
    "germanIndividualLessonsEndDate": null,
    "germanIndividualLessonsLastAttendedLessonDate": null,
    "germanIndividualLessonsAssigned": 0,
    "germanIndividualLessonsAttended": 0,
    "germanIndividualLessonsAbsent": 0,
    "germanIndividualLessonsLateCancellation": 0,
    "germanIndividualLessonsRemainingScheduled": 0,
    "germanIndividualLessonsRemainingUnscheduled": 0,
    "germanLastOverallProficiencyLevel": null,
    "germanLastOverallProficiencyLevelDate": null,
    "italianTrainingTime": 55,
    "italianActivitiesCompleted": 0,
    "italianConversationClassesAttended": 0,
    "italianIndividualLessonsStartDate": null,
    "italianIndividualLessonsEndDate": null,
    "italianIndividualLessonsLastAttendedLessonDate": null,
    "italianIndividualLessonsAssigned": 0,
    "italianIndividualLessonsAttended": 0,
    "italianIndividualLessonsAbsent": 0,
    "italianIndividualLessonsLateCancellation": 0,
    "italianIndividualLessonsRemainingScheduled": 0,
    "italianIndividualLessonsRemainingUnscheduled": 0,
    "italianLastOverallProficiencyLevel": null,
    "italianLastOverallProficiencyLevelDate": null,
    "japaneseTrainingTime": 0,
    "japaneseActivitiesCompleted": 0,
    "japaneseLastOverallProficiencyLevel": null,
    "japaneseLastOverallProficiencyLevelDate": null,
    "koreanTrainingTime": 0,
    "koreanActivitiesCompleted": 0,
    "koreanLastOverallProficiencyLevel": null,
    "koreanLastOverallProficiencyLevelDate": null,
    "portugueseTrainingTime": 0,
    "portugueseActivitiesCompleted": 0,
    "portugueseConversationClassesAttended": 0,
    "portugueseIndividualLessonsStartDate": null,
    "portugueseIndividualLessonsEndDate": null,
    "portugueseIndividualLessonsLastAttendedLessonDate": null,
    "portugueseIndividualLessonsAssigned": 0,
    "portugueseIndividualLessonsAttended": 0,
    "portugueseIndividualLessonsAbsent": 0,
    "portugueseIndividualLessonsLateCancellation": 0,
    "portugueseIndividualLessonsRemainingScheduled": 0,
    "portugueseIndividualLessonsRemainingUnscheduled": 0,
    "portugueseLastOverallProficiencyLevel": null,
    "portugueseLastOverallProficiencyLevelDate": null,
    "russianTrainingTime": 0,
    "russianActivitiesCompleted": 0,
    "russianLastOverallProficiencyLevel": null,
    "russianLastOverallProficiencyLevelDate": null,
    "spanishTrainingTime": 69780,
    "spanishActivitiesCompleted": 168,
    "spanishConversationClassesAttended": 0,
    "spanishIndividualLessonsStartDate": null,
    "spanishIndividualLessonsEndDate": null,
    "spanishIndividualLessonsLastAttendedLessonDate": null,
    "spanishIndividualLessonsAssigned": 0,
    "spanishIndividualLessonsAttended": 0,
    "spanishIndividualLessonsAbsent": 0,
    "spanishIndividualLessonsLateCancellation": 0,
    "spanishIndividualLessonsRemainingScheduled": 0,
    "spanishIndividualLessonsRemainingUnscheduled": 0,
    "spanishLastOverallProficiencyLevel": "A2.3",
    "spanishLastOverallProficiencyLevelDate": "2024-05-13T14:26:17.181"
}] {code}
To reproduce this first issue, one can then place a query record with the 
following query
{code:java}
(SELECT employeeId, email, elearningStartDate, lastLoginDate,
       'en' AS lovCode, 'incomplete' AS trackingStatus,
       CAST(englishTrainingTime AS INTEGER) AS timeSpent,
       englishLastOverallProficiencyLevel AS proficiencyLevel
 FROM FLOWFILE
 WHERE CAST(englishTrainingTime AS INTEGER))
UNION ALL
(SELECT employeeId, email, elearningStartDate, lastLoginDate,
       'fr' AS lovCode, 'incomplete' AS trackingStatus,
       CAST(frenchTrainingTime AS INTEGER) AS timeSpent,
       frenchLastOverallProficiencyLevel AS proficiencyLevel
 FROM FLOWFILE
 WHERE CAST(frenchTrainingTime AS INTEGER) <> 0) {code}
As a reader you can  put a jsonreader (with inferred schema) and json writer 
with default settings

In the output I have is this
{code:java}
[ {
  "employeeId" : "1234",
  "email" : "[email protected]",
  "elearningStartDate" : "2024-08-25",
  "lastLoginDate" : "2024-12-31T06:00:00",
  "lovCode" : "en",
  "trackingStatus" : "incomplete",
  "timeSpent" : 3874,
  "proficiencyLevel" : null
}, {
  "employeeId" : "1234",
  "email" : "[email protected]",
  "elearningStartDate" : "2024-08-25",
  "lastLoginDate" : "2024-12-31T06:00:00",
  "lovCode" : "fr",
  "trackingStatus" : "incomplete",
  "timeSpent" : 0,
  "proficiencyLevel" : null
} ]{code}
instead of
{code:java}
[ {
  "candidateLogin" : "1234",
  "candidateEmail" : "[email protected]",
  "registrationDate" : "2024-11-18",
  "lastAccessDate" : "2024-11-18T10:00:00",
  "lovCode" : "fr",
  "trackingStatus" : "incomplete",
  "timeSpent" : 406,
  "score" : null
}, {
  "candidateLogin" : "1234",
  "candidateEmail" : "[email protected]",
  "registrationDate" : "2024-03-22",
  "lastAccessDate" : "2024-12-31T13:00:00",
  "lovCode" : "en",
  "trackingStatus" : "incomplete",
  "timeSpent" : 114828,
  "score" : "B2.1"
}]{code}
it's like the QueryRecord just takes the first record and copy it and displays 
it a second time. 

There are also some cases where the columns in output are not in the orders of 
the select query. 

Our queries didn't change between 1.28.0 and 2.4.0 but the output is simply 
wrong in some cases now. We thought at the begining that the issue could be 
related to the Apache Calcite version used but even with the latest one 1.39.0 
used in nifi 2.4.0 the problem remains. 

While running the same query on our own internal processor (Similar to the 
query record in terms of purpose but not code and uses the same apache calcite 
version 1.39.0) doesn't have the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to