Re: Issue with QueryRecord failing when data is missing

Jens M. Kofoed Thu, 25 Feb 2021 12:51:36 -0800

Hi Matt

Many thanks for you replay. Yes I use the default which is the
infer-schema. And it makes sense that the infer-schema can't guess what
type the fields are if there are no data.
So if I search for a boolean value and the infer-schema set it to text, it
makes sense it produce errors. And why did I not think of it when I sat and
tore my hair out of my head :-)


As a workaround I have used a ValidateRecord and a RouteOnAttribute using
the record.count.
So many thanks for the answer.

Kind regards
Jens M. Kofoed

Den tor. 25. feb. 2021 kl. 16.28 skrev Matt Burgess <[email protected]>:

> Jens,
>
> What is the Schema Access Strategy set to in your CSVReader? If "Infer
> Schema" or "Use String Fields From Header", the setting of "Treat
> First Line As Header" should be ignored as those two options require a
> header be present anyway. If you know the schema ahead of time you
> could set it in the CSVReader rather than inferring it.
>
> For "Infer Schema", there's a bug where the inferred schema is empty
> because we don't have any records from which to infer the types of the
> fields (even though the field names are present). I wrote up NIFI-8259
> [1] to infer the types as strings when no records are present.
>
> As a workaround you could filter out any FlowFiles that have no
> records, either by using CountText or the 'record.count' attribute if
> it has been set, into a RouteOnAttribute. Alternatively you could
> emulate what NIFI-8259 is going to do by using "Use String Fields From
> Header" in your CSVReader, but in that case you might need a CAST(colC
> as BOOLEAN) in your SQL since populated FlowFiles could have the
> correctly inferred schema where empty FlowFiles (or if "Use String
> Fields From Header" is set) will think colC is a string rather than a
> boolean. The CAST should work in both cases but I didn't try it.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-8259
>
> On Thu, Feb 25, 2021 at 1:56 AM Jens M. Kofoed <[email protected]>
> wrote:
> >
> > Hi all
> >
> > I have a issue with using the QueryRecord query csv files. currently i'm
> running NiFi version 1.12.1 but I also tested this in version 1.13.0
> > If my incoming csv file only have a header line and no data it fails
> >
> > My querying statement looks like this: SELECT colA FROM FLOWFILE WHERE
> colC = 'true'
> >
> > Changes made to the CSVReader:
> > Treat Firs Line as Header = true
> >
> > Changes made to the CSVRecordSetWriter:
> > Include Header Line = false
> > Record Separator = ,
> >
> > Here are 2 sample data. The first one works as expected, but sample 2
> gives errors
> > Sample 1:
> > colA,colB,colC
> > data1A,data1B,true
> > data2A,data2B,false
> > data3A,data3B,true
> >
> > Outcome: data1A,data3A,
> >
> > Sample 2:
> > colA,colB,colC
> >
> > Error message:
> > QueryRecord[id=d7c38f75-0177-1000-ffff-fffff694dd96] Unable to query
> StandardFlowFileRecord[uuid=74a71c6e-3d3f-406c-92af-c9e4e27d6d69,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1614232293848-3,
> container=Node01Cont01, section=3], offset=463,
> length=14],offset=0,name=74a71c6e-3d3f-406c-92af-c9e4e27d6d69,size=14] due
> to java.sql.SQLException: Error while preparing statement [SELECT colA FROM
> FLOWFILE WHERE colC = true]:
> org.apache.nifi.processor.exception.ProcessException:
> java.sql.SQLException: Error while preparing statement [SELECT colA FROM
> FLOWFILE WHERE colC = true]
> >
> > Is this a bug?
> >
> > kind regards
> > Jens M. Kofoed
>

Re: Issue with QueryRecord failing when data is missing

Reply via email to