Richard,

It sounds like your scripted reader is responsible for parsing the Avro? In short, the Record appears to have an Avro Utf8 value, not a String, in the field you’re looking at. You could call .toString() on that Utf8 object, or you could configure the Avro reader to return Strings instead of Utf8 objects.
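As a rough illustration of the first option, here is a minimal sketch in plain Java (which should carry over to a Groovy script with little change). It is not taken from Richard's script: the class name `FieldToDate`, the helper `toDate`, and the choice of UTC are all assumptions made for the example. It relies on Avro's `Utf8` implementing `CharSequence`, and uses a `StringBuilder` as a stand-in so the code runs without the Avro jar.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

class FieldToDate {
    // Hypothetical helper: accepts whatever the record hands back for the field.
    // - a Number is treated as epoch milliseconds (the timestamp-millis case)
    // - a CharSequence (String, or Avro Utf8) is converted with toString() and parsed
    static Date toDate(Object value, String pattern) {
        if (value == null) {
            return null;
        }
        if (value instanceof Number) {
            return new Date(((Number) value).longValue());
        }
        if (value instanceof CharSequence) {
            // toString() is the step that turns a Utf8 into a plain String.
            String text = value.toString();
            LocalDateTime parsed = LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
            // Assumption: the text carries no zone, so it is interpreted as UTC.
            return Date.from(parsed.toInstant(ZoneOffset.UTC));
        }
        throw new IllegalArgumentException("Unsupported field type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        // StringBuilder stands in for Utf8 here; both are CharSequence, not String.
        CharSequence utf8Like = new StringBuilder("2024-02-15 00:00:00");
        Date fromText = toDate(utf8Like, "yyyy-MM-dd HH:mm:ss");
        Date fromMillis = toDate(1707955200000L, "yyyy-MM-dd HH:mm:ss");
        System.out.println(fromText.getTime() == fromMillis.getTime());
    }
}
```

Handling both branches in one place means the script does not need to know whether the upstream writer used an explicit timestamp-millis schema or an inferred string field.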
Thanks
-Mark

On Feb 15, 2024, at 5:52 AM, Richard Beare wrote:

Hi,

This is a test pipeline reading PDF files from disk. It begins with a GetFile processor supplying a ConvertRecord processor, which has a scripted reader as input and a generic AvroRecordSetWriter as output. The scripted reader places the file content in a "content" field:

    List<RecordField> recordFields = []
    recordFields.add(new RecordField("content", RecordFieldType.ARRAY.getArrayDataType(RecordFieldType.BYTE.getDataType())))
    schema = new SimpleRecordSchema(recordFields)

This bit seems OK. The next step is an UpdateRecord processor, which adds other fields to mimic the real case of pulling out of a DB (age, gender, etc., all of which are dummies) and a timestamp based on the filename, using the following expression language:

    ${filename:substringBeforeLast('.'):substringAfterLast('_'):toDate('yyyyMMdd'):format("yyyy-MM-dd HH:mm:ss")}

If I explicitly set the schema for the record writer to include

    {"name":"Visit_DateTime","type": {"type" : "long", "logicalType" : "timestamp-millis"}}

then I can get the following converter, a Groovy script that converts to JSON for transmission to a web service, to deal with the dates as follows:

    Date VisitTimeValue = null
    VisitTimeValue = new Date(currRecord.get(TimeStampFieldName))

I guess I thought this approach was overly complex. Given that I'm using date functions in the expression language, I hoped that the generic Avro writer would correctly infer the schema so that I didn't have to explicitly provide one.

Is this approach the right one? Is there a way I can isolate the expectation of a date component inside the Groovy file only?

I hope this is clear. Thanks for your help.

On Thu, Feb 15, 2024 at 9:38 AM Mark Payne wrote:

Hey Richard,

I think you’d need to explain more about what you’re doing in your Groovy script. What processor are you using? What’s the script doing? Is it parsing Avro data?

On Jan 29, 2024, at 12:26 AM, Richard Beare wrote:

Anyone able to offer assistance with this? I think my problem relates to correctly specifying types using the expression language and using schema inference from Groovy.

On Tue, Jan 23, 2024 at 2:20 PM Richard Beare wrote:

Hi,

What is the right way to deal with dates in the following context? I'm using the UpdateRecord processor to add a datestamp field to a record (derived from a filename attribute inserted by the GetFile processor).

    /Visit_DateTime:
    ${filename:substringBeforeLast('.'):substringAfterLast('_'):toDate('yyyyMMdd'):format("yyyy-MM-dd'T'HH:mm:ss'Z'")}

Inside the Groovy script I'm attempting to convert to a date as follows:

    VisitTimeValue = new Date(currRecord.get(Visit_DateTime as String))

However, I always get messages about "could not find matching constructor for java.util.Date(org.apache.avro.util.Utf8)".

I have a previously working version, from a slightly different context, which did a cast to long: Date((long)currRecord.get....). In that case the record was created by a database query.

The eventual use of VisitTimeValue is to dump it into a flowfile attribute.

It seems to me that the type of the date field is not being correctly inferred by the Avro reader/writers after I create it with the expression language. Alternatively, perhaps I should be using different date handling tools inside Groovy.

All advice welcome.

Thanks
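
For reference, the expression-language chain in the original question can be mimicked outside NiFi. The following is a minimal sketch in plain Java, not NiFi's own implementation: the class name `FilenameTimestamp`, the method `visitDateTime`, and the sample filename `report_20240215.pdf` are hypothetical, and the handling of a missing '.' or '_' (return the input unchanged) is an assumption about how the expression-language functions behave. The point it illustrates is that the final format() step yields text, which fits Mark's observation that the field arrives as a Utf8 value and not as a long.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class FilenameTimestamp {
    // Mimics: substringBeforeLast('.') : substringAfterLast('_')
    //         : toDate('yyyyMMdd') : format("yyyy-MM-dd'T'HH:mm:ss'Z'")
    static String visitDateTime(String filename) {
        // Drop the extension (assumption: unchanged if there is no '.').
        int dot = filename.lastIndexOf('.');
        String stem = dot >= 0 ? filename.substring(0, dot) : filename;

        // Keep what follows the last underscore (assumption: unchanged if there is no '_').
        int underscore = stem.lastIndexOf('_');
        String datePart = underscore >= 0 ? stem.substring(underscore + 1) : stem;

        // BASIC_ISO_DATE parses dates written as yyyyMMdd.
        LocalDate date = LocalDate.parse(datePart, DateTimeFormatter.BASIC_ISO_DATE);

        // The result is a String, not a numeric timestamp; 'Z' is emitted as a literal.
        return date.atStartOfDay().format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'"));
    }

    public static void main(String[] args) {
        // Hypothetical filename following the name_yyyyMMdd.ext convention.
        System.out.println(visitDateTime("report_20240215.pdf"));
    }
}
```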
