Hi Stephen,

In my case the empty values were filled with a string like “EMPTY”, so I could 
do something like the following:

{
  "operation": "modify-overwrite-beta",
  "spec": {
    "data": {
      "values": {
        "*": {
          "value": ["=toDouble", null]
        }
      }
    }
  }
}

EMPTY cannot be cast to double so it gets filled with null.
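
To illustrate the fallback idea outside NiFi, here is a minimal Python sketch. 
It is not Jolt, only the same "try the cast, otherwise null" logic. The helper 
name to_double_or_none and the sample document are made up for illustration, 
and the shape data.values[*].value is assumed from the spec above.

```python
def to_double_or_none(value):
    """Return value as a float, or None when it cannot be cast."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Hypothetical input shaped like the spec above: data.values[*].value
doc = {"data": {"values": [{"value": "1.5"}, {"value": "EMPTY"}]}}

for item in doc["data"]["values"]:
    item["value"] = to_double_or_none(item["value"])
```

After the loop the first value is the number 1.5 and the second is None, 
which serialises to a JSON null.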

If you have an empty string instead of something like “EMPTY” and the above 
Jolt transformation does not appear to work, you can use SplitJson followed by 
EvaluateJsonPath to save the value in an attribute. Then use UpdateAttribute to 
check whether the attribute is empty (something like 
${att_name:length():gt(0)}, which is true when it is not empty) and, if it is 
empty, replace it with a string like “EMPTY” using ifElse, or try replaceEmpty. 
Next, use JoltTransform on the record to put the string “EMPTY” in there by 
referencing ${att_name} in the Jolt transformation. After that, use 
modify-overwrite-beta to replace it with null as shown above. There might be 
an easier solution though.

Hope this helps.

With kind regards,
Maarten Smeets

From: stephen.hindmarch.bt.com via users <[email protected]>
Sent: Friday, 15 July 2022 13:13
To: [email protected]
Subject: Update records with literal null, true or false values.

Hi all,

I have been looking at a case where some records have all fields presented as 
strings, and I need to turn the numeric or boolean values into their native 
types. I can do most of this with Jolt, but in the case where the value is 
missing I have a problem.

Say I have these records.

[
  {"latitude":"1.0","longitude":"-1.0","user":{"name":"alice","id":"12345671","has_cover":"true"},"vehicle":{"id":"AB123DE"}},
  {"latitude":"1.0","longitude":"-1.0","user":{"name":"bob","id":"12345672","has_cover":"false"},"vehicle":{"id":"AB123DE"}},
  {"latitude":"","longitude":"","user":{"name":"chuck","id":"","has_cover":"flargh"},"vehicle":{"id":""}}
]

I can use “modify-overwrite” to turn the coordinates into doubles, the Booleans 
into true/false, and the user ID into a numeric. But this fails for Chuck’s 
record, as Jolt ignores the empty strings and non-truthy strings. The result I 
get is like this.

[
  {"latitude":1.0,"longitude":-1.0,"user":{"name":"alice","id":12345671,"has_cover":true},"vehicle":{"id":"AB123DE"}},
  {"latitude":1.0,"longitude":-1.0,"user":{"name":"bob","id":12345672,"has_cover":false},"vehicle":{"id":"AB123DE"}},
  {"latitude":"","longitude":"","user":{"name":"chuck","id":"","has_cover":"flargh"},"vehicle":{"id":""}}
]

But what I really want, in order to conform to my Avro schema, is more like 
this.

[
  {"latitude":1.0,"longitude":-1.0,"user":{"name":"alice","id":12345671,"has_cover":true},"vehicle":{"id":"AB123DE"}},
  {"latitude":1.0,"longitude":-1.0,"user":{"name":"bob","id":12345672,"has_cover":false},"vehicle":{"id":"AB123DE"}},
  {"latitude":null,"longitude":null,"user":{"name":"chuck","id":null,"has_cover":false},"vehicle":{"id":""}}
]
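
To pin down the target mapping precisely, the rules can be written as a small 
Python sketch. This is only a reference for the expected output, not a NiFi or 
Jolt solution. The function names to_number and convert are hypothetical, and 
the rule that any has_cover value other than "true" becomes false is assumed 
from Chuck's record above.

```python
import json


def to_number(text, cast):
    """Apply cast (float or int) to text; return None if the cast fails."""
    try:
        return cast(text)
    except (TypeError, ValueError):
        return None


def convert(record):
    """Turn the string fields of one record into their native types."""
    user = record["user"]
    return {
        "latitude": to_number(record["latitude"], float),
        "longitude": to_number(record["longitude"], float),
        "user": {
            "name": user["name"],
            "id": to_number(user["id"], int),
            # Assumption: anything other than the string "true" maps to false.
            "has_cover": user["has_cover"] == "true",
        },
        # The vehicle ID stays a string, even when it is empty.
        "vehicle": record["vehicle"],
    }


chuck = {"latitude": "", "longitude": "",
         "user": {"name": "chuck", "id": "", "has_cover": "flargh"},
         "vehicle": {"id": ""}}
print(json.dumps(convert(chuck)))
```

For Chuck's record this gives null coordinates, a null user ID, has_cover 
false, and an unchanged empty vehicle ID.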

I looked at UpdateRecord and EvaluateJsonPath, but I cannot see a way to return 
a literal null, true or false. I have resorted to using some ReplaceText 
processors, which can find and replace some of the errant values but struggle 
to distinguish between the user ID, which has to be numeric, and the vehicle 
ID, which needs to stay as a string. And a global find and replace on text 
seems like a coarse instrument when the content is already neatly in records.

Can anyone suggest a better solution?

Thanks.

Steve Hindmarch
