Pierre,

I will be happy to review a PR but I suspect this should be seen as a
breaking change.

The reason is that we would be deviating from the original schema, and
downstream systems may need to have their tables updated. My suggestion is
to bump the ParCEFone version and move all ambiguous int fields to bigint.

We can then absorb this in NiFi.

For reference:

I have faced similar deviations from the schema myself and handled them by
routing parsing failures to separate processors in a way similar to what
Lehel suggested.

My rationale was: these failures tend to be specific to vendors that simply
failed to implement the standard properly, and a deployment-specific series
of processors is better able to handle them (since they would impact
downstream systems like indexers, tables, columnar formats, etc.).
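
A minimal sketch of that kind of routing decision, for illustration only:
it is not NiFi or ParCEFone code, and the class name, method names and
route labels ("standard", "oversized-out") are hypothetical. It pulls the
`out` extension value from a raw CEF message and flags messages whose
value does not fit a 32-bit int, so they can be sent down a separate path.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi or ParCEFone.
final class OutFieldRouter {
    // Matches `out=<digits>` when the key is preceded by start of input,
    // whitespace or '|', so keys such as FTNTFGTout= are not picked up.
    private static final Pattern OUT_KEY =
            Pattern.compile("(?:^|[\\s|])out=(\\d+)(?=\\s|$)");

    /** Returns the digits of the `out` value, if the key is present. */
    static Optional<String> extractOut(String cefMessage) {
        Matcher m = OUT_KEY.matcher(cefMessage);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Returns "oversized-out" when `out` does not fit a 32-bit int. */
    static String route(String cefMessage) {
        Optional<String> out = extractOut(cefMessage);
        if (!out.isPresent()) {
            return "standard";
        }
        try {
            Integer.parseInt(out.get());
            return "standard";
        } catch (NumberFormatException e) {
            return "oversized-out";
        }
    }

    public static void main(String[] args) {
        String sample = "FTNTFGTduration=331878 out=3443586134 in=0";
        System.out.println(route(sample));
    }
}
```

In a real flow the equivalent check would live in whatever processors the
deployment already uses; the point is only that the decision can be made
on the raw text before the strict parser sees it.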

Why do I say this is vendor specific? Because, faced with a data point
larger than a 32-bit integer, they could have used custom fields like
`cn1`, which the standard defines as Long, to store that data.

This is the same reason I disagree with what has been stated in the linked
Graylog issue: CEF has very clearly defined long and int fields. There
should be no ambiguity around the width of an int field; it can clearly be
deduced to be 32 bits.

That view is supported by the fact that 64-bit integer fields have been
defined and clearly differentiated as part of CEF 1.2.
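
To make the width difference concrete, here is a small standalone sketch
using the `out` value from the message quoted further down. The class and
method names are hypothetical and it does not reproduce ParCEFone's code;
it only shows that the value is rejected by a 32-bit parse and accepted by
a 64-bit one.

```java
// Hypothetical demo class, not part of NiFi or ParCEFone.
final class OutFieldWidthDemo {
    // Value of the `out` key in the Fortigate message quoted in this thread.
    static final String OUT_VALUE = "3443586134";

    /** Returns true if the text parses as a 32-bit signed int. */
    static boolean fitsInInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in int: " + fitsInInt(OUT_VALUE));
        System.out.println("as long: " + Long.parseLong(OUT_VALUE));
    }
}
```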

Cheers

On Thu, 9 Nov 2023, 02:36 Pierre Villard, <[email protected]>
wrote:

> I may be able to submit a PR against ParseCEF, as I made a few improvements
> in the past, but I'm not sure when I'll be able to get to it or how fast a
> new release would be made available for use in NiFi.
>
> Will try to block some time for this over the weekend.
>
> On Wed, 8 Nov 2023 at 16:22, <[email protected]> wrote:
>
>> OK, sounds good, I will try it.
>>
>> Thank you
>> M.
>>
>> ---------- Original e-mail ----------
>> From: Lehel Boér <[email protected]>
>> To: [email protected] <[email protected]>
>> Date: 8 Nov 2023 15:39:43
>> Subject: Re: CEF parsing type error
>>
>> I can't see a good workaround for this. The problem is if you remove the
>> out=[integer] from the log message, the CEF format becomes invalid. After
>> finding a solution for this, I'd go with text manipulation with the
>> following processors:
>>
>>    - ReplaceText to remove the unwanted part
>>    - ExtractText to get the 'out' as a FlowFile attribute
>>    - UpdateAttribute to later update the FlowFile with the extracted
>>    attribute
>>
>> ------------------------------
>> *From:* [email protected] <[email protected]>
>> *Sent:* Wednesday, November 8, 2023 7:22
>> *To:* [email protected] <[email protected]>
>> *Subject:* Re: CEF parsing type error
>>
>> Hi,
>> I understand, and thank you for the information, but how can this problem
>> be solved in NiFi?
>>
>> A custom Python script and an extra parse-failure output from the CEF parser?
>>
>> Marek
>>
>> P.S.
>> https://github.com/fluenda/ParCEFone/issues/30
>>
>>
>> ---------- Original e-mail ----------
>> From: Lehel Boér <[email protected]>
>> To: [email protected] <[email protected]>, [email protected] <
>> [email protected]>
>> Date: 7 Nov 2023 22:22:33
>> Subject: Re: CEF parsing type error
>>
>> Hi,
>>
>> The official implementation suggests using Integer for the *out* key,
>> although by definition it can exceed the size of an integer.
>>
>>
>>    - out: bytesOut Integer Number of bytes transferred outbound relative
>>    to the source to destination relationship. For example, the byte number of
>>    data flowing from the destination to the source.
>>
>> This issue also came up with Graylog here
>> <https://github.com/Graylog2/graylog2-server/issues/7371>. They even got
>> a reply from Fortinet indicating that the root cause of the issue was
>> that the official CEF documentation did not specify the integer range. Later
>> Graylog updated their code to expand the range for bigger numerical
>> values.
>>
>> Best Regards,
>> Lehel
>> ------------------------------
>> *From:* Otto Fowler <[email protected]>
>> *Sent:* Tuesday, November 7, 2023 16:35
>> *To:* [email protected] <[email protected]>; [email protected] <
>> [email protected]>
>> *Subject:* Re: CEF parsing type error
>>
>> You should open an issue upstream :
>> https://github.com/fluenda/ParCEFone/issues
>>
>>
>> On November 7, 2023 at 9:47:06 AM, [email protected] ([email protected])
>> wrote:
>>
>> Hello, I'm using CEFParser and I'm new to NiFi.
>>
>> I have a problem: sometimes a parser error occurs when a number exceeds
>> the Integer range.
>> Is there any way to solve it, for example by adding a LONG type for the key
>> "out" somewhere?
>>
>> Please
>> Kind Regards
>> Marek
>>
>> *### CEF message example from Fortigate (the value of key out was bigger
>> than Integer) ### :*
>> <165>Oct 23 22:10:20 FGT-DEV-FW1 CEF:
>> 0|Fortinet|Fortigate|v7.0.12|00020|traffic:forward
>> accept|3|deviceExternalId=FGXXXXXXX012 FTNTFGTeventtime=1698091820252030526
>> FTNTFGTtz=+0200 FTNTFGTlogid=0000000020 cat=traffic:forward
>> FTNTFGTsubtype=forward FTNTFGTlevel=notice FTNTFGTvd=root src=172.37.1.1
>> spt=9004 deviceInboundInterface=VPN-DEV_Off-1 FTNTFGTsrcintfrole=undefined
>> dst=172.30.2.180 dpt=514 deviceOutboundInterface=741_CZ_Srv
>> FTNTFGTdstintfrole=lan FTNTFGTsrccountry=Reserved
>> FTNTFGTdstcountry=Reserved externalId=573022232 proto=17 act=accept
>> FTNTFGTpolicyid=527 FTNTFGTpolicytype=policy
>> FTNTFGTpoluuid=73816fb2-6720-51ec-c859-c84211230e24
>> FTNTFGTpolicyname=Office-2 app=udp/514 FTNTFGTtrandisp=noop
>> FTNTFGTduration=331878 out=3443586134 in=0 FTNTFGTsentpkt=3420478
>> FTNTFGTrcvdpkt=0 FTNTFGTvpntype=ipsecvpn FTNTFGTappcat=unscanned
>> FTNTFGTsentdelta=959006 FTNTFGTrcvddelta=0
>>
>> *### CEFParser type ERROR ### :*
>> 2023-10-23 20:10:18,127 INFO [FileSystemRepository Workers Thread-1]
>> o.a.n.c.repository.FileSystemRepository Successfully archived
>> 4 Resource Claims for Container default in 10 millis
>> 2023-10-23 20:10:21,003 ERROR [Timer-Driven Process Thread-4]
>> o.a.nifi.processors.standard.ParseCEF
>> ParseCEF[id=100411d1-1e6d-12bc-5347-9553a96ec9a5]
>> CEF Parsing Failed:
>> StandardFlowFileRecord[uuid=6198fa4d-69a9-4a60-9062-21dff7a16a05,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1698091820924-6175,
>> container=default, section=31], offset=13986,
>> length=911],offset=0,name=6198fa4d-69a9-4a60-9062-21dff7a16a05,size=911]
>> java.lang.NumberFormatException: For input string: "3443586134"
>> at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
>> at java.base/java.lang.Integer.parseInt(Unknown Source)
>> at java.base/java.lang.Integer.valueOf(Unknown Source)
>> at com.fluenda.parcefone.event.CefRev23.setExtension(CefRev23.java:660)
>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:235)
>> at com.fluenda.parcefone.parser.CEFParser.parse(CEFParser.java:109)
>> at org.apache.nifi.processors.standard.ParseCEF.onTrigger(ParseCEF.java:277)
>> at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1361)
>> at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:247)
>> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>> at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>> at java.base/java.lang.Thread.run(Unknown Source)
>>
>>
