On April 25, 2019 at 12:05:45, Nick Allen ([email protected]) wrote:

> Otto: I’m not sure this is an enveloped issue, or a new feature for the
> json map parser

This is not an issue with JSONMapParser.  This is an issue with the
"enveloping" mechanism, prior to when the JSONMapParser gets the message.

The entire message has been parsed as a JSON object including the value of
the "_source" field.  Since the "_source" field itself contains valid JSON,
the parser transformed it into a Map, rather than the String that it
expects.
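
A minimal sketch of that type mismatch, written in Python rather than Metron's Java and using only the standard json module. This is an analogy, not Metron's code: the Python dict stands in for the LinkedHashMap in the stack trace, and extract_raw_message and try_extract are hypothetical helper names invented for illustration.

```python
import json

# Abridged version of the sample envelope from the original report.
ENVELOPE = '{"_index": "indexing", "_source": {"dst": "127.0.0.1", "srcPort": "80"}}'


def extract_raw_message(envelope_text, message_field):
    """Hypothetical stand-in for an envelope lookup that expects a string payload."""
    # Parsing the outer document also parses every nested object inside it.
    envelope = json.loads(envelope_text)
    value = envelope[message_field]
    if not isinstance(value, str):
        # Analogue of "LinkedHashMap cannot be cast to java.lang.String".
        raise TypeError(f"{message_field!r} is {type(value).__name__}, expected str")
    return value


def try_extract(envelope_text, message_field):
    """Return the payload, or a short error description if it is not a string."""
    try:
        return extract_raw_message(envelope_text, message_field)
    except TypeError as err:
        return f"error: {err}"
```

Calling try_extract(ENVELOPE, "_source") reports the mismatch, while an envelope whose "_source" value is a quoted string is returned unchanged.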

In my opinion, the ENVELOPE strategy needs to not parse the contents of
that "_source" field.  The ENVELOPE strategy should work for JSON and
non-JSON content alike.


The envelope doesn’t parse the contents of the source field; it was just
written to support string fields as the source, not JSON objects.
The source field is parsed because the JSON that contains it is parsed.

This is a design limitation of the envelope functionality, not a bug.
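
One possible way to live with this limitation, assuming the messages can be rewritten before they reach the sensor topic, is to re-serialize the nested object so that the message field carries a string. The sketch below is plain Python with the standard json module; stringify_field is a hypothetical helper, and it has not been tested against Metron itself.

```python
import json


def stringify_field(envelope_text, message_field="_source"):
    """Re-encode a nested JSON value as a string so the field holds text, not an object."""
    envelope = json.loads(envelope_text)
    value = envelope.get(message_field)
    if isinstance(value, (dict, list)):
        # Compact separators keep the embedded payload on one line.
        envelope[message_field] = json.dumps(value, separators=(",", ":"))
    return json.dumps(envelope)
```

Messages whose field is already a string pass through with that field unchanged, so the helper can be applied to a mixed stream.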

The issue _may_ be one that can be addressed in the JSONMap parser or
transformations if the use case is not really an envelope use case, but one
where they are using the envelope to ‘get at’ the inner JSON in order to
transform it.  I don’t mean to say this is a bug in JSONMap either.



On Thu, Apr 25, 2019 at 11:31 AM Otto Fowler <[email protected]>
wrote:

> I’m not sure about the name, I’m more thinking about the case.
> I’m not sure this is an enveloped issue, or a new feature for the json map
> parser ( or if you could do it with the jsonMap parser and JSONPath )
>
>
>
> On April 25, 2019 at 11:23:25, Simon Elliston Ball (
> [email protected]) wrote:
>
> Seems like this would be a good additional strategy, something like
> ENVELOPE_PARSED? Any thoughts on a good name?
>
> On Thu, 25 Apr 2019 at 16:20, Otto Fowler <[email protected]> wrote:
>
>> So, the enveloped message doesn’t support getting an already parsed JSON
>> object from the enveloped JSON; we would have to do some work to support
>> this.  Even if we _could_ wrangle it in there now, from what I can see we
>> would still have to serialize to bytes to pass to the actual parser, and
>> that would be inefficient.
>> Can you open a jira with the information you provided?
>>
>>
>>
>> On April 25, 2019 at 11:12:38, Otto Fowler ([email protected])
>> wrote:
>>
>> Raw message in this case assumes that the raw message is a String
>> embedded in the json field that you supply, not a nested json object, so it
>> is looking for
>>
>>
>> “_source” : “some other embedded string of some format like syslog in
>> json”
>>
>> There are other message strategies, but I’m not sure they would work in
>> this instance.  I’ll keep looking. Hopefully someone more familiar will
>> jump in.
>>
>>
>> On April 25, 2019 at 10:48:06, [email protected] (
>> [email protected]) wrote:
>>
>> Hello,
>>
>>
>>
>> I’m trying to load some JSON data which has the following structure (this
>> is a sample):
>>
>>
>>
>> {
>>   "_index": "indexing",
>>   "_type": "Event",
>>   "_id": "AWAkTAefYn0uCUpkHmCy",
>>   "_score": 1,
>>   "_source": {
>>     "dst": "127.0.0.1",
>>     "devTimeEpoch": "1512437340000",
>>     "dstPort": "0",
>>     "srcPort": "80",
>>     "src": "194.51.198.185"
>>   }
>> }
>>
>>
>>
>> In my file, everything is on the same line. My parser config is the
>> following:
>>
>>
>>
>> {
>>   "parserClassName": "org.apache.metron.parsers.json.JSONMapParser",
>>   "filterClassName": null,
>>   "sensorTopic": "my_topic",
>>   "outputTopic": null,
>>   "errorTopic": null,
>>   "writerClassName": null,
>>   "errorWriterClassName": null,
>>   "readMetadata": true,
>>   "mergeMetadata": true,
>>   "numWorkers": 2,
>>   "numAckers": null,
>>   "spoutParallelism": 1,
>>   "spoutNumTasks": 1,
>>   "parserParallelism": 2,
>>   "parserNumTasks": 2,
>>   "errorWriterParallelism": 1,
>>   "errorWriterNumTasks": 1,
>>   "spoutConfig": {},
>>   "securityProtocol": null,
>>   "stormConfig": {},
>>   "parserConfig": {},
>>   "fieldTransformations": [
>>    {
>>      "transformation":"RENAME",
>>      "config": {
>>         "dst": "ip_dst_addr",
>>         "src": "ip_src_addr",
>>         "srcPort": "ip_src_port",
>>         "dstPort": "ip_dst_port",
>>         "devTimeEpoch": "timestamp"
>>      }
>>    }
>>   ],
>>   "cacheConfig": {},
>>   "rawMessageStrategy": "ENVELOPE",
>>   "rawMessageStrategyConfig": {
>>     "messageField": "_source"
>>   }
>> }
>>
>>
>>
>> But in Storm I get the following errors:
>>
>>
>>
>> 2019-04-25 16:45:22.225 o.a.s.d.executor Thread-5-parserBolt-executor[8 8] [ERROR]
>> java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to java.lang.String
>>         at org.apache.metron.common.message.metadata.EnvelopedRawMessageStrategy.get(EnvelopedRawMessageStrategy.java:78) ~[stormjar.jar:?]
>>         at org.apache.metron.common.message.metadata.RawMessageStrategies.get(RawMessageStrategies.java:54) ~[stormjar.jar:?]
>>         at org.apache.metron.common.message.metadata.RawMessageUtil.getRawMessage(RawMessageUtil.java:55) ~[stormjar.jar:?]
>>         at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:251) [stormjar.jar:?]
>>         at org.apache.storm.daemon.executor$fn__10195$tuple_action_fn__10197.invoke(executor.clj:735) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.daemon.executor$mk_task_receiver$fn__10114.invoke(executor.clj:466) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.disruptor$clojure_handler$reify__4137.onEvent(disruptor.clj:40) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.daemon.executor$fn__10195$fn__10208$fn__10263.invoke(executor.clj:855) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at org.apache.storm.util$async_loop$fn__1221.invoke(util.clj:484) [storm-core-1.1.0.2.6.5.1050-37.jar:1.1.0.2.6.5.1050-37]
>>         at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>>         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
>>
>>
>>
>>
>>
>> How can I debug this?
>>
>>
>>
>> Thanks
>>
>>
>>
>> Stéphane
>>
>
> --
> simon elliston ball
> @sireb
>
>
